  1. Last 7 days
    1. Reviewer #1 (Public review):

      This is a well-written and fully documented methods paper.

      The authors have established a clear rationale for their new packages, especially for real-time use, and demonstrate significant speed improvements that will likely appeal to many users of tools like DLC, SLEAP, and LightningPose. The inclusion of a graphical user interface will help make the package more accessible to neuroscientists with limited computational expertise. While it may be challenging to get users to switch from their established workflows for video analysis, the speed gains offered by this package make it worth considering. The hardware aspects of the project are well-documented, and the GitHub repository for this part of the setup is also thorough. Overall, this paper provides a clear summary of the tools, their uses, setup, and benefits.

      I have a few minor questions about the collective set of tools.

      First, the GitHub repository for SqueakPoseStudio appears to be missing a testing routine and associated badge, and the package has not been formally released. This means users would need to download the repository to install it, correct? I suggest the authors consider publishing a formal release of the package, making it installable via pip, and including a basic testing routine to clearly display the package's status on the repository page. Adding a DOI from Zenodo would also be helpful. A testing routine is especially useful when updates are made, as many users avoid repositories with failing tests.

      Second, the installation instructions simply state "Create a virtualenv and install:". This may not be sufficient for many researchers, as most neuroscientists are not experienced Python programmers and require clear guidance on the environment specific to this package. The installation instructions should be expanded to provide more detailed guidance and encourage more users. It would also be helpful to verify that the setups work across Windows, Mac, and Linux.

      Third, the package defaults to UMAP for non-linear dimensionality reduction, which has some known issues. Can the package be modified to allow for alternative mapping methods, such as PaCMAP, PyDiffMap, or the more comprehensive topometry package?

      Finally, what specific GPUs have been tested with the package, and are there any limitations based on the age of the video card or the available libraries for the deep learning component of the package?

    2. Reviewer #2 (Public review):

      Summary:

      This work presents three tools: SqueakPose Studio, which is used for pose estimation; SqueakView, which is used for real-time video and sensor data capture and analysis; and MouseHouse, which is a behavioral and sensor suite for mouse experiments. Together, these tools provide a comprehensive behavioral platform for acquiring and analyzing video, sensor, and behavioral data. The work is open source and provided as a resource for the field.

      Strengths:

      (1) SqueakPose Studio was relatively easy to install and use. We were impressed that we were able to install it and test our own videos with minimal difficulty. The authors provide installation tutorial videos that were very helpful.

      (2) The GUI environment for SqueakPose Studio was very usable, and the authors should be commended on the time and effort that went into improving the usability of their system. The keypoint and skeleton configuration was flexible, allowing us to define custom body part sets without modifying code directly. The pose estimation accuracy on our own videos was good right out of the box, without requiring fine-tuning or retraining. For a tool being evaluated for the first time, this was all very impressive!

      Weaknesses:

      (1) While we were able to install and test SqueakPose Studio, it was not entirely seamless. The primary installation resource is a tutorial video, and we would recommend supplementing this with a written installation checklist that explicitly lists all required software dependencies (e.g. Python, UV, Visual Studio). The tutorial video was also at times unclear in distinguishing required from optional components. For example, Visual Studio is described as not necessary, yet the tutorial demonstrates the workflow entirely within that environment, so it may be challenging for a user to follow along without it. We recommend that the authors adopt a stricter, step-by-step installation guide that is prescriptive about required software and leaves little room for confusion.

      (2) The paper also describes SqueakView and MouseHouse. Unfortunately, we were unable to evaluate these components as both require the MouseHouse hardware platform. Even without directly using MouseHouse, we noticed some incompleteness here, as we could not locate a bill of materials, component pricing, or assembly guide in the paper or associated GitHub repositories. Given that affordability and accessibility are central claims, a consolidated parts list, approximate costs, and a build guide or video would be necessary for most labs to realistically decide whether they plan to replicate the hardware and evaluate this functionality that the paper describes. In this regard, we felt that MouseHouse and potentially SqueakView were not sufficiently documented for publication.

      (3) The benchmarking comparison to DeepLabCut (DLC) raised several concerns that left us unsure whether the head-to-head comparison was appropriate as described. First, the dataset used for benchmarking was small and homogeneous; according to the Methods, the authors used "10 min open-field tasks of single mice with bilateral photometry cables." As such, the claims about comparisons between SqueakPose Studio and DLC may be too broad, given this single test case. Specifically, this dataset does not test robustness across lighting conditions, coat colors, species, occlusions, different-shaped arenas, etc. Second, the comparison to DLC in Figure 1 does not include any quantitative statistical comparisons, which are needed to evaluate the claims that were made. For instance, the error in Figure 1e looks worse for their system than for DLC, although statistical comparisons were not made. Third, there are many settings and optimizations that can be made for both systems. Without more detail, it is hard to know if the head-to-head comparison is really fair. Fourth, the metrics are given as very specific numbers from single runs, e.g., an inference time of 71.59 minutes in Figure 1d. This metric would be more meaningful if it reported the mean of multiple runs, with error estimation. Finally, while the code is available, the trained datasets are made available only on "reasonable request". Given the importance of these datasets to evaluating the method and allowing others to benchmark it against other systems, these should be made available on GitHub. Overall, I would recommend toning down the comparison to DLC and focusing on the strengths of SqueakPose Studio on its own merits.

      (4) The paper at times makes general statements that are beyond what is shown. For instance, discussions of use in human applications are aspirational and should be treated much more conservatively in the discussion, or possibly even removed. As it stands, the discussion implies that this system can already do "zero-shot tracking of human posture and movement", enabling "a bridge between preclinical and clinical behavioral analysis". In principle, this may be true, but even for a Discussion section, this goes far beyond the capabilities that the paper actually shows.

      (5) While the comprehensive nature of the system and its 3 parts is impressive, I felt that it also detracted from the main focus of the paper, which was SqueakPose Studio. I might recommend dropping the other two parts, as they also require a much higher bar for a user to evaluate, and presenting only SqueakPose Studio in this paper as a general resource for pose estimation. This would also give the authors more space to benchmark SqueakPose Studio more comprehensively.
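Weakness (3) above asks for timing metrics to be reported as the mean of multiple runs with an error estimate rather than as a single number. A minimal sketch of such a summary is given below, using only the Python standard library. The function name `summarize_runs` and the example timings are hypothetical placeholders for illustration; they are not values from the paper, and the normal-approximation interval is an assumption (a t quantile would be more appropriate for very few runs).

```python
import math
import statistics


def summarize_runs(times_min, z=1.96):
    """Return n, mean, sample SD, and an approximate 95% CI for repeated timings."""
    n = len(times_min)
    if n < 2:
        raise ValueError("need at least two runs to estimate variability")
    mean = statistics.mean(times_min)
    sd = statistics.stdev(times_min)   # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(n)            # standard error of the mean
    half_width = z * sem               # normal approximation to the 95% interval
    return {"n": n, "mean": mean, "sd": sd,
            "ci95": (mean - half_width, mean + half_width)}


# Placeholder timings in minutes, for illustration only (not measured data).
example_runs = [10.0, 12.0, 11.0, 13.0, 9.0]
summary = summarize_runs(example_runs)
print(f"{summary['mean']:.2f} ± {summary['sd']:.2f} min (n={summary['n']})")
```

Reporting the run count alongside the mean and spread lets readers judge whether a difference between two tools exceeds run-to-run variability.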

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this manuscript, the role of the insulin receptor and the insulin-like growth factor receptor was investigated in podocytes. Mice in which both receptors were deleted developed glomerular dysfunction, with proteinuria and glomerulosclerosis emerging over several months. Because of concerns about incomplete KO, the authors generated and studied podocyte cell lines where both receptors were deleted. Loss of both receptors was highly deleterious, with greater than 50% cell death. To elucidate the mechanism of cell death, the authors performed global proteomics and found that spliceosome proteins were downregulated. They confirmed this directly by using long-read sequencing. These results suggest a novel role for insulin and IGF1R signaling in RNA splicing in podocytes.

      This is primarily a descriptive study and no technical concerns are raised. The mechanism of how insulin and IGF1 signaling regulates splicing is not directly addressed, but potentially implicates phosphorylation downstream of these receptors. In the revised manuscript, it is shown that the mouse KO is incomplete, potentially explaining the slow onset of renal insufficiency. Direct measurement of GFR and serial serum creatinines might also enhance our understanding of progression of disease; proteinuria is a strong sign of renal injury. An attempt to rescue the phenotype by overexpression of SF3B4 would also be useful but may be masked by defects in other spliceosome genes. As insulin and IGF are regulators of metabolism, some assessment of metabolic parameters would be an optional add-on.

      Significance:

      With the GLP1 agonists providing renal protection, there is great interest in understanding the role of insulin and other incretins in kidney cell biology. It is already known that Insulin and IGFR signaling play important roles in other cells of the kidney. So, there is great interest in understanding these pathways in podocytes. The major advance is that these two pathways appear to have a role in RNA metabolism.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Coward and colleagues report on the role of the insulin/IGF axis in podocyte gene transcription. They knocked out both the insulin receptor and IGF1R in mice. Dual KO mice manifested a severe phenotype, with albuminuria, glomerulosclerosis, renal failure and death at 4-24 weeks.

      Long read RNA sequencing was used to assess splicing events. Podocyte transcripts manifesting intron retention were identified. Dual knock-out podocytes manifested more transcripts with intron retention (18%) compared with wild-type controls (14%), with an overlap between experiments of ~30%.

      Transcript productivity was also assessed using FLAIR-mark-intron-retention software. Intron retention was seen in 18% of ciDKO podocyte transcripts compared to 14% of wild-type podocyte transcripts (P=0.004), with an overlap between experiments of ~30% (indicating the variability of results with this method). Interestingly, ciDKO podocytes showed downregulation of proteins involved in spliceosome function and RNA processing, as suggested by LC/MS and confirmed by Western blot.

      Pladienolide (a spliceosome inhibitor) was cytotoxic to HeLa cells and to mouse podocytes, but no toxicity was seen in murine glomerular endothelial cells.

      The manuscript is generally clear and well-written. Mouse work was approved in advance. The four figures are generally well-designed, using bars with superimposed dot plots.

      Methods are generally well described.

      Comments on previous version:

      Coward and colleagues have done an excellent job of responding to all the reviewer comments.

    3. Reviewer #4 (Public review):

      This report entitled "The insulin/IGF axis is critically important (for) controlling gene transcription in the podocyte" from Hurcombe et al is based on a mouse double knockdown of the IR and IGF1R and a parallel cultured mouse podocyte model. Insulin/IGF signaling system in mammals evolved as three gene reduplicated peptides (insulin, IGF-1, and IGF-2) and their two receptors IR and IGF1R that cross-react to variable extents with the peptides, are ubiquitously expressed, and signal through parallel pathways. The major downstream effect of insulin is to regulate glucose uptake and metabolism, while that of the IGF pathways is to regulate growth and cell cycling in part through mTORC1. The GH-IGF-1-IGF1R pathway regulates post-natal growth. IGF-2 signaling is thought to play a major role in regulating intrauterine growth and development, although IGF-2 is also present at high levels in post-natal life. Thus, one would anticipate that reducing IR/IGF1R signaling in any cell would slow growth and cell cycling by reducing growth factor and metabolic mTORC1-mediated and other processes including the splicing of RNA for protein synthesis.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors describe the results of a single study designed to investigate the extent to which horizontal orientation energy plays a key role in supporting view-invariant face recognition. The authors collected behavioral data from adult observers who were asked to complete an old/new face matching task by learning broad-spectrum faces (not orientation filtered) during a familiarization phase and subsequently trying to label filtered faces as previously seen or novel at test. This data revealed a clear bias favoring the use of horizontal orientation energy across viewpoint changes in the target images. The authors then compared different ideal observer models (cross-correlations between target and probe stimuli) to examine how this profile might be reflected in the image-level appearance of their filtered images. This revealed that a model looking for the best matching face within a viewpoint differed substantially from human data, exhibiting a vertical orientation bias for extreme profiles. However, a model forced to match targets to probes at different viewing angles exhibited a consistent horizontal bias in much the same manner as human observers.

      Strengths:

      I think the question is an important one: The horizontal orientation bias is a great example of a low-level image property being linked to high-level recognition outcomes and understanding the nature of that connection is important. I found the old/new task to be a straightforward task that was implemented ably and that has the benefit of being simple for participants to carry out and simple to analyze. I particularly appreciated that the authors chose to describe human data via a lower-dimensional model (their Gaussian fits to individual data) for further analysis. This was a nice way to express the nature of the tuning function favoring horizontal orientation bias in a way that makes key parameters explicit. Broadly speaking, I also thought that the model comparison they include between the view-selective and view-tolerant models was a great next step. This analysis has the potential to reveal some good insights into how this bias emerges and ask fine-grained questions about the parameters in their model fits to the behavioral data.

      Weaknesses:

      I'll start with what I think is the biggest difficulty I had with the paper. Much as I liked the model comparison analysis, I also don't quite know what to make of the view-tolerant model. As I understand the authors' description, the key feature of this model is that it does not get to compare target and probe at the same yaw angle, but must instead pick a best match from candidates that are at different yaws. While it is interesting to see that this leads to a very different orientation profile, it also isn't obvious to me why such a comparison would be reflective of what the visual system is probably doing. I can see that the view-specific model is more or less assuming something like an exemplar representation of each face: You have the opportunity to compare a new image to a whole library of viewpoints and presumably it isn't hard to start with some kind of first pass that identifies the best matching view first before trying to identify/match the individual in question. What I don't get about the view-tolerant model is that it seems almost like an anti-exemplar model: You specifically lack the best viewpoint in the library but have to make do with the other options. I sort of understand the reasoning that this enforces tolerance of viewpoint variability, but I'm not clear on whether or not this is a version of face familiarity and recognition that the authors think has an analog in human visual processing.

      I do think that this model is interesting in terms of the differential tuning it exhibits, but don't find it easy to align with any theoretical perspective on face recognition. Specifically, do the authors think there is a stage of face processing in which tolerance as they've operationalized it in the model is extant? What I'm looking for is a concrete description of the circumstances that the authors are saying lead to this kind of model potentially being a meaningful analog of face recognition. For example, is the idea that one may become familiar with a face in some very limited set of viewpoints and then be presented with that face in other views?

      Alternatively, if the authors prefer to say that they simply thought this was a nice exercise in terms of identifying a different model and that it may not be a meaningful proxy for face recognition, I think that's fine, to be clear! I just still don't see anything in the text that convinces me of the ecological validity of this version of view-tolerance.
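The distinction between the two model observers discussed above can be made concrete with a small sketch. It follows the reviewer's reading only: the view-selective observer compares a probe with stored images at the same yaw, while the view-tolerant observer is denied the same-yaw candidates and must pick the best match among other yaws. The names `pearson` and `best_match`, the toy four-pixel "images", and the use of Pearson correlation on flattened vectors are all assumptions made for illustration, not the authors' implementation.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length flattened images."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def best_match(probe, probe_yaw, library, mode):
    """Return the identity whose stored image correlates best with the probe.

    library is a list of (identity, yaw, image) tuples.
    "view_selective": only candidates at the probe's yaw are compared.
    "view_tolerant": candidates at the probe's yaw are excluded.
    """
    if mode == "view_selective":
        candidates = [entry for entry in library if entry[1] == probe_yaw]
    elif mode == "view_tolerant":
        candidates = [entry for entry in library if entry[1] != probe_yaw]
    else:
        raise ValueError(f"unknown mode: {mode}")
    identity, _, _ = max(candidates, key=lambda entry: pearson(probe, entry[2]))
    return identity


# Toy library: two identities at two yaw angles (hypothetical pixel values).
library = [
    ("A", 0, [1.0, 0.0, 0.0, 0.2]),
    ("B", 0, [0.0, 1.0, 0.0, 0.2]),
    ("A", 45, [0.9, 0.1, 0.3, 0.0]),
    ("B", 45, [0.1, 0.9, 0.3, 0.0]),
]
probe = [0.8, 0.1, 0.0, 0.3]  # a noisy frontal view of identity "A"

selective_choice = best_match(probe, 0, library, "view_selective")
tolerant_choice = best_match(probe, 0, library, "view_tolerant")
```

In this toy case both observers recover identity "A", but the view-tolerant observer does so from a strictly harder candidate set, which is the property whose psychological plausibility the reviewer questions.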

    2. Reviewer #2 (Public review):

      This study investigates the visual information that is used for the recognition of faces. This is an important question in vision research and is critical for social interactions more generally. The authors ask whether our ability to recognise faces, across different viewpoints, varies as a function of the orientation information available in the image. Consistent with previous findings from this group and others, they find that horizontally filtered faces were recognised better than vertically filtered faces. Next, they probe the mechanism underlying this pattern of data by designing two model observers. The first was optimised for faces at a specific viewpoint (view-selective). The second was generalised across viewpoints (view-tolerant). In contrast to the human data, the view-specific model shows that the information that is useful for identity judgements varies according to viewpoint. For example, frontal face identities are again optimally discriminated with horizontal orientation information, but profiles are optimally discriminated with more vertical orientation information. These findings show human face recognition is biased toward horizontal orientation information, even though this may be suboptimal for the recognition of profile views of the face.

      One issue in the design of this study was the lowering of the signal-to-noise ratio in the view-selective observer. This decision was taken to avoid ceiling effects. However, it is not clear how this affects the similarity with the human observers.

      Another issue is the decision to normalise image energy across orientations and viewpoints. I can see the logic in wanting to control for these effects, but this does reflect natural variation in image properties. So, again, I wonder what the results would look like without this step.

      Despite the bias toward horizontal orientations in human observers, there were some differences in the orientation preference at each viewpoint. For example, frontal faces were biased to horizontal (90 deg) but other viewpoints had biases that were slightly off horizontal (e.g. right profile: 80 deg, left profile: 100 deg). This does seem to show that differences in statistical information at different viewpoints (more horizontal information for frontal and more vertical information for profile) do influence human perception. It would be good to reflect on this nuance in the data.

      Comments on revisions:

      I am happy with the response and changes made in reply to the comments in my review. The key findings from this study are: (1) There is a bias toward the use of horizontal information across all viewpoints for face recognition in humans using an old-new recognition task. (2) In contrast, the optimal information for matching faces varies as a function of viewpoint. The view-selective model shows horizontal information is dominant for frontal views and vertical information is dominant for profile views.

      The data from the view-tolerant model is less easy to interpret as it doesn't fit with any theoretically plausible model of face recognition. It might be a useful model for a face matching task in which participants had to match unfamiliar faces across viewpoints. This might be a possible extension of the current work.

      Nonetheless, I still think this is an interesting contribution to the literature.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors address whether the theta/beta ratio (TBR) can be used as a clinical biomarker for ADHD.

      Strengths:

      The data were acquired independently from 2 separate datasets, and there are sufficient subjects for adequate statistical power. The authors applied up-to-date EEG data preprocessing, state-of-the-art feature extraction, and statistical analyses, using a multiverse approach. By testing and comparing all meaningful approaches, defined a priori in the previous meta-analysis, the authors convincingly demonstrate that TBR cannot be used as a clinical biomarker, and that previous positive results can be explained by interactions between different factors (alpha peak frequency, aperiodic component, age).

      Weaknesses:

      There are no apparent issues with data, separate datasets, large sample sizes, and state-of-the-art data analysis.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript examines whether the theta-beta ratio as derived from EEG data relates to ADHD diagnoses. To do so, it performs a multiverse analysis across a large number of analytical choices, applied to a large EEG dataset, and corroborated in an additional validation set. The results overall show that the TBR is not a reliable indicator of ADHD diagnosis. In discussing the patterns of results across analytical choices, the authors also demonstrate some key points about what appears to be driving the ratio measures, noting that significant results appear to be driven by choices regarding aperiodic-correction and the use of individualized alpha frequencies, suggesting TBR measures can be affected by these features rather than reflecting theta and/or beta activity.

      Strengths:

      This manuscript addresses a clearly posed and important question in the literature, addressing a longstanding discussion on the relationship between TBR and ADHD, and uses a large dataset and an expansive analysis approach to provide a definitive answer. The strengths of the approach allow for a clear answer, providing a notable contribution to the field.

      Weaknesses:

      I find no notable weaknesses in the current manuscript nor any major issues that I think challenge the key findings of this manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Strzelczyk, Vetsch, and Langer tackle an incredibly important question in clinical neuroscience: the use of the theta/beta ratio as a biomarker of attention deficit hyperactivity disorder (ADHD). The theta/beta ratio is argued to be so reliable as an ADHD biomarker that, in the United States, the Food and Drug Administration has approved its use as a biomarker for ADHD diagnosis. However, there is mounting evidence that the theta/beta ratio is likely not really measuring the relative power between two oscillations - the theta rhythm and the beta rhythm - but rather reflects differences in a singular, non-oscillatory aperiodic process. In this very convincing study, Strzelczyk and colleagues take a "multiverse" analysis approach to show that aperiodic activity differences between healthy controls and people with ADHD are driving the apparent theta/beta ratio differences. While in a vacuum, where a measure is a measure and if it's related to a diagnosis it's still useful no matter what, this distinction might not seem important, from a neuroscientific perspective this is a critical distinction, because the ratio between two oscillations has fundamentally very different underlying physiological mechanisms than aperiodic differences, and this framing has a major impact on guiding research on the diagnosis and treatment of ADHD.

      Strengths:

      While smaller studies and analyses have already hinted at similar results as shown here, the current study's multiverse analysis approach is comprehensive, convincing, and very well done. The large sample size of 1,499 participants is very impressive, as is the use of an independent validation sample of 381 participants.

      Overall, the technical and statistical aspects are very well done: the multiverse approach, the validation set, the resampling methods, and even the shiny apps. The authors should be applauded for being so thorough and making their data and analyses publicly accessible.

      Weaknesses:

      To be clear, I see no breaking weaknesses in the theoretical foundations, methods, statistical analyses, or interpretations.
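Reviewer #3's central point, that an apparent theta/beta ratio difference can arise from aperiodic activity alone, can be illustrated with a short numerical sketch. The spectra below are pure 1/f^exponent curves with no oscillatory peaks at all, yet a steeper exponent yields a larger ratio. The band edges (theta 4 to 8 Hz, beta 13 to 30 Hz), the exponent values, and all function names are assumptions chosen for illustration; the manuscript's multiverse analysis varies such choices, and this is not the authors' pipeline.

```python
def band_power(freqs, power, lo, hi):
    """Mean power over frequencies in the half-open band [lo, hi)."""
    values = [p for f, p in zip(freqs, power) if lo <= f < hi]
    return sum(values) / len(values)


def theta_beta_ratio(freqs, power, theta=(4.0, 8.0), beta=(13.0, 30.0)):
    """Ratio of mean theta-band power to mean beta-band power."""
    return band_power(freqs, power, *theta) / band_power(freqs, power, *beta)


def aperiodic_spectrum(freqs, exponent, offset=1.0):
    """Pure 1/f^exponent spectrum with no oscillatory peaks."""
    return [offset / (f ** exponent) for f in freqs]


# Frequencies from 1 to 40 Hz in 0.5 Hz steps.
freqs = [0.5 * k for k in range(2, 81)]

# Two synthetic spectra that differ only in aperiodic exponent.
tbr_flat = theta_beta_ratio(freqs, aperiodic_spectrum(freqs, exponent=1.0))
tbr_steep = theta_beta_ratio(freqs, aperiodic_spectrum(freqs, exponent=1.5))

print(f"TBR, exponent 1.0: {tbr_flat:.2f}")
print(f"TBR, exponent 1.5: {tbr_steep:.2f}")
```

Because neither spectrum contains a theta or beta oscillation, any group difference in this ratio here reflects the aperiodic slope, which is why separating periodic from aperiodic components matters for interpreting the measure.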

    1. Reviewer #1 (Public review):

      Summary:

      GPR52 is an orphan receptor implicated in neuropsychiatric disorders; however, the absence of tools capable of monitoring GPR52 activity in real time has stalled both mechanistic research and ligand discovery. This study addresses this gap by reporting the development of GPR52-1.0, a genetically encoded fluorescent sensor designed to detect activation of GPR52. The sensor was systematically engineered using the established GRAB platform, yielding a construct with micromolar sensitivity and high selectivity in cell culture. The authors largely achieve their stated aims; however, the biological relevance of their aims is unclear, as GPR52 is reported to be a constitutively active receptor (PMID: 32076264, PMID: 26384023). GPR52-1.0 is a validated, specific, and sensitive sensor that functions in vitro and ex vivo. The claim that electrically stimulated endogenous GPR52 ligand release occurs in the striatum is supported by the specificity of the GPR52 antagonist block in ex vivo brain slices; however, once again this aim is clouded by evidence that GPR52 is constitutively active. The sensor is presented as a tool for future deorphanization; however, this assumes that the physiological ligand is an agonist, which is unclear based on the evidence that GPR52 is constitutively active. If the authors can explain or adapt their experiments and manuscript in the context of GPR52 constitutive activity, this work will be useful to the community. The impact of this work is likely to be moderate to high within the specialized communities studying orphan GPCRs, neuronal signaling, and neuropsychiatric disease. The GRAB sensor strategy has already generated widely adopted tools for other receptors, and a validated GPR52 sensor would fill a genuine gap. The GRAB technology makes GPR52-1.0 directly applicable to in vivo studies. It is likely that the approach used for GPR52-1.0 could be replicated for other orphan receptors to facilitate their deorphanization.

      Strengths:

      (1) Systematic and rigorous sensor optimization and characterization by screening ~800 variants with iterative linker and cpEGFP mutation steps. The resulting EC50 values are characterized in HEK293T cells and cultured neurons.

      (2) Testing GPR52-1.0 against a broad panel of neurotransmitters with no detectable off-target activation strengthens confidence in sensor specificity.

      (3) The use of a selective antagonist to confirm specificity, both in cell lines and in brain slices, strengthens the conclusions significantly.

      (4) Electrically stimulated GPR52-1.0 fluorescence changes in ex vivo striatal slices are blocked by a GPR52 antagonist. This is the most biologically significant result in the manuscript, as GPR52-related diseases can involve the striatum.

      Weaknesses:

      (1) The work, both experimentally and in its presentation, is not put into the context of what is known about GPR52 pharmacology and signaling. It is reported by multiple groups that GPR52 has high constitutive activity and does not require a ligand for high levels of signaling (PMID: 32076264, PMID: 26384023). The authors should clarify whether GPR52-1.0 senses constitutive activation and whether baseline fluorescence is stable over the timescale of their experiments. The cell and mouse work needs to be reframed and conducted in the context of the high basal activity of the receptor, or the authors need to explain the differences between their study and other studies.

      (2) The electrical stimulation used in brain slice experiments is non-specific. This could be activating many cell types and neurotransmitter systems simultaneously. The pharmacological block by the GPR52 antagonist is reassuring, but the identity of the molecules driving the signal remains unknown. It could be that GPR52 is constitutively active, and that the electrical stimulation drives higher expression of GPR52 and thus constitutive signaling. This constitutive signaling can then be inhibited by the GPR52 antagonist. In this scenario, there would be no endogenous GPR52 agonist invoked by electrical stimulation.

      (3) The ex vivo brain slice data rely on n=9 slices without reporting the number of animals that the slices come from. Given the importance of this result, more biological replicates and clear reporting of animal numbers would strengthen confidence.

      (4) The manuscript does not benchmark GPR52-1.0 against existing approaches (e.g., HTRF, BRET, or calcium mobilization assays) to contextualize its advantages in a drug-discovery or screening workflow.

      (5) The paper's title references deorphanization, but the authors have made no attempts toward this deorphanization. No candidate ligand molecules are identified or tested.

    2. Reviewer #2 (Public review):

      Summary:

      This study describes the development of GPR52-1.0, a novel genetically encoded fluorescent sensor for the orphan GPCR, GPR52. The authors also utilized this sensor ex vivo in brain slices and discovered that striatal neuron excitation may activate GPR52.

      Strengths:

      (1) The design and validation of the sensor are elegant, thorough, and rigorous. The authors conducted a systematic and impressive optimization screen of numerous variants to arrive at the top-performing GPR52-1.0 sensor. The subsequent characterization is thorough, showing excellent membrane trafficking, appropriate pharmacological profiles (EC50, IC50) by the GPR52 chemical agonist/antagonist, rapid kinetics, and high specificity against a panel of common neurotransmitters. The functional characterization was also performed in multiple experimental systems.

      (2) The most exciting result is the observation that electrical stimulation may activate GPR52 in the striatum, an area where GPR52 is natively expressed. The blockade by a specific GPR52 antagonist confirms its specificity and provides the first direct evidence for an activity-dependent, native GPR52 ligand in the striatum. This finding alone is a significant step forward and strongly justifies the sensor's development.

      (3) The manuscript is well-written and logically structured. The figures are clear and effectively illustrate the key data, from the initial screening process to the final ex vivo validation. The authors did not overstate their discoveries.

      Weaknesses:

      (1) The evidence for sensor specificity is largely based on a single agonist/antagonist pair. It would be desirable for future studies to confirm this with additional agonists/antagonists or by point mutagenesis at sites known to influence GPR52 activation (for example, the ones reported in PMID: 40087539).

      (2) The discovery of activity-dependent, native GPR52 ligand(s) in the striatum is extremely exciting. This might be further strengthened by inhibiting synaptic transmitter release with TTX, calcium channel blockers, SNARE complex disruptors, etc.

    1. Reviewer #1 (Public review):

      Summary:

      The authors attempt to use a combination of behavioural and EEG analyses in order to investigate whether expectation of task difficulty influences spatial focus narrowing in the context of a spatially cued task, alongside an expected attention-related amplitude effect. This distinguishes the experiment from previous tasks which looked at this potential spatial narrowing in the context of more non-cued diffuse attention tasks. The authors present 2 major findings.

      (1) Behaviourally, they analysed the effects of cue validity and difficulty expectation on response accuracy and found that participants displayed an effect of difficulty expectation in validly cued trials, showing relatively enhanced behaviour to Hard Expectation trials, but no effect of expectation in invalidly cued trials.

      (2) Inverted encoding modelling on broadband EEG showed greater pre-target attentional processing in the Hard Expectation blocks. They go on to show that this enhancement comes in the form of greater amplitude of the Channel Tuning Functions (CTFs) approximately 300 to 400ms post-cue, in the absence of any spatial tuning specificity enhancement (as would be evident in a difference in CTF fit width).

      Together these results provide valuable findings for those investigating the separable effects of expectation and attention on target detection in visual search.

      Strengths:

      (1) This is a very solidly performed experiment and analysis, with different streams of evidence convincingly pointing in the same direction, i.e. a gain effect of Expectation in the absence of a spatial tuning effect.

      (2) EEG is competently analysed and interpreted, and the paper is well written, and simple in its motivation.

      (3) The authors report appropriately on the results in the Discussion, without overreaching.

      Comments on revised version:

      The authors have addressed all of my comments. Very interesting work, thank you!

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to determine whether people can adjust how narrowly or broadly they focus attention in advance based on expectations about how difficult an upcoming visual task will be. Specifically, they aimed to test whether expecting a more demanding search leads to a narrower focus of attention or instead strengthens attention at the relevant location without changing its spatial extent.

      Strengths:

      The study addresses a timely and interesting question about how expectations influence the preparation of attention before a task begins. The experimental design is well suited to isolating anticipatory effects by manipulating expectations about task difficulty independently of moment-to-moment stimulus information. The manuscript is clearly written, and the methods are described in sufficient detail to support transparency and reproducibility.

      Comments on revised version:

      During the review process the authors addressed my previous concerns. The revisions have improved the clarity of the analyses and the interpretation of the results, and I have no further substantive comments.

    1. Reviewer #1 (Public review):

      The manuscript, titled "Sleep-Wake Transitions Are Impaired in the AppNL-G-F Mouse Model of Early Onset Alzheimer's Disease", reports a study of sleep/wake phenomena in a knock-in mouse strain carrying "three mutations in the human App gene associated with elevated risk for early onset AD". Traditional, in-depth characterization of sleep/wake states, EEG parameters, and response to sleep loss is employed to provide evidence "supporting the use of this strain as a model to investigate interventions that mitigate AD burden during early disease stages". The sleep/wake findings of earlier studies (especially Maezono et al., 2020, as noted by the authors) were extended by several important genotype-related observations, including age-related hyperactivity onset that is typically associated with increased arousal, a normal response to loss of sleep and to multiple sleep latency testing, and a stronger AD-like phenotype in females.

      The authors conclude that the AppNL-G-F mice demonstrate many of the human AD prodromal symptoms and suggest that this strain may serve as a model for prodromal AD in humans, confirming the earlier results and conclusions of Maezono, et al. Finally, based on state bout frequency and duration analyses, it is suggested that the AppNL-G-F mice may develop disruptions in mechanism(s) involved in state transition.

      The study appears to have been technically rigorous, with a high-quality, in-depth traditional assessment of both state and EEG characteristics, complemented by concurrent measurements of activity and temperature.

      The major strengths of this study derive from observations that the AppNL-G-F mice: 1) are more hyperactive in association with decreased transitions between states; 2) maintain a normal response to sleep deprivation and have normal MSLT results; and 3) display a sex specific, "stronger" insomnia-like effect of the knockin in females.

      The weaknesses stem from the study's impact being limited because it is largely confirmatory of the Maezono et al. study, with advances of importance to a potentially more focused field. Further, the authors conclude that AppNL-G-F mice have disrupted mechanism(s) responsible for state transition; however, these were not directly examined. The rationale for this conclusion is stated by the authors as based on the observations that bouts of both W and NREM tend to be longer in duration and decreased in frequency in AppNL-G-F mice. Although altered mechanism(s) of state transition (it is not clear what mechanisms are referenced here) cannot be ruled out, other explanations require careful consideration. It is acknowledged in the discussion that increased arousal in association with hyperactivity would be expected to result in increased duration of W bouts during the active phase. This would also predictably result in greater sleep pressure that is typically associated with more consolidated NREM bouts, consistent with the observations of bout duration and frequency. The results from the MSLT tests and lack of increased EEG slow wave activity are problematic to interpret in the context of increased arousal (evidenced by the hyperactivity) since these phenomena, known to be enhanced in association with increased sleep pressure, may be masked by arousal (or by some other effect of the altered genotype). Perhaps the effect on consolidation is less sensitive. Thus, understanding the underlying mechanism(s) involved is needed for conclusion(s) about sleep pressure.

      Overall, this study's findings are valuable but, with respect to the claims, incomplete.

    2. Reviewer #2 (Public review):

      Summary:

      Overview of questions being answered and study design: The authors have used a knock-in mouse model to explore late in life amyloid effects on sleep. This is an excellent model as the mutated genes are regulated by the endogenous promoter system. The sleep study techniques and statistical analyses are also first rate.

      The group finds an age-dependent increase in motor activity in advanced age in the NLGF homozygous knock-in mice (NLGF), with a parallel age-dependent increase in body temperature; both effects predominate in the dark period. Interestingly, the sleep patterns do not quite follow the activity and temperature changes. Wake time is increased in NLGF mice and there is no progression in increased wake over time. NREMS and REM sleep are both reduced and there is no progression. Sleep-wake effects, however, show a robust light:dark effect with larger effects in the dark period. These findings support distinct effects of this mutation on activity and temperature and on sleep. This is the first description of the temporal pattern of these effects. NLGF mice show wake stability (longer bout durations in the dark period, their active period) and fewer brief arousals from sleep. Sleep homeostasis across the lights on period is normal. Wake power spectral density is unaffected in NLGF mice at either age. Only REM power spectra are affected, with NLGF mice showing less theta and more delta. There are interesting sex differences, with females showing no gene difference in wake bout number, while males show a gene effect. Similarly, gene effects on NREM bout number seem larger in males than in females. Although there was no difference in homeostatic response, there was normalization of sleep-wake activity after sleep deprivation.

      Strengths:

      Approach (model, extent of sleep phenotyping), analysis

      Weaknesses:

      Summarized below. Viewed as "addressable."

      (1) The term insomnia. Insomnia is defined as a subjective dissatisfaction with sleep, and that cannot be ascertained in a mouse model. The findings across baseline sleep in NLGF mice support increased wake consolidation in the active period. The predominant sleep period (lights on) is largely unaffected, and the active period (lights off) shows increased activity and increased wake with longer bouts. There is a fantastic clue where NLGF effects are consistent with increased hypocretinergic (orexinergic) neuron activity in the dark period, and/or increased drive to hypocretin neurons from PVH.

      (2) Sleep-wake transitions are impaired: This should not be termed an impairment. It could actually be beneficial to have greater state stability, especially wake stability in the dark or active period. There is reduced sleep in the model that can be normalized by short-term sleep loss. It is fascinating that recovery sleep normalized sleep in the NLGF mice in the immediate lights on and lights off periods. This is a key finding.

      Comments on revised version:

      An important point has been missed but otherwise authors have been responsive:

      The predominant sleep (lights on) period for APPnlgf mice has too few abnormalities to warrant "insomnia" as the descriptor, and this is an important point. Traditionally in dementias, there has been an emphasis on studying insomnia, as sleep is important for brain health and the night disturbances disturb caregivers as well. A point that is not clearly emphasized, however, is that this work is consistent with a new consideration in Alzheimer's and dementia sleep research: early in disease there may be hyperactivity of wake-promoting neurons (orexin or locus coeruleus neurons) that contributes to the phenotype (perhaps as "sundowning", agitation in the wake periods) and is also important to understand. Thus, it should be at least acknowledged that this may represent abnormal wake rather than a primary sleep abnormality. There is a new preprint by the Weinshenker group that demonstrates increased locus coeruleus activity in a tau model.

    3. Reviewer #3 (Public review):

      Summary:

      In this study, Tisdale et al. studied the sleep/wake patterns in the biological mouse model of Alzheimer's disease. The results of this study, together with the established literature on the relationship between sleep and Alzheimer's disease progression, led the authors to propose this mouse model for the mechanistic understanding of sleep states that translates to Alzheimer's disease patients. However, the manuscript currently suffers from a disconnect between the physiological data and the mechanistic interpretations. Specifically, the claim of "impaired transitions" is logically at odds with the observed increase in wake-state stability or possible hyperactivity. Additionally, the description of the methods, quantification, and figure presentation need substantial improvement. Without going over all the flaws and ways to improve the paper, I am pointing out some of my concerns below.

      Strengths:

      Selection of the knock-in model is a notable strength as it avoids the artifacts associated with APP overexpression and more closely mimics human pathology. The study utilizes continuous 14-day EEG recordings, providing a unique dataset for assessing chronic changes in arousal states. The assessment of sex as a biological variable identifies a more severe "insomniac-like" phenotype in females, which aligns with the higher prevalence and severity of Alzheimer's disease in women.

      Weaknesses:

      The study seems to lack a clear hypothesis-driven approach and relies mostly on exploratory investigations. Moreover, the lack of quantitative analytical methods, together with shaky logical conclusions that may not be supported by the data in their current form, leaves room for major improvement.

      Since this paper studied sleep states, the "Methods" section is quite unclear on what specific criteria were used to classify sleep states. There is no quantitative description of classifying sleep based on clear reproducible procedures. There are many reasonably well characterized sleep scoring systems used in rat electrophysiological literature which could be useful here. The authors are generally expected to describe movement speed and/or EMG and/or EEG (theta/delta/gamma) criteria used to classify these epochs. The subjective (manual) nature of this procedure provides no verifiable validation on accuracy and interpretability regarding the results.
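
      As a point of reference for the kind of quantitative, reproducible criteria requested above, a rule-based scorer can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' procedure: the function name `score_epoch`, the feature names, and both thresholds are hypothetical and would need to be calibrated per animal and recording setup.

```python
def score_epoch(emg_rms, delta_power, theta_power,
                emg_threshold=1.0, theta_delta_threshold=2.0):
    """Classify one epoch as Wake, NREM, or REM from three summary features.

    All thresholds are illustrative placeholders, not published values.
    """
    # High muscle tone is treated as wakefulness regardless of the EEG.
    if emg_rms >= emg_threshold:
        return "Wake"
    # Low muscle tone with theta-dominated EEG is treated as REM sleep.
    if delta_power > 0 and theta_power / delta_power >= theta_delta_threshold:
        return "REM"
    # Low muscle tone with delta-dominated EEG is treated as NREM sleep.
    return "NREM"


# Hypothetical epochs: (EMG RMS, delta power, theta power), arbitrary units.
epochs = [(2.5, 1.0, 1.2), (0.3, 4.0, 1.0), (0.2, 1.0, 3.0)]
hypnogram = [score_epoch(*features) for features in epochs]
print(hypnogram)
```

      Reporting explicit rules of this kind, even when scoring is subsequently checked by eye, would let readers reproduce the state classification.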

      One of the bigger claims is that "state transition mechanism(s)" are impaired. However, Figure 7 shows that model mice exhibit significantly more long wake bouts (>260s) and fewer short wake bouts (<60s). Logically, an "impaired switch" (the flip-flop model, Saper et al., 2010) results in state fragmentation. The data here show the opposite: the wake state has become too stable. This suggests the primary defect is not in the transition mechanism itself, but possibly in a pathological increase in arousal drive (hyper-arousal), likely linked to the dark-phase hyperactivity shown in Figures 4 and 5. Also of note, this finding is not new.
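
      The bout-duration argument above can be made concrete with a short Python sketch that extracts bouts from a scored hypnogram and counts short (<60 s) and long (>260 s) wake bouts. The 10 s epoch length, the state labels, and the toy hypnogram are assumptions for illustration and are not taken from the manuscript.

```python
from itertools import groupby


def bout_durations(hypnogram, state, epoch_s=10):
    """Return the duration in seconds of each uninterrupted run of `state`."""
    return [
        sum(1 for _ in run) * epoch_s
        for label, run in groupby(hypnogram)
        if label == state
    ]


# Toy hypnogram: W = wake, N = NREM, R = REM (one label per 10 s epoch).
toy = ["W"] * 3 + ["N"] * 5 + ["W"] * 30 + ["R"] * 2 + ["W"] * 1
wake_bouts = bout_durations(toy, "W")

short_bouts = sum(d < 60 for d in wake_bouts)   # brief wake bouts
long_bouts = sum(d > 260 for d in wake_bouts)   # consolidated wake bouts
print(wake_bouts, short_bouts, long_bouts)
```

      Under this framing, a fragmented (unstable) switch would raise the short-bout count, whereas a hyper-stable wake state would raise the long-bout count.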

      Figure 3 heatmaps lack color bars and units. As per eLife standards, spectral power must be quantitatively defined and methods well explained in the Methods section. Without these, the reader cannot discern if the "reduced power" in females is a global suppression of signal or a frequency-specific shift. Additionally, the representative example used to claim shorter sleep bouts lacks the statistical weight required for a major physiological conclusion. How does cooler color (not clear what range and what the interpretation is) mean shorter sleep bout in female mice? Authors should clearly mark the frequency ranges that support their claims. In this figure, there is a question mark following theta/delta range. Authors should avoid speculation and state their claims based on significant results. Please, also add the theta and delta ranges in the plot such that readers can draw their own conclusions.

      Figure 8 and the MSLT results show that model mice are "no sleepier than WT mice" and have a functional homeostatic rebound. This presents a logical flaw in the "insomnia" narrative. True insomnia in AD patients typically involves a failure of the homeostatic process or a debilitating accumulation of sleep debt. If these mice do not show increased sleepiness (shorter latency) despite ~19% less sleep, the authors might be describing a "reduced need" for sleep or a "hyper-aroused" state, possibly not a clinical insomnia phenotype.

      In Figure 9, showing and comparing LFP power in percentages is problematic, as the LFP power distribution is known to be skewed (it follows a power law). This is particularly problematic here because all the frequencies above ~20 Hz seem to be totally flattened or nonexistent, which makes this comparison of power severely limited and biased towards the relative frequency in the highly skewed portion of the LFP power spectrum, i.e., very low frequency ranges like delta, theta, and possibly beta. This ignores low, mid, and high gamma as well as ripple band frequencies. NREM sleep is known to have relatively greater ripple band (100-250 Hz) power bursts in hippocampal regions, and REM sleep is known to have synchronous theta-gamma relationships.
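
      The concern about percentage power can be illustrated with an idealized spectrum. The sketch below assumes a pure power-law spectrum, S(f) = 1/f^alpha with alpha = 2, and conventional but hypothetical band edges; none of these values come from the manuscript, and real LFP spectra include oscillatory peaks on top of this background.

```python
def relative_band_power(bands, alpha=2.0, f_min=0.5, f_max=250.0, df=0.5):
    """Percent of total power per band for an idealized 1/f**alpha spectrum."""
    n_bins = int(round((f_max - f_min) / df)) + 1
    freqs = [f_min + i * df for i in range(n_bins)]
    power = [1.0 / (f ** alpha) for f in freqs]
    total = sum(power)
    return {
        name: 100.0 * sum(p for f, p in zip(freqs, power) if lo <= f < hi) / total
        for name, (lo, hi) in bands.items()
    }


# Illustrative band edges in Hz (assumed, not the authors' definitions).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 100.0),
    "ripple": (100.0, 250.0),
}

percent = relative_band_power(BANDS)
for name, value in percent.items():
    print(f"{name:>6}: {value:6.2f} % of total power")
```

      With a spectrum of this shape, the lowest band accounts for most of the total, so gamma and ripple bands appear flat on a linear percentage axis even when they differ between groups. Plotting on a logarithmic power axis, or normalizing within each band, avoids this compression.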

      Comments on revised version:

      The revised manuscript has made some improvements, specifically in the presentation of results and in the revised title. More broadly, however, the authors have failed to address most of the concerns raised in the original review. For example, the sleep scoring system is still subjective, without any quantifiable and reproducible criteria. Another instance concerns the comments on Figure 9, where the authors failed to address any of the raised concerns and reiterated their results. Hence, in its current form, the results in the paper are incomplete, with only partial support from the methods and evidence.

    1. Reviewer #1 (Public review):

      Freas and Wystrach present a computational and experimental study of ant navigation. The main innovation of the computational model is the insertion of an oscillatory element between the steering signal and the motor control that results in a trajectory whose heading oscillates around a goal direction. Additionally, the model imposes periodic cessations of forward movement and inversely couples rotational speed to forward velocity. As a result the model periodically makes larger reorientations reminiscent of those seen in behaving ants.

      The behavioral data consists of two experimental sets: experienced Melophorus bagoti foragers, recorded in 2010, and inexperienced M. bagoti foragers, recorded in 2023-2024 at the same site. The behavioral data is qualitatively compared to the model in Figures 3 through 6. In Figures 3-5, all ant sets are grouped together, while in Figure 6 they are separated. In Figure 6, the authors should do a careful job of making sure the reader is aware that comparisons are being made between behavioral data sets captured more than a decade apart and of justifying the validity of a quantitative comparison between these sets.

      The manuscript also describes Myrmecia ants and makes comparisons between modeled Myrmecia ants and supplemental videos of these ants (Videos 3,4). These videos are not described in the methods. While the captions describe these as ants "homing in an unfamiliar environment," the videos show tethered ants walking on a ball. Without more information and absent any analysis, it is difficult for me to understand how these videos support granular points in the text about coupling between rotation and forward velocities.

      Strengths:

      The manuscript's main thesis, that an oscillatory element interspersed between the control signal and the motor unit can reproduce aspects of ant navigation, appears supportable.

      Weaknesses:

      Qualitative agreement between aspects of a model and aspects of a behavioral measurement does not prove the correctness of a model. In the section (802), "An ancestral design? Striking parallels with crawling Drosophila larvae," the authors argue that behavioral data in larvae support their model, despite the larva's lack of a (known) central complex. C. elegans navigation can also be segmented into longer runs and shorter exploratory behaviors (Chen 2025), comparable to the runs and scans described here. C. elegans definitively does not have a central complex. In general, multiple internal mechanisms are capable of producing the same macroscopic behavioral outcome. This fact limits the ability of behavioral data to confirm the details of a particular model; it does not imply that observation of similar behaviors in multiple species shows that a particular model is correct or generalizable.

      Here the ability of the behavioral data to confirm or constrain the model is further limited by the qualitative nature of the comparisons. Some of the comparisons are trivial (e.g. Figure 5E-F: any first order process will produce a Poisson distribution, and in the model a Poisson process was explicitly coded in with parameters chosen (1070) to match the behavioral data). Finally, the number of adjustable parameters (13) is comparable to the number of comparisons made; it is unclear that the model could not be adjusted to fit any set of behavioral measurements.
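
      The point about Figure 5E-F can be illustrated with a generic simulation: any process with a constant per-time-step event probability yields approximately exponential inter-event intervals, whatever the underlying mechanism. The sketch below is a hedged illustration in Python; the rate, time step, and function name are hypothetical and are not the parameters of the authors' model.

```python
import math
import random


def simulate_intervals(rate_hz, n_events, dt=0.01, seed=0):
    """Inter-event intervals when every time step has the same event probability."""
    rng = random.Random(seed)
    p_event = rate_hz * dt  # constant per-step probability (memoryless process)
    intervals = []
    for _ in range(n_events):
        steps = 1
        while rng.random() >= p_event:
            steps += 1
        intervals.append(steps * dt)
    return intervals


intervals = simulate_intervals(rate_hz=2.0, n_events=5000)
mean_interval = sum(intervals) / len(intervals)
# For an exponential distribution, exp(-1) of the intervals exceed the mean.
frac_longer_than_mean = sum(t > mean_interval for t in intervals) / len(intervals)
print(f"mean interval: {mean_interval:.3f} s")
print(f"fraction longer than mean: {frac_longer_than_mean:.3f} "
      f"(exponential prediction: {math.exp(-1):.3f})")
```

      Because the interval distribution follows directly from the constant-rate assumption, agreement between such a model and the data on this measure alone carries little evidential weight about the circuit that generates the events.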

      While the introduction is improved, there is still room to eliminate confusion as to what aspects of the model reflect hypothesized rather than measured neural circuits. For instance, if there is data showing LAL oscillations in insects, the authors should cite it and call it out clearly. Alternately they should say that the oscillator is hypothesized based on measured bistability. They should also clarify whether they are discussing neural oscillations or motor oscillations and whether these oscillations are measured, modeled, or hypothesized.

      As one example: Lines 283-284 "This oscillator [referring to the model's intrinsic oscillator described in the previous paragraph], which is widespread in insects (Cheng, 2024; Kanzaki, 2005; Kanzaki and Mishima, 1996), resides in the lateral accessory lobes (LAL)" reads as though it is known that a neural oscillator occupies the LAL. Cheng 2024 is a brief review of behavioral oscillation. Kanzaki et al. 2005 describes numerical modeling and simulation with a physical robot. Kanzaki and Mishima, 1996 demonstrates bistability (flip-flopping) in moth descending neurons. None of these show neural oscillations and none of them describe the LAL. The authors should review the paper and be scrupulously careful that the claims made in the text are supported in the cited references. These difficulties were pointed out in a previous round of review; hopefully they can be fully corrected this time.

      Kevin S. Chen, Jonathan W. Pillow*, Andrew M. Leifer*, "State-switching navigation strategies in C. elegans are beneficial for chemotaxis," arXiv:2508.00191 31 July 2025.

    2. Reviewer #2 (Public review):

      The paper by Freas and Wystrach is an interesting computational study, exploring the detailed mechanisms of how simple neural circuits could explain complex behavioral patterns observed in navigating ants. The authors compare detailed, high speed video recordings of Australian desert ants (Melophorus bagoti) with predictions made by their new computational model and find convincing similarities between the model and the behavioral data, at a level of detail not previously studied. Particularly interesting are emerging properties of the model, yielding behavioral motifs it was not designed to reproduce, but which occur in natural ant behavior.

      A strength of the study is that the model is based on previous models, without making major novel assumptions. It combines existing models of the insect central complex with a model of the lateral accessory lobe and adds a stochastic inhibition of forward velocity to the interaction of central complex and lateral accessory lobes. In essence, the central complex provides corrective steering signals when the goal direction and the current heading of the insect are not aligned, while the lateral accessory lobes provide an intrinsic oscillator underlying the behavioral oscillations shown by walking ants at all times. These background oscillations are modulated by the steering signals from the central complex. Depending on which phase of the intrinsic oscillations coincides with the corrective signals, and how fast the ant is moving forward during this time, a complex set of behaviors emerges.

      Most prominently, scanning behaviors, which are regularly carried out by the ants, are recapitulated in great detail by the model. Additionally, other behaviors, such as full loops, emerge naturally from the model. While computational models are not to be seen as definite evidence for any biological reality, they can provide strong support for particular neural implementations. The current study is an excellent example in that it provides evidence for a serial arrangement of central complex circuits upstream of the lateral accessory lobe circuits, modulated by speed regulating input. While the latter is hypothetical, it yields a clear hypothesis that can be validated by connectomics studies and functional work in the future.

      The computational model is explained in detail and information about all model parameters is provided in an accessible way. The approach is thus transparent and reproducible, leaving it to the readers to assess the assumptions made in the model and how the studied complex behaviors emerge. This also provides the possibility to combine this new model with existing models to expand the scope and to more comprehensively capture the behavioral repertoire of ants, and insects in general.

      Importantly, the study shows that even complex behavioral motifs do not require dedicated neural modules, but can rather emerge from the interplay of already known circuits - highlighting the efficiency of insect brains and possibly providing the path towards embodied hardware solutions of such circuits in autonomous agents.

    1. Reviewer #1 (Public review):

      Overview:

      This study examines cellular computations in the dendrites of neurons in the medial superior olive (MSO) required for computing sound location based on interaural time differences (ITD). This field had, for many decades, depended on the so-called Jeffress model, which stated that an array of binaural coincidence detector neurons fire only when a given sound lateralization is balanced by a given difference in presynaptic axonal conduction time. The apparent absence of such calibrated axonal delay lines has left the field with little mechanistic handle for the strong ITD computations in MSO. This study suggests that dendritic delay along the dendrites of the bipolar MSO neurons makes a significant contribution to a calibrated delay line.

      Strengths:

      The authors used a combination of in vitro patch-clamp recordings, morphological analysis of a large dataset, and computational modelling to gain experimental access to dendritic computations. A technical tour-de-force set of distal dendritic patch-clamp recordings allowed an evaluation of this otherwise inaccessible parameter, and detailed modeling based on large datasets revealed the functional consequences. The use of this broad methodological toolbox enabled a detailed study of dendritic integration in MSO neurons and revealed a prominent role for graded variation in dendrite structure in shaping the coincidence detection in MSO neurons. In addition, the modeled effects of synaptic inhibition were quite striking and shaped our understanding of ITD coding in the MSO.

      Weaknesses:

      The paper's organization does not set up the reader very well for the major point to be made about exactly how dendritic asymmetry could bias ITD curves. This point only arises later in the paper after discussion of uncorrelated physiological measures that merely hint that what is important is "larger morphological and electrotonic structure". The paper could also benefit from a more complete description of the methodology. As an example, bridge balance goes unmentioned, and series resistance is hardly mentioned, even though both could distort the measurements of simulated EPSP amplitudes made through tiny electrodes used for dendrite recording.

    2. Reviewer #2 (Public review):

      Medial superior olivary neurons are sensitive to interaural time differences in the microsecond range, and many cellular mechanisms have been advanced to explain this temporal sensitivity. This study provides experimental and computational evidence for a new mechanism in which a range of asymmetric dendritic delays permits individual MSO neurons to represent the full range of biologically relevant ITDs. Using elegant 2-photon guided simultaneous recordings from distal dendrite and soma, along with compartmental modeling on anatomically reconstructed neurons, the authors provide compelling evidence that this mechanism contributes to microsecond-level tuning. The experimental design, analyses, and narrative are all well-crafted. It's a beautiful study. As outlined below, I have two general questions about interpretations drawn from the experimental data and modeling.

      (1) Both excitatory and inhibitory synapses on MSO neurons display significant short-term depression (Couchman et al., 2010). Given the amount of attenuation at the soma, the role that the distal inputs would play after stimulus onset has not been tested. Were simulated EPSC pulse trains with endogenous short-term plasticity kinetics injected into distal dendrites? If not, were EPSP and IPSP trains with endogenous short-term plasticity kinetics studied in the model? The fundamental question is how much distal synapses contribute to somatic spike initiation as a function of synaptic pulse number.

      (2) The model provides a credible line of evidence that synaptic inputs from distal and tertiary compartments can generate reliable increases in the time of arrival at the soma. It would be relatively simple to sequentially prune dendritic compartments to address how the time difference at which the firing rate is maximal scales with tertiary or distal compartments. Similarly, one could eliminate the primary dendrites to determine whether or not they play a functional role. I would expect these chores to be largely confirmatory, but since EPSP delay and amplitude are convolved, it would increase confidence in the interpretation.

      (3) Two technical questions. First, the age range is fairly broad, and it is not clear at which ages the experimental recordings were obtained, especially for the key experimental graphs that show correlations between delay (Figure 1d) or tau (Figure 2e) and distance. In addition, age could be added to Supplementary Figure 1, and the data could be ordered from youngest to oldest. Second, the Methods section indicates that brain slices were gradually cooled to 25 °C, but should specify whether or not the recordings were obtained at this temperature.

    3. Reviewer #3 (Public review):

      Summary:

      The study addresses how mammalian medial superior olive (MSO) neurons generate the internal delays required for interaural time difference (ITD) coding and sound localization. The authors demonstrate that dendritic morphology, particularly asymmetry between lateral and medial dendritic arbors, contributes to differential EPSP propagation delays and thereby shifts the optimal ITD of individual MSO neurons, using two-photon-guided paired dendritic and somatic recordings with compartmental modeling. This is a strong and potentially impactful manuscript. The work provides compelling evidence that dendritic morphology contributes to coincidence detection and ITD tuning in MSO neurons.

      Strengths:

      A major strength of the study is its technically rigorous combination of experimental electrophysiology, detailed neuronal reconstructions, and computational modeling. The use of paired dendritic and somatic recordings provides direct physiological insight into EPSP propagation, while the modeling approach allows the authors to test how cell-specific morphology influences coincidence detection. The analysis of multiple reconstructed MSO neurons further supports that dendritic asymmetry generates differential EPSP propagation delays that contribute to ITD tuning. This is a novel and potentially important mechanism that may complement classical axonal delay-line models. The study is strong in its anatomical and electrophysiological approach.

      Weaknesses:

      No major weakness. However, some aspects of the methods and interpretation would benefit from clarification. First, the assumptions used in the compartmental models should be more explicitly described, including the distribution of glutamatergic synaptic inputs and synaptic conductance parameters. It would be useful to clarify whether excitatory inputs were assumed to be homogeneously distributed along primary and higher-order dendritic branches or assigned based on known MSO input organization. Anatomical validation using VGluT staining together with dendritic labeling could strengthen the physiological relevance of the modeled input patterns. Second, the morphological analysis is informative, but additional measures of dendritic complexity could further support the conclusions. In addition to path length and membrane surface area, analyses of primary neurite number, branch points, and terminal arbors, using Sholl profiles or fractal dimension, could provide a more comprehensive assessment of lateral-medial dendritic asymmetry.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The previous concerns have been addressed.]

      The central pair apparatus of motile cilia consists of two singlet microtubules, termed C1 and C2, each of which is associated with a set of projections, referred to as the C1 and C2 projections. Each projection comprises multiple distinct structural domains, designated a, b, c, and so on. Biochemical studies combined with genetic analyses in Chlamydomonas identified three proteins as the major components of the C2a projection, and subsequent cryo-EM studies confirmed these findings.

      In this paper, the authors aim to study the homologues of these three proteins (CCDC108/CFAP65, CFAP70, and MYCBPAP/CFAP147) using knockout mouse models. Biochemical and cell biological analyses demonstrate that, as in Chlamydomonas, these proteins are components of the C2 projection and form a complex that depends on the presence of each other. In addition, the authors use affinity purification to identify two previously uncharacterized proteins and show that they are central pair apparatus proteins that associate with the aforementioned complex. Knockout mice lacking any of the three core proteins exhibit phenotypes consistent with primary ciliary dyskinesia (PCD).

      Overall, the manuscript is clearly written, and the data are convincing and support the authors' conclusions. However, given the previous findings in Chlamydomonas, this work provides limited conceptual advances to the field. Nonetheless, it represents a useful and well-documented resource for understanding the conserved organization of the central pair apparatus in motile cilia. It will be of interest to cell and developmental biologists, biochemists, and clinicians studying and treating human ciliopathies.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript investigates the protein composition and functional role of the C2a projection of the central apparatus (CA) in vertebrate motile cilia. Using three knockout mouse models (Ccdc108, Mycbpap, and Cfap70), the authors demonstrate that these genes - homologs of Chlamydomonas FAP65, FAP147, and FAP70 - are required for normal motile cilia function in ependymal and tracheal multiciliated cells. Specifically, the authors show that:

      (1) Knockout mice for each gene exhibit primary ciliary dyskinesia phenotypes (hydrocephalus and sinusitis), accompanied by abnormal ciliary motion and reduced ciliary beat frequency.

      (2) CCDC108, MYCBPAP, and CFAP70 physically interact and localize to the axonemal central lumen, consistent with the C2a projection.

      (3) Loss of any one of these proteins destabilizes the others and disrupts CA integrity in a tissue-specific manner.

      (4) ARMC3 and MYCBP are C2a-associated proteins.

      Strengths:

      (1) Clarity: the results are presented in a coherent sequence that facilitates understanding of both the rationale and conclusions.

      (2) Genetic rigor: three independent knockout mouse lines that exhibit consistent motile cilia phenotypes provide in vivo support for the proposed role of these proteins.

      (3) Integration of structural and functional analyses: combination of ultrastructural (TEM) and immunofluorescence data with CBF measurements provides convincing correlation between structural defects and impaired ciliary function.

      (4) Mutual dependency model: reciprocal destabilization of CCDC108, MYCBPAP, and CFAP70 supports their interdependence in the C2a assembly.

      (5) Expansion of the vertebrate C2a proteome: the identification of ARMC3 and MYCBP as C2a-associated proteins provides a foundation for future mechanistic studies.

    1. Reviewer #3 (Public review):

      Summary:

      This study uses large-scale all-atom molecular dynamics simulations to examine the conformational plasticity of the HIV-1 envelope glycoprotein (Env) in a membrane context, with particular emphasis on how the transmembrane domain (TMD), cytoplasmic tail (CT), protomer cleavage, and membrane environment influence ectodomain orientation and antibody epitope exposure. By comparing Env constructs with and without the CT, explicitly modeling glycosylation, and embedding Env in an asymmetric lipid bilayer, the authors aim to provide an integrated view of how membrane-proximal regions and lipid interactions shape Env antigenicity, including epitopes targeted by MPER-directed antibodies.

      Strengths:

      The authors have made a heroic effort to address the concerns raised in the first two rounds of review, and the revised manuscript is substantively improved. The addition of dynamical cross-correlation maps, expanded citation of prior computational work, clarification of the membrane composition rationale, data deposition to Zenodo, and new contextualization has improved the flow and interpretation of the manuscript throughout. Several scientifically interesting aspects of the work merit highlighting with a brief discussion on how future studies can leverage this data to build upon its impact.

      A key strength of this work remains the scope, scale, and realism of the simulation systems. The authors construct a very large, nearly complete-Env-scale model that includes a glycosylated Env trimer embedded in an asymmetric bilayer, enabling analysis of membrane-protein interactions that are difficult to capture experimentally. The inclusion of specific glycans at reported sites, and the focus on constructs with and without the CT or cleavage, are well motivated by existing biological and structural data.

      The observation that R696 orientation and its interacting partners give rise to asymmetric protomer conformations and distinct TMD tilts is a notable finding. The statement that interactions between R696 and lipid headgroups or CT residues can be strong enough to introduce a kink into the TMD is well-supported by representative snapshots and consistent with prior isolated-TMD simulations. The use of two initialization depths ("high" and "low") to probe R696 leaflet preference is methodologically interesting and the authors' interpretation - that there is a slight bias toward cytoplasmic leaflet interactions, but that these contacts could be highly dynamic over the course of viral entry - is appropriately cautious. It would be valuable to explicitly frame this as a hypothesis with testable predictions that future experimental or enhanced-sampling work could address. Similarly, the equilibration-driven kinking of the TMD core, consistent with prior isolated-TMD studies, represents a useful validation that extends those earlier observations to the intact trimeric context.

      The simulations reveal substantial tilting motions of the ectodomain relative to the membrane, with angles spanning roughly 0-30° (and up to ~40° in some analyses), while the ectodomain itself remains relatively rigid. This framing, that much of Env's conformational variability arises from rigid-body tilting rather than large internal rearrangements, is an important conceptual contribution. The authors also provide interesting observations regarding asymmetric bilayer deformations, including localized thinning and altered lipid headgroup interactions near the TMD and CT, which suggest a reciprocal coupling between Env and the surrounding membrane.

      The analysis of antibody-relevant epitopes across the prefusion state, including the V1/V2 and V3 loops, the CD4 binding site, and the MPER, is another strength. The study makes effective use of existing experimental knowledge in this context, for example by focusing on specific glycans known to occlude antibody binding, to motivate and interpret the simulations.

      Finally, the revised text provides clear context that situates the study's findings and discrepancies within the broader literature, strengthening the manuscript's clarity and interpretability.

      Future work in the field:

      As the authors appropriately acknowledge in the text, these microsecond simulations capture only the closed ground state, with limited sampling owing to their already computationally intensive nature. Their simulation setup provides interesting foundational knowledge of this state and a framework for these additional important questions.

      Additionally, the authors appropriately acknowledge that CT-TMD and CT-ectodomain correlations are difficult to interpret given limited structural confidence in these regions. Future experimental and computational work in the field can extend and build upon the authors' framework, particularly as the authors have made their trajectories publicly available. Re-analysis of the authors' deposited MD trajectories, such as probing for exposure of cryptic epitopes and potential allosteric coupling, could serve as a valuable extension of this work, particularly as advancements in computational analysis have reached an inflection point.

      Comments on revised version.

      Bravo! The improved clarity was a delight to read and will increase the impact this study has on the field.

    1. Reviewer #1 (Public review):

      In this work, Gaurav et al. present an extensive study of phase-separated condensates formed by the foci-forming region (FFR) of the MUT-16 protein. The authors first report in vitro experiments showing that these condensates exhibit upper critical solution temperature (UCST) behavior. They then provide a detailed analysis based on atomistic simulations of MUT-16 FFR condensates, identifying key interactions responsible for LLPS, including salt bridges, cation-π interactions, and the role of Na⁺ ions.

      Overall, the manuscript is well written. However, there are several concerns that should be addressed.

      Major Concerns:

      (1) I have several questions regarding the system preparation that require clarification. The authors state that "65 copies of the coarse-grained MUT-16 FFR were embedded in a slab-shaped simulation," but it is not clear how this initial configuration was generated. Were the molecules randomly distributed in the simulation box, or were they initially arranged in a preformed condensate? Alternatively, were they randomly inserted and allowed to self-assemble into a condensate during NpT simulations?

      In Figure 1, the atomistic snapshot appears to show a well-defined condensate at the center of the simulation box. It would be important to clarify how this configuration was obtained: Was it generated from coarse-grained simulations starting from random initial conditions? Or was a preassembled condensate used as input?

      Related to this, how do the authors ensure that the simulations are equilibrated? While 20 μs appears to be a reasonably long simulation time for coarse-grained simulations, it would be useful to demonstrate equilibration explicitly. For example, the authors could plot the center-of-mass positions (in the long axis of the simulation box) of individual proteins over time to show that all molecules reach a steady state and remain within the condensate without systematic drift.
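
      The drift check suggested above could be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes that the centre-of-mass coordinate of each chain along the long box axis has already been extracted and unwrapped across periodic boundaries, and the function names, the dictionary layout, and the tolerance value are all hypothetical.

```python
from typing import Dict, List, Sequence


def linear_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope of values against times."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need two or more matching time/value samples")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0.0:
        raise ValueError("time points must not all be identical")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return cov / var_t


def net_drift(times: Sequence[float], values: Sequence[float]) -> float:
    """Fitted change in position over the whole time window."""
    return linear_slope(times, values) * (times[-1] - times[0])


def drifting_chains(
    times: Sequence[float],
    com_by_chain: Dict[str, Sequence[float]],
    tolerance_nm: float,
) -> List[str]:
    """Names of chains whose fitted net drift exceeds the tolerance.

    com_by_chain maps a (hypothetical) chain label to its unwrapped
    centre-of-mass coordinate along the long axis, one value per frame.
    """
    return [
        name
        for name, com in com_by_chain.items()
        if abs(net_drift(times, com)) > tolerance_nm
    ]
```

      A chain that remains inside the condensate should show a fitted net drift close to zero over the analysed window; chains flagged by `drifting_chains` would merit a closer look at their individual traces. The tolerance is a judgment call that depends on the slab thickness.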

      (2) The authors experimentally observe UCST behavior for these condensates. Do the coarse-grained or atomistic simulations reproduce this behavior?

      While atomistic simulations may be too computationally demanding to systematically explore temperature dependence, coarse-grained simulations could be used to test whether condensates are stable at lower temperatures and dissolve at higher temperatures. Such an analysis would provide valuable support for the experimental observations.

      (3) Regarding the analysis of ions, several points could be clarified and extended:

      a) It would be helpful to report the total number of ions and quantify how many are located inside vs. outside the condensate. While qualitative trends can be inferred from density profiles, quantitative analysis would strengthen the conclusions.

      b) It would also be interesting to analyze the number of contact ion pairs (e.g., Na⁺-Cl⁻ pairs), as described in J. Chem. Phys. 156, 044505 (2022). It is known that some ion models tend to overestimate ion pairing and underestimate solubility (e.g., J. Chem. Phys. 153, 010903 (2020)).

      c) In this context, the use of scaled-charge models has been shown to improve the description of ionic solutions and biomolecular systems (e.g., J. Phys. Chem. Lett. 2019, 10, 23, 7531-7536). I would suggest that, at least for one trajectory, the authors perform a test simulation using scaled charges (e.g., scaling by ~0.8) to evaluate whether ion distributions and protein-ion interactions are significantly affected.

      d) Finally, while the selected water model is known to be accurate, it would be useful to assess its performance for concentrated salt solutions. For example, the authors could estimate the density of a 6 m salt solution and compare it with experimental data or validated models (e.g., J. Chem. Phys. 151, 134504 (2019)). This would help clarify to what extent the conclusions depend on the chosen force field.
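
      The quantification requested in point (a) could be sketched as below. This is an illustrative assumption rather than a prescribed protocol: it presumes that the ion coordinates along the long axis are available per frame and that the condensate boundaries have been chosen beforehand (for example, from the protein density profile). The function names and the boundary values in the test data are hypothetical.

```python
from typing import Sequence, Tuple


def partition_ions(
    z_positions_nm: Sequence[float], slab_min_nm: float, slab_max_nm: float
) -> Tuple[int, int]:
    """Count ions inside and outside the slab region in a single frame."""
    inside = sum(1 for z in z_positions_nm if slab_min_nm <= z <= slab_max_nm)
    return inside, len(z_positions_nm) - inside


def mean_partition(
    frames: Sequence[Sequence[float]], slab_min_nm: float, slab_max_nm: float
) -> Tuple[float, float]:
    """Average the inside/outside ion counts over all frames."""
    if not frames:
        raise ValueError("need at least one frame")
    counts = [partition_ions(f, slab_min_nm, slab_max_nm) for f in frames]
    mean_inside = sum(c[0] for c in counts) / len(counts)
    mean_outside = sum(c[1] for c in counts) / len(counts)
    return mean_inside, mean_outside
```

      Reporting these averages separately for Na⁺ and Cl⁻, together with the total ion numbers, would put the trends visible in the density profiles on a quantitative footing.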

      Minor Concerns

      (1) In the Introduction, it would be helpful to elaborate further on the possible driving forces of LLPS in this region. Are there prior hypotheses or evidence pointing to specific interactions (e.g., cation-π, π-π, electrostatic interactions)? While this work addresses these questions, a brief discussion of previous experimental or theoretical insights would provide useful context.

      (2) On page 18, the authors state: "MUT-16 FFR satisfies the length (172 residues), aromatic content (20.35%), and Arg enrichment (85.71%) criteria. Its charge content (10.47%) and charge balance (38.89% positive charge fraction) are slightly below the nominal thresholds." It would be very helpful to include a schematic representation of the protein sequence highlighting these features (aromatic residues, charge distribution, etc.) in the corresponding figure, to provide a more intuitive understanding.

      (3) A question regarding ion hydration: What is the coordination environment of the ions that bridge proteins? Are they still hydrated by water molecules, or does the reduced water content inside the condensate significantly affect their solvation? Typically, Na⁺ and Cl⁻ ions have coordination numbers around 5-6 in aqueous solution. Do protein interactions and reduced solvent conditions within the condensate alter this coordination? A brief analysis or discussion would be valuable.

    2. Reviewer #2 (Public review):

      Summary:

      Gaurav et al. investigate residue-level interactions within the MUT-16 FFR condensate using all-atom molecular dynamics simulations. The authors first argue, based on sequence analysis, that MUT-16 FFR is more representative than the widely studied FUS LCD. They then characterize the UCST phase behavior of MUT-16 FFR experimentally, followed by a detailed analysis of residue-level contact frequencies and lifetimes. In addition, the manuscript examines ion-residue interactions and water-mediated interactions. Overall, this work provides a comprehensive view of the dynamic interactions within the MUT-16 FFR condensate.

      Strengths:

      Large-scale all-atom molecular dynamics simulations have been performed to investigate dynamical interactions within condensates. The analysis is comprehensive and rigorous, and the claims are strongly justified by the data.

      Weaknesses:

      The large amount of detail in the results section sometimes makes it difficult to identify the central take-home messages. I encourage the authors to more clearly highlight the principal findings and the physical insights that may generalize to other condensate-forming systems. The authors may also consider streamlining parts of the Results section to improve focus and readability.

    3. Reviewer #3 (Public review):

      Summary:

      The authors aim to characterize the molecular interaction network inside phase-separated condensates formed by the MUT-16 foci-forming region (FFR), using atomistic simulations combined with residue-resolved analyses of contact frequencies, contact lifetimes, specific non-covalent interactions, ions, and water.

      Strengths:

      The work addresses an interesting and biologically relevant system, and the combination of large-scale atomistic simulations with an extensive contact analysis has clear potential value for the broader condensate field.

      Weaknesses:

      In its current form, several technical issues need to be addressed before the main conclusions can be considered robust. Most importantly, the simulated sequence is 172 residues long, while the atomistic slab has box dimensions of only 12 nm in two directions. This length scale is comparable to the expected end-to-end distances of a disordered 172-residue chain. It is therefore not clear whether individual protein chains interact with their own periodic images, which could substantially affect overall chain dynamics and subsequently bias contact lifetimes, residue-residue interaction statistics, and the inferred condensate dynamics. The authors should check, for each chain, histograms of end-to-end distances. For chains for which more than ~2-3% of the end-to-end distances exceed ~11 nm, the authors should explicitly check for self-image interactions (for example, using "gmx mindist -pi") and report whether such interactions occur and for what fraction of the trajectory. Without this control, at least in the Supporting Information, I do not think the simulation-derived contact dynamics are sufficiently trustworthy.
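
      The screening step described above could be sketched as follows, as a way of deciding which chains warrant the explicit self-image check. This is a minimal sketch under stated assumptions: the per-chain end-to-end distances are taken as already computed from the trajectory, and the function names, the dictionary layout, and the default values (11 nm, 2%) are hypothetical choices based on the approximate figures quoted in the text.

```python
from typing import Dict, List, Sequence


def fraction_exceeding(distances_nm: Sequence[float], threshold_nm: float) -> float:
    """Fraction of frames whose end-to-end distance exceeds the threshold."""
    if not distances_nm:
        raise ValueError("need at least one distance")
    return sum(1 for d in distances_nm if d > threshold_nm) / len(distances_nm)


def chains_needing_image_check(
    distances_by_chain: Dict[str, Sequence[float]],
    threshold_nm: float = 11.0,
    max_fraction: float = 0.02,
) -> List[str]:
    """Chains whose extended conformations are frequent enough to warrant
    an explicit periodic self-image check."""
    return [
        name
        for name, distances in distances_by_chain.items()
        if fraction_exceeding(distances, threshold_nm) > max_fraction
    ]
```

      Chains returned by `chains_needing_image_check` are the ones for which the minimum distance to their own periodic images would then be examined explicitly.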

      A second major concern is the treatment of ions. The manuscript makes important conclusions about Na⁺ association and Na⁺-mediated bridging, but the atomistic ion model is not explicitly stated. This is a reproducibility problem and also affects interpretation - for example, standard Amber ions are known to bind too strongly to the oppositely charged residues. In their results, one acidic residue appears to interact on average with roughly two Na⁺ ions, which is not obviously expected from charge balance alone. The authors should state the exact Na⁺/Cl⁻ parameters used, justify their compatibility with TIP4P-D and the protein force field, and explicitly interpret why such a strong Na⁺ association with acidic residues is observed.

      More generally, because the manuscript is centered on contact lifetimes, the choice of the atomistic force field needs stronger justification. Salt bridges, cation-pi contacts, pi-pi stacking, ion coordination, and water-mediated interactions are all force-field-sensitive. Since there is no direct experimental observable used here to validate the simulations, the authors should discuss the expected limitations of the chosen force field (while I do acknowledge that testing different force fields would be computationally too demanding).

      I also find the sequence-comparison section somewhat confusing. The authors compare one specific IDR, MUT-16 FFR, with the average properties of human IDRs and then frame it as more representative than FUS LCD. It is not clear how informative this is because IDR behavior depends strongly on sequence-specific patterning, molecular connectivity, and the particular interaction network of each protein. Averages over human IDRs may provide a broad context, but they do not necessarily define what is physically or biologically representative for phase separation. In addition, FUS LCD is not intended to be a representative human IDR; it is an unusually low-complexity, phase-separating domain. Therefore, the "more representative than FUS" framing should be toned down. At most, this analysis shows that MUT-16 FFR is compositionally less extreme than FUS LCD.

      The ion- and water-bridging analyses are also potentially overinterpreted. A distance-based simultaneous contact with two residues does not by itself establish functional mediation or regulation of condensate dynamics. The authors should either add appropriate controls, such as local-density-normalized baselines or randomized-contact expectations, or soften the language to describe these as geometrically defined co-contact events rather than mechanistic bridging interactions.

      Finally, the independence of the atomistic replicas is unclear. The manuscript should state whether all ten all-atom simulations were initiated from the same coarse-grained condensate configuration or from distinct CG frames. If the starting structures came from one CG trajectory, the authors should report how far apart those frames were in simulation time and provide evidence that the initial atomistic configurations are structurally independent. If only velocities differ, the simulations should not be described as fully independent structural replicas.

    1. Reviewer #1 (Public review):

      Summary:

      The authors investigated the relationship between physical activity (PA) and both structural (MRI) and cognitive brain health in the LIFE-Adult Study, with a total baseline recruitment of 2576. Hippocampal volume, an MRI-derived BrainAGE marker, and scores from the Trail Making Test were used as outcomes, with the majority of participants measured at baseline and subsets also measured in a follow-up session. The key findings were a lack of direct association between PA and outcomes, but longitudinal evidence for a higher BrainAGE at baseline leading to lower physical capacity at follow-up. This supports a reverse-causation hypothesis in contrast to the prevailing understanding of the positive effects of physical activity on brain health.

      Strengths:

      The LIFE-Adult Study is a rich and carefully acquired dataset, with multiple follow-up time points. The statistical analyses were conducted carefully with appropriate control for confounds and multiple testing. The study design enables the important assessment of reverse causality. The authors are scrupulous in their consideration of a number of factors that could potentially bias their results, performing an age-stratified analysis, and emphasising discrepancies in PA measurements (specifically an age-reporting bias) across the dataset and other limitations.

      Weaknesses:

      This is an observational study with inconsistent measures of physical activity. Previous studies have used physical activity interventions, and might be weighted more strongly when considering evidence for these effects (specific confounders involved in interventions notwithstanding).

      The model identifying potential reverse causality is relatively limited - it seems possible/likely that BrainAGE could reflect more general health status, which would expand the potential range of factors underlying this observation. The authors comment on these possibilities.

      The important quantitative actigraphy subset is small (n=227) as are the longitudinal subsets. Along with the discrepancy of physical activity/capacity at baseline and follow-up, and other complexities of the dataset, it is difficult to make firm conclusions. The authors point out that the actigraphy subset was quite inactive, and discuss this as a limitation.

    2. Reviewer #2 (Public review):

      Summary:

      This population-based cohort study found no evidence that physical activity, whether self-reported or objectively measured, positively influenced brain structure (hippocampal volume or BrainAGE) or cognitive function (Trail Making Test scores). Notably, longitudinal analyses suggested the opposite temporal relationship: a higher BrainAGE at baseline predicted lower physical capacity at follow-up, more in line with reverse causation than with a neuroprotective effect of physical activity.

      Strengths:

      The study's statistical approach is thorough and well documented, and the inclusion of two measurements of physical activity (self-report questionnaire and objective accelerometer data) is a strength. The longitudinal aspect also represents a strength.

      Weaknesses:

      Several aspects of the measurement timing warrant consideration. Physical activity was assessed over 7-day periods, creating a potential mismatch with (commonly less dynamic) brain outcomes examined (hippocampal volume, BrainAGE), which may reflect cumulative exposures over longer timescales. Additionally, the asynchronous measurement protocol (cognitive testing preceding accelerometry, and the MRI occurring weeks after baseline visits) may introduce time lags that attenuate associations. The observed null associations may be influenced by timing misalignment rather than reflecting the absence of consistent effects of physical activity on brain health and cognition.

      Other measurement characteristics also warrant consideration when interpreting the null findings. Physical activity was assessed using short-form self-report questionnaires and averaged accelerometer MET/day values, both of which have limited reliability. Additionally, the modest accelerometer subsample size and low/insufficient variation in activity levels observed in this cohort increase the likelihood of missing effects. These factors collectively raise the possibility that true physical activity-brain health associations may have been obscured.

      The study's conclusions regarding brain health, structure, and cognitive functioning are broad despite the narrow selection of outcomes examined. The analyses focus on hippocampal volume, BrainAGE (a global aging metric), and Trail Making Test performance (processing speed and executive function), while omitting other important neuroimaging markers such as cortical thickness, functional connectivity, or white matter microstructure. The null findings presented here cannot exclude positive effects of physical activity on broader constructs of brain health or cognitive functioning.

      While the authors appropriately note the use of different physical activity instruments across time points (IPAQ at baseline, VSAQ at follow-up) in the limitations section, the discussion should more explicitly address the interpretive challenges this creates. The observed association between higher baseline brain age gap and lower follow-up physical activity may reflect: (1) a true temporal relationship, (2) an artifact of switching from behavior-focused (IPAQ) to capacity-focused (VSAQ) measurement, or (3) some combination of both. This ambiguity substantially limits causal inference.

      Comments on the revised version:

      I have briefly reviewed the responses to the reviewer comments, as well as the tracked changes in the expanded limitations section of the revised manuscript, and these adequately address my previous concerns.

    1. Joint Public Review:

      Summary:

      Lengyel et al. present a normative model of single-neuron activity in area MT, which is known for its role in processing visual motion. The authors focus on responses to a center and a surround that move at different velocities. Both the center and surround are rigid: picture a set of dots all moving at the same velocity. The center dots are arranged in a disc; the surround dots in an annulus, and in both cases, the velocity of each is time-varying.

      The core proposal is that the brain does not process motion in a fixed coordinate system, but instead infers a latent reference frame, and that MT neurons encode motion either in retinal coordinates or relative to this inferred reference frame. The model is meant to overcome a challenge in the existing literature on area MT: on the one hand, experimental findings are heterogeneous, including both surround suppression and surround facilitation of neural responses; on the other, existing models are either designed ad hoc to capture specific phenomena or they are somewhat general (e.g., divisive normalization), but in either case they can't explain the full range of responses. This manuscript proposes that the full range of responses in MT is explained as Bayesian inference over the reference frame in which center motion speed and direction should be estimated. The model extends one introduced in a previous publication from the same lab (Shivkumar et al. 2025). That publication focused on human perception of motion; this one makes predictions about MT mean responses and across-trial variability.

      Strengths:

      Processing visual motion is important for normal visual function, including for the integration and segmentation of visual objects. This manuscript presents a normative theory, supported by recent human perceptual data, and extends it to make predictions about neural firing rate and variability in area MT. The theory is well motivated and supported by the simulation analysis and comparison to data. It provides new insight into how causal inference of relative motion reference frames can modulate neural activity in MT. The richness of the theory's predictions can guide future experiments. In particular, the theory explains both center-surround suppression and facilitation, unifying disparate empirical observations in MT for which no unified explanation had been proposed. The manuscript also demonstrates a new method to map ideal observer predictions (posterior distributions over speed and direction, which are dependent on the posterior inference over reference frames) onto predicted neural activity for center-surround stimuli, by only considering basic tuning curves measured in the center-alone condition. This is a useful methodological contribution. The manuscript offers a thorough review of CS modulation studies in MT.

      Weaknesses:

      We found this paper difficult to read for two reasons. First, math is generally explained in words. This made it extremely difficult (impossible for some reviewers) to understand the details of the model, which are important. We're not against words, but it's critical that they be accompanied by equations.

      Second, the manuscript is not self-contained in the sense that many of the motivations, assumptions, and limitations of the approach are only evident if one carefully reads the group's prior work, Shivkumar et al. (2025). Following up on previous work isn't necessarily a flaw, but the introduction of the paper is written from a very broad perspective that does not effectively summarize the prior work and lay out the specific questions that motivate the current study. For example, it is not clear from the introduction whether the authors believe this framework can explain all sorts of center-surround interactions (including in non-motion stimuli and in other areas like the retina), or if the focus is only on area MT.

      Finally, the connection to neural data is confusing and mostly qualitative. The authors create a library of "hypothetical but plausible tuning curves" and show that their modeling framework is flexible enough to capture a variety of center-surround interactions. Although they do state that their model can't explain all possible tuning curves, it's still hard to tell whether they have particularly strong evidence for the Bayesian causal inference hypothesis.

      We also have several technical, but potentially important, comments.

      Line 427: 'Our framework not only reinterprets past findings but also generates new, testable predictions. The model makes directly testable predictions for surround modulation. Facilitation, for instance, is predicted for neurons encoding retinal-centric motion (v_center) under high sensory uncertainty. In contrast, suppression is the hallmark of neurons encoding relative motion (v^relative_center) with respect to a surround-influenced reference frame.' It seems that to test the predictions of the model, one would need to first determine if a neuron encodes retinal or relative motion, without relying on the patterns predicted by this model, and then test if the two types of neurons behave as predicted. It is unclear how one can obtain this labeling of neurons independently of the model predictions.

      Line 492: 'This offers a principled account of how the same population of neurons can support both perceptual states (integration and segmentation)'. However, because the theory assumes each neuron encodes either center velocity or center velocity relative to a moving reference frame, but not both, it does not explain how the same neuron could shift from suppression to facilitation. It may be worth considering another possibility, using V1 surround modulation as an analogy. Different neuron types are required to implement the surround computation: in mouse V1, SST interneurons are surround-facilitated, and they are necessary to implement surround suppression of pyramidal neurons (https://pmc.ncbi.nlm.nih.gov/articles/PMC3621107), but SST outputs are not communicated to downstream targets. In that view, facilitation is not a signature of some neurons encoding a type of latent variable; it is only there as an intermediate step in the computation of the other latents (those that require suppression).

      Misspecification of either the prior or likelihood can be a problem for Bayesian inference. Discussion of this point -- and in particular evidence (say from analysis of natural scene statistics in the case of the prior) that both are well-specified -- would strengthen the manuscript.

    1. Reviewer #1 (Public review):

      In this work, Jiqi Shao and colleagues evaluate microbial iron competition and siderophore-mediated interactions by combining (a) a dynamic modeling framework based on the consumer-resource model, including multiple siderophore and siderophore-receptor types, and (b) a graph-theory framework based on directed graphs to quantify the ecological dependencies of the community (referred to as the Benefit Transfer Graph). Through a plethora of simulation experiments that vary the number of species in the community, the ratio of pure-cheaters, and the number of foreign siderophores a partial-producer can utilize (referred to in this study as 'Cheating Breadth'), the authors found:

      (1) Using simulations of small communities of 5 or fewer members, they observe that closed benefit-transfer loops (commensalism/mutualism loops) serve as the structural scaffold for diversity, with coexistence, dominance, or dynamic fluctuations arising as a function of the fraction of receptors in species and the number of community members.

      (2) Using simulations of large communities of 50 members, they observed a paradox in the capacity of partial-producers to utilize different foreign siderophores (referred to in this study as 'The Paradox of Cheating'). A broad 'Cheating Breadth' among partial-producer members increases the probability of community-wide extinction and can act as a destabilizing force. At the same time, however, the 'Cheating Breadth' of partial-producer members promotes species richness and community biodiversity.

      (3) Applying the graph-theory framework helps to unveil the ecological complexities of small and large microbial communities, explaining the aforementioned Paradox of Cheating.

      As major strengths of this work, the authors present a novel modeling framework considering the ecological complexity of siderophore-mediated interactions by differentiating types of community members (pure-producers, partial-producers, and pure-cheaters), siderophore/receptor pairs, and exploring a wide range of situations (such as the number of community members, the ratio of pure-cheaters, or the siderophore breadth of partial-producers). Moreover, the discussion and conclusions of this study are mechanistically well-founded with a graph-theory framework (Benefit Transfer Graph). All computer code and scripts to replicate the simulations, analysis, and figure generation are public in the Zenodo repository.

      However, this study still has some work to do before it meets the expected standards, and it presents some weaknesses to be addressed. Please regard the following paragraphs as constructive feedback aimed at improving your work. The main weaknesses of the current version are the Abstract, the missing Methods section, the structure of the Results section, the presentation of the results (i.e., Figures), and the treatment of partial-producers as cheaters (including referring to the capacity of partial-producers to use different siderophores as 'Cheating Breadth'). The Abstract could be significantly improved with a better introduction of the system (cooperators and cheaters, and the concept of the 'Tragedy of the Commons'), a better description of the modeling framework, and other details included in 'Recommendations for the authors'. The current version of the manuscript lacks a proper 'Methods' section.

      Moreover, the authors could include (1) a section with the simulated systems and parameter choices of simulation experiments, (2) the key model assumptions, and (3) a separate (and more detailed) section explaining the graph-theory framework applied in this study (Benefit Transfer Graph). Most of this information is included in Supporting Information, but including it in the main text will facilitate the comprehension of the work. The structure of the results displayed (i.e., Figures) is quite confusing, especially in the section 'Closed Benefit Loops Drive Transitions from Exclusion to Coexistence and Chaos'. Moreover, important results are included in Supporting Information when they should be in the main text. Also, the lack of a proper Methods section makes it harder to follow the Results sections. I have included some recommendations/suggestions to improve the Results structure.

      This study reveals an interesting ecological dynamic in siderophore-mediated interactions. The authors suggest the existence (and further explanation) of the 'Paradox of Cheating'. However, this paradox (and their discussion) may come from a misunderstanding of the concepts and/or terminology applied here (and maybe widely applied in cooperator-cheater systems). The authors refer to the capacity of 'partial-producers' to utilize foreign siderophores (i.e., siderophores of other species) as cheating. Also, they refer to the number of foreign siderophores that a 'partial-producer' can utilize as 'Cheating Breadth'. A microbial cheater is one that has receptors for siderophore uptake but does not pay the cost of producing siderophores itself. Because 'partial-producers' generate at least one type of siderophore, they are not technically cheaters (although they may act as 'pure-cheaters' by changing their gene expression so that they do not synthesize any siderophore for the community). All this may lead to a misleading interpretation of the results and a potentially overstated title and conclusions.

      The community members 'pure-producers', 'partial-producers', and 'pure-cheaters' could instead be called, e.g., 'single-receptor producers', 'multiple-receptor producers', and 'nonproducers', respectively [Gu et al. (2025), doi: 10.1126/sciadv.adq5038]. A better term for 'the number of foreign siderophores that a partial-producer can utilize' could be 'Siderophore Breadth', and instead of a 'Paradox of Cheating', it could be a 'Paradox of Multiple-receptor Producers'. The authors' discussion aligns better with the presented results if the proposed terms 'single-receptor producer/multiple-receptor producer and cheater' are used, considering multiple-receptor producers as cooperative members rather than as engaging in 'moderate cheating'. On the other hand, the Paradox of Multiple-receptor Producers (or Paradox of Cheating, in the authors' terms) could be a modeling artifact. Although some species possess multiple siderophore receptors in their genome (some studies suggest that the genomes of Pseudomonas species and other environmental strains can have up to 20-30 siderophore receptors), that does not mean that they are all expressed simultaneously.

      Regardless of the weaknesses and the major points to be improved, the findings presented in this work substantially advance our understanding of complex ecological interactions between cooperators and cheaters mediated by siderophore and siderophore-receptor syntheses, especially when multiple-receptor producers are present. Moreover, the modeling and graph-theory frameworks presented by the authors can be applied in other microbial systems, such as collaboration/competition/cheating for substrates or nutrients. Fundamental modeling exercises are indispensable to unveil ground ecological rules of complex microbial communities, accelerating the advances in ecology by developing theory-based hypotheses for future experimental and environmental studies.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates how cheating affects microbial diversity, using a chemostat model of a microbial community in which species compete for a shared iron pool through siderophore-mediated uptake. After analyzing minimal communities, the study simulates large randomly generated communities in which species either produce no siderophore or produce a single siderophore type. Producers can differ in siderophore type and production level, while all species can differ in the siderophore-specific receptor types they express. Siderophore production trades off with resource allocation to growth. Total receptor expression is normalized, so increasing expression of one receptor type reduces expression of other receptor types. A key parameter in these simulations is the average number of "cheating receptor types," i.e., receptor types that allow a species to use siderophores it does not produce itself. The authors use this parameter as one axis for characterizing cheating behavior and term it "cheating breadth." The results reveal a statistical pattern the authors report as a "paradox": increasing cheating breadth increases the frequency of whole-community extinction, but also increases the mean number of surviving species per non-extinct community. To explain this pattern, the study reduces a community's producer-receiver network into components by retaining only the link from each producer to its maximal beneficiary, i.e., the species receiving the largest growth benefit from that producer. The study finds that the core topology of such a component predicts the community's ecological fate, namely, extinction, single-species survival, or multi-species coexistence, when biomass is concentrated in that component. 
The study argues that increasing cheating breadth reduces the probability that a community contains components predicting single-species survival, while increasing the probabilities that it contains components predicting extinction or multi-species coexistence. This argument is used to explain why greater cheating breadth increases both community extinction risk and diversity. Based on these results, the study concludes that microbial diversity not only tolerates but requires moderate cheating.
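The reduction step described in this summary (keep only the link from each producer to its maximal beneficiary, then inspect the resulting components) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the species labels, and the benefit values are hypothetical, and ties are broken by dictionary order, which is an assumption.

```python
from typing import Dict, List


def maximal_beneficiary_edges(benefit: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """For each producer, keep only the link to the species that gains most.

    benefit[p][r] is a hypothetical growth benefit that species r receives
    from the siderophore made by producer p (illustrative values only).
    """
    edges = {}
    for producer, receivers in benefit.items():
        positive = {r: b for r, b in receivers.items() if b > 0}
        if positive:
            # max() returns the first maximal key on ties (an assumption).
            edges[producer] = max(positive, key=positive.get)
    return edges


def find_cycles(edges: Dict[str, str]) -> List[List[str]]:
    """Return the closed loops in a graph with at most one out-edge per node."""
    cycles = []
    visited = set()
    for start in edges:
        path, position = [], {}
        node = start
        # Follow the single outgoing link until we hit a sink, an already
        # explored node, or a node on the current path (a closed loop).
        while node in edges and node not in visited and node not in position:
            position[node] = len(path)
            path.append(node)
            node = edges[node]
        if node in position:
            cycles.append(path[position[node]:])
        visited.update(path)
    return cycles


# Hypothetical four-species community.
benefit = {
    "A": {"A": 0.2, "B": 0.9},
    "B": {"C": 0.5, "A": 0.1},
    "C": {"A": 0.7},
    "D": {"A": 0.3, "C": 0.1},
}
edges = maximal_beneficiary_edges(benefit)
cycles = find_cycles(edges)
print(edges)
print(cycles)
```

In this toy example, species A, B, and C form a closed loop, while D only feeds into it. Whether such a loop predicts coexistence or extinction depends on the model dynamics, which this sketch does not simulate.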

      Strengths:

      The major strengths of this study are that it presents an interesting mathematical model of microbial interactions mediated by diverse siderophores and that it reduces simulation results to simple predictive patterns by focusing on one primary beneficiary per producer, as summarized above.

      Weaknesses:

      The study also has two major weaknesses. First, the observed diversity is not shown to be evolutionarily stable, which limits the biological relevance of the findings. The cycle structure that supports this diversity may be vulnerable to invasion by mutants that disrupt this structure and can thereby drive many species, or even the whole community, extinct. This concern is suggested by previous studies on the hypercycle, which is analogous to the cycle structure found in this study (Eigen and Schuster, The Hypercycle, Springer-Verlag, pages 32-57, 1979 https://doi.org/10.1007/978-3-642-67247-7). For example, a community with a cyclic network may be invaded by mutants that increase growth allocation at the cost of siderophore production (Maynard Smith, Nature 280:445-446, 1979 https://doi.org/10.1038/280445a0). It may also be destabilized by mutants that increase the expression of the "self-receptor," the receptor for the siderophore they produce themselves. Another possibility is a "short-circuit mutant" that expresses receptors in a way that bypasses intermediate species in a cycle (Bresch et al., Journal of Theoretical Biology 85:399-405, 1980 https://doi.org/10.1016/0022-5193(80)90314-8). Cyclic networks may remain evolutionarily unstable even when spatial self-organization is considered (Hogeweg and Takeuchi, Origins of Life and Evolution of the Biosphere 33:375-403, 2003 https://doi.org/10.1023/A:1025754907141). Without demonstrating robustness to these plausible evolutionary hazards, the study's coexistence results may have limited biological relevance.

      The second weakness is that the study treats cheating breadth as if it were a pure measure of increased cheating, framing the observed pattern as a paradox that increasing cheating breadth increases diversity within surviving communities while also increasing community extinction risk. However, increasing cheating breadth decreases the mean expression level of all expressed receptors, a confounding effect that arises from the normalization of total receptor expression. Consequently, increasing cheating breadth also reduces the mean benefit a producer gains from its own siderophore production. In other words, increasing cheating breadth spreads each producer's dependence across diverse siderophores at the cost of a reduced return on the self-produced siderophore. Once these coupled effects are recognized, the reported pattern is less paradoxical: increasing cheating breadth might be expected to increase diversity within surviving communities by distributing dependence, while also increasing extinction risk by reducing self-reliance. Therefore, the apparent paradox may arise from the way cheating behavior is parameterized rather than from a direct effect of increased cheating alone.
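The confound described above can be made concrete with a small sketch. It assumes, purely for illustration, that total receptor expression is fixed at 1 and split equally between the self-receptor and each cheating receptor; the equal split and the function name are assumptions, not the model's actual parameterization.

```python
from typing import List


def receptor_shares(cheating_breadth: int, total: float = 1.0) -> List[float]:
    """Split a fixed expression budget equally over all expressed receptors.

    Index 0 is the self-receptor; the rest are cheating receptors.
    The equal split is an illustrative assumption.
    """
    n_receptors = cheating_breadth + 1
    return [total / n_receptors] * n_receptors


for breadth in range(5):
    shares = receptor_shares(breadth)
    # The self-receptor share shrinks as breadth grows, even though the
    # total budget stays constant.
    print(breadth, round(shares[0], 3), round(sum(shares), 3))
```

Under this assumption, a higher cheating breadth necessarily lowers the expression of the self-receptor, so the two effects cannot be varied independently.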

      Additional context:

      The present study can be considered alongside previous studies proposing that cheating can, in some contexts, promote microbial diversity by generating ecological dependencies. The Black Queen hypothesis proposes that such dependencies can be created by adaptive gene loss and reliance on functions performed by other community members (Morris et al., mBio 3:e00036-12, 2012, https://doi.org/10.1128/mbio.00036-12). A related study by Fullmer et al. discusses how mutual cheating can contribute to microbial diversity (Frontiers in Microbiology 6:728, 2015, https://doi.org/10.3389/fmicb.2015.00728).

    1. Reviewer #1 (Public review):

      Summary:

      One of the most important fundamental questions in base excision repair (BER) is how chromatin structure affects the action of specific components of the BER pathway. Previous work from this and other groups has begun to address this question. In this report, the authors study the activity of Pol beta on a gapped or nicked DNA substrate 23 bases from the entry/exit site of a 603 nucleosome core particle in the presence and absence of PARP1, PARP2, HPF1, or FEN1. They show that H1 and PARP block Pol beta incorporation, which is relieved by NAD+.

      Strengths:

      They show, not unexpectedly, that HPF1 and PARP activity help to displace H1, allowing Pol beta incorporation. PARP1 and PARP2 suppress Pol beta activity, which is mitigated by auto-PARylation. PARP2 has a strong impact on strand displacement synthesis. This is an important contribution to the field.

      Weaknesses:

      The present work builds incrementally on the authors' previous work and on what was already known about the activity of PARP1/2, HPF1, and the modification of histones.

    2. Reviewer #2 (Public review):

      Summary:

      The authors have shown some interesting data on DNA repair synthesis by PolB, acting on a BER substrate in the presence of a core nucleosome, and the effects of some accessory chromatin proteins. FEN1 and PARP proteins were also assessed for their effects on repair synthesis by PolB. However, the story for the PARP proteins seems a bit underdeveloped, or perhaps it just needs additional clarity in the writing. The concept that strand displacement synthesis by PolB in linker DNA and into the NCP is limited by these interactions is useful, although we need to bear in mind that the study does not address the role of the final repair enzyme, DNA ligase, which might itself limit the products. Likewise, the possible effects of competing DNA polymerases remain unexplored, notably the replication enzymes delta and epsilon. There are circumstances where these appear to be the main DNA repair polymerases for BER substrates. Addressing these and other issues, as listed below, would greatly improve a paper that is already fairly strong.

      Specific Points:

      (1) Substrates:

      The gap substrate was prepared by treating a U-containing substrate with UDG + APE1. Consequently, it is not exactly a gap, but a repair intermediate with a 5'-abasic site on one side of the break. It should be described more clearly in the text.

      The nicked substrate was prepared by incubating the "gap" substrate with PolB and dTTP, the nucleotide to replace the excised U. It is expected that this substrate has the 5'-abasic site removed by the PolB lyase, and only one dTMP residue inserted. Has either of these expectations been verified? For example, PolB can insert more than one nucleotide in a prolonged incubation, and the enzyme has no intrinsic 3'-exonuclease to trim the extension.

      Finally, it appears that these procedures were performed with the NCP already in place; therefore, the presence of the nucleosome is expected to influence the processing done to prepare the gap and nick substrates. What do we know about that?

      (2) Figure 1c:

      The rate difference for gap vs. NCP is modest, perhaps 2-fold in the data shown. Some statistical analysis is needed to solidify this observation.

      (3) As noted on page 4, the histone tails might be important for some of the observed effects. While individual histones had no effect, the critical test would be in the context of the NCP. There are many modified or mutant histones now available that would enable this. While such experiments would be more for future work, the possibility should be mentioned in this paper.

      (4) What are the molar ratios of the various enzymes to the substrates? Can we say whether that reflects the levels that might be found in vivo? For the in vitro studies, the stoichiometry would also influence competing binding reactions. Indeed, Figure 2 indicates that the NCP substrate has multiple, competing binding sites for PolB. Why are the multiple NCP-PolB species not better resolved in EMSA (Supplementary Figure 2a)? Perhaps the higher-order ones are more unstable in the gel? That would be consistent with Table 1.

      (5) Wouldn't the incremental 3-nucleotide steps seen with PolB + FEN1 be a relatively inefficient process? Of course, one expects that the presence of a DNA ligase would effectively limit this process to just one synthesis/excision cycle. Hasn't that been tested with these substrates?

      (6) In many of the gel images, it can be hard to tell S from the +1 products, especially further from the side of the gel. Is there an independent way to verify that just a single nucleotide was replaced?

    3. Reviewer #3 (Public review):

      This manuscript by Shtanov et al. attempts to define how DNA Polymerase β performs gap-filling DNA synthesis and strand displacement synthesis in linker DNA adjacent to a nucleosome. The authors show that DNA Polymerase β strand displacement synthesis activity is stimulated in linker DNA when the 1-nt gap is positioned 23 bp away from a nucleosome core particle. The authors further show that histone H1, known to bind linker DNA, disrupts the ability of DNA Polymerase β to perform strand displacement synthesis within this context. They then provide some evidence that PARP1 and PARP2 regulate DNA Polymerase β strand displacement synthesis in linker DNA adjacent to a nucleosome, possibly pointing to a role for PARP1 and PARP2 in base excision repair sub-pathway choice. While this study has some intriguing observations, these observations are severely underdeveloped, and many of the stated conclusions are inadequately justified by the experimental data.

      Strengths:

      (1) The authors have identified that DNA Polymerase β strand displacement synthesis is stimulated in linker DNA by the presence of an adjacent nucleosome, though the generalizability of this finding is unclear (see weaknesses).

      (2) The authors convincingly show that the presence of histone H1 negatively regulates DNA Polymerase β strand displacement synthesis in linker DNA adjacent to a nucleosome core particle.

      Weaknesses:

      (1) Throughout the manuscript, the authors perform a variety of enzyme kinetic assays to show that DNA Polymerase β strand displacement synthesis is stimulated in linker DNA by the presence of an adjacent nucleosome, and that other chromatin factors (PARP1, PARP2, and histone H1) regulate strand displacement synthesis. The enzyme kinetic experiments presented have several issues that severely impact their interpretability. This includes the lack of proper substrate controls, a general lack of quantification and statistical analysis, the use of varied enzyme kinetics regimes that impede comparison between experiments, and a general lack of clarity regarding experimental replication/reproducibility.

      (2) The general context where an adjacent nucleosome core particle would stimulate DNA Polymerase β strand displacement synthesis is severely underdeveloped, which limits the generalizability of these findings. It's unclear if this stimulation is dependent on the linker DNA length, the distance of the 1-nt gap from the nucleosome core particle, or the directionality of strand displacement synthesis (towards vs away from the nucleosome core particle). Given the data presented, it's possible that stimulation of DNA Polymerase β strand displacement synthesis by an adjacent nucleosome is a phenomenon that is unique to a 1-nt gap precisely 23 nts away from the nucleosome core particle.

      (3) The conclusion that the N-terminal histone tails do not stimulate DNA Polymerase β strand displacement synthesis comes from an experiment where Gap-DNA227 was incubated with free core histones, and a reduction in strand displacement synthesis was observed. As designed, this experiment is simply unable to prove that the N-terminal tails do not stimulate DNA Polymerase β strand displacement synthesis.

      (4) The observation of apparent cooperativity in DNA Polymerase β binding to Gap-NCP227 from the mass photometry data is intriguing. However, the relationship between this observation and the stimulation of DNA Polymerase β strand displacement synthesis in linker DNA adjacent to a nucleosome core particle is unclear.

      (5) The general claims regarding differential specificity of PARP1 and PARP2 for nicks and gaps in linker DNA adjacent to the nucleosome come from experiments lacking a proper control using an undamaged linker-nucleosome substrate. This is particularly problematic as PARP1 and PARP2 are known to engage the terminal ends of DNA as they partially mimic DNA double-strand breaks.

      (6) While the authors clearly show that PARP1 and PARP2 regulate DNA Polymerase β strand displacement synthesis in linker DNA, the interpretation that this is through direct competition for 1-nt gap binding cannot be proven from the experiments presented.

      (7) The claim that the presence of histone H1 changes the yield and length of PARylated core histones is overstated. The quantification would suggest a subtle difference (particularly for PARP1), but the lack of statistical analysis related to the experiments makes interpretation challenging.

    1. Reviewer #1 (Public review):

      This work addresses a question of practical importance that had never been systematically analysed in the cryo-ET field: when collecting tilt-series data, what is the optimal angular step size between successive tilt images? Due to the upper limit in electron exposure (100 - 150 e⁻/Ų), this question is important, since finer angular sampling improves attainable reconstruction resolution (Crowther criterion) but reduces the signal-to-noise ratio of each individual image, potentially compromising both image quality and the ability to computationally align successive frames. To address this, the authors designed a thorough benchmarking study comparing five tilt increments (1°, 2°, 3°, 5°, and 10°) while keeping the total dose and tilt range constant. They evaluated the consequences at every stage of the cryo-ET workflow - from raw image quality and tilt-series alignment, through template matching for ribosome detection, to high-resolution subtomogram averaging - with the goal of providing the community with an evidence-based recommendation for data acquisition.
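The trade-off described above can be sketched numerically. The example below assumes a total dose of 120 e⁻/Ų (a value picked from within the 100 - 150 e⁻/Ų range quoted above), a ±60° tilt range, equal dose per tilt image, and a 30 nm object diameter for the Crowther estimate d ≈ D·Δθ (Δθ in radians). All of these values and the function name are illustrative assumptions, not figures taken from the manuscript.

```python
import math
from typing import Tuple


def tilt_scheme(
    increment_deg: float,
    total_dose: float = 120.0,    # e-/A^2, assumed for illustration
    tilt_range_deg: float = 60.0, # +/- range, assumed symmetric
    diameter_nm: float = 30.0,    # assumed object diameter
) -> Tuple[int, float, float]:
    """Return (number of tilts, dose per tilt, Crowther resolution in nm)."""
    # Tilts from -range to +range inclusive, in steps of increment_deg.
    n_tilts = int(round(2 * tilt_range_deg / increment_deg)) + 1
    # Equal split of the fixed dose budget across tilt images (assumption).
    dose_per_tilt = total_dose / n_tilts
    # Crowther criterion: attainable resolution d ~ D * delta_theta.
    crowther_nm = diameter_nm * math.radians(increment_deg)
    return n_tilts, dose_per_tilt, crowther_nm


for increment in (1, 2, 3, 5, 10):
    n, dose, d = tilt_scheme(increment)
    print(f"{increment:>2} deg: {n:>3} tilts, {dose:5.2f} e-/A^2 per tilt, "
          f"Crowther limit ~{d:4.2f} nm")
```

Finer increments improve the Crowther estimate but leave less dose per image, which is the tension the benchmark is designed to resolve empirically. The Crowther figure is a geometric bound for a single tomogram and does not describe the resolution reachable by subtomogram averaging.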

      The manuscript is well written, and the experimental design is carefully thought out. The work provides valuable practical insights into cryo-ET data acquisition by demonstrating that balancing two competing demands - sufficient dose per individual tilt image and fine angular sampling - is essential to achieve high-quality tomographic reconstructions. The identification of a practical optimum at 3° tilt increment is the key contribution of the work. It will be interesting to see in the future whether this optimum shifts for smaller molecular targets, and how emerging tilt interpolation strategies such as cryoTIGER may interact with the choice of experimental angular increment.

      The conclusions of this paper are mostly well supported by data, but some aspects of data analysis need to be clarified and/or extended, including:

      (1) Line 109: The authors state that the tilt range was kept at ±60° relative to the lamella plane. Assuming a typical lamella pre-tilt of ~10°, the absolute stage tilt would approach its mechanical limit. Two clarifications would be appreciated: (a) What was the average pre-tilt across all lamellae? (b) How many dark tilt images, if any, were excluded during tomogram reconstruction?

      (2) Line 148: "When analysing tomographic volumes, we found that tomograms from data with a smaller increment displayed higher SNR values (see Fig. 2B)." It would be helpful to specify which comparisons are statistically meaningful (e.g. Mann-Whitney U test?). While the difference between 1° and 2° appears pronounced, the differences between 2°, 3°, and 5° seem minimal. From my point of view, reporting the mean SNR values +/- standard deviations for each condition would already indicate some significance. Furthermore, since SNR is expected to depend on lamella thickness, it should be clarified whether the average lamella thickness is comparable across the five datasets.

      (3) Line 167: "Indeed, the variation in maximum resolution correlates with lamella thickness across all datasets (see Fig. 2F)." The reported R² values of 0.30 (1°), 0.38 (2°), 0.66 (3°), 0.61 (5°), and 0.60 (10°) reveal a notably weak linear relationship for the finer tilt increments. It is also difficult to assess whether the lamella thickness distributions are comparable across conditions from the current figures - visually, the 1° dataset appears to be based on thinner lamellae, while the 10° dataset appears to include thicker samples. A histogram of lamella thickness distributions for each condition, provided as supplementary material, would greatly aid interpretation. Given this thickness dependency, reporting the mean +/- standard deviation of lamella thickness per condition would be highly appreciated.

      (4) Figure 4: It should be specified which tomogram subsets were used for the Rosenthal-Henderson analysis, whether lamella thickness was taken into account in the subset selection, and whether ribosomes too close to the lamella edges were excluded. Finally, linear fits should be displayed across the full x-axis range for all tilt increments to facilitate direct visual comparison.

      (5) General: Were ribosomes located at the lamella edges excluded from the analysis? As demonstrated in the authors' own prior work (Tuijtel et al., Science Advances, 2024), Ga-FIB milling induces structural damage at the lamella surfaces. To exclude the influence on the STA results, particles near the lamella edges should be removed prior to analysis, and the criteria for this exclusion should be stated explicitly.

      The aim of the authors was to provide the cryo-ET community with an evidence-based recommendation for the choice of tilt increment, and they largely succeeded in this goal. The identification of 3° as a practical optimum - balancing sufficient dose per tilt image for effective per-particle refinement with fine enough angular sampling for accurate tilt-series alignment - is well supported by the data and consistent across the multiple quality metrics employed. The conclusion that coarser increments (5° and 10°) compromise tomogram quality, template matching accuracy, and STA resolution is robust and clearly demonstrated. However, the conclusion rests entirely on a single biological system using ribosomes as the sole molecular target, which are exceptionally favourable due to their abundance, size, and electron contrast. Whether the identified optimum holds for smaller, lower-abundance, or lower-contrast targets remains an open question.

      In future, it would be particularly interesting to test whether emerging tilt interpolation strategies, such as cryoTIGER, can effectively compensate for coarser experimental angular sampling in post-processing. If so, the optimal experimental increment may shift, and the interaction between these two approaches represents a promising direction for future work. More broadly, as cryo-ET datasets grow larger and public repositories expand, the practical tradeoffs between acquisition time, data storage, and structural quality identified here will become increasingly relevant to the field.

    2. Reviewer #2 (Public review):

      The determination of macromolecular structures directly within their native cellular environment is becoming increasingly routine, making standardized data collection strategies essential. In this manuscript, Tuijtel et al. provide a timely and valuable contribution by benchmarking key acquisition parameters and establishing practical guidelines for in situ cryo-electron tomography (cryo-ET). Critically, the authors present a systematic framework for optimizing data collection to achieve the highest attainable resolution.

      Using Dictyostelium cells as a model system, the authors generate multiple datasets at a constant total dose while varying the tilt increment. They demonstrate that tilt-series acquired with finer increments (1-3 degrees) yield superior alignment accuracy and improved template-matching performance, resulting in higher-quality reconstructions than those collected with coarser increments (5 degrees or above). Furthermore, the authors show that for subtomogram averaging, a 3-degree tilt increment outperforms all other conditions tested, particularly after per-particle refinement as implemented in M.

      Overall, the manuscript is clearly written, and the conclusions are well supported by the data presented. I have no major concerns. There are some minor points that the authors should address, including:

      (1) The phrase "electron optical density distribution" (line 31, Introduction) should be revised to "electrostatic potential" or "Coulomb potential distribution," which more accurately reflects what is measured in cryo-EM/ET.

      (2) The authors state that the maximum tolerable electron dose is approximately 100-150 e⁻/Ų (line 34, Introduction). This is an oversimplification, as bacterial specimens, for example, have been shown to tolerate doses of 200 e⁻/Ų or higher (see Briegel et al., PNAS, 2009; https://www.pnas.org/doi/10.1073/pnas.0905181106#T1). The statement should be revised to reflect this variability.

      (3) Lines 56-57: The authors do not cite their own prior work benchmarking tilt-series acquisition strategies on in vitro samples. This earlier study provides important context and should be referenced and briefly discussed.

    1. Reviewer #1 (Public review):

      This manuscript investigates the conformational flexibility and membrane-interaction behavior of the N-terminal segment of the VP4 protein from non-enveloped viruses, such as Coxsackievirus B3, with particular emphasis on the role of myristoylation, an essential process implicated in viral entry and transmission. The authors employ a multiscale simulation framework, combining all-atom (AA) and coarse-grained (CG) molecular dynamics simulations, to characterize the behavior of VP4 peptides in both bulk aqueous and membrane environments.

      AA simulations suggest that the VP4 N-terminus remains predominantly disordered in bulk water, whereas CG simulations highlight the importance of conformational flexibility during interactions with a POPC membrane. The CG approach is further used to demonstrate an enhanced aggregation tendency of myristoylated VP4 monomers compared to non-myristoylated forms and to estimate the free-energy barriers associated with VP4 translocation across the membrane in monomeric and aggregated states. The study proposes a connection between VP4 aggregation, membrane remodeling, and peptide insertion into the membrane. Finally, well-tempered metadynamics simulations are used to explore changes in VP4 helicity during pore formation.

      Overall, the study addresses an important problem and applies appropriate computational approaches. However, several aspects of the methodology, interpretation of results, and consistency with existing literature require clarification before the conclusions can be fully supported. The authors should revise the manuscript with due attention to the comments below.

      (1) Disordered State of VP4 in Bulk Water

      Figures 1(f-g, i-j) indicate that both myristoylated and non-myristoylated VP4 peptides adopt largely disordered conformations in bulk water. This finding appears to contradict prior experimental and computational reports discussed in the Introduction, which suggest partial or transient helicity in this region. A more detailed explanation is required to reconcile these differences with the existing literature. Additionally, since α-RMSD (aRMSD) is a direct and quantitative measure of helicity, the authors may consider reporting helical content explicitly using this metric to strengthen the analysis.

      (2) Lack of Backmapped Atomistic Data for Membrane-Bound States

      Figure 2 presents membrane-bound conformations of VP4 obtained from CG simulations. While this provides useful qualitative insight, the absence of backmapped all-atom representations limits the ability to extract detailed information regarding residue-level interactions, peptide conformations, and specific binding modes at the membrane interface. Inclusion and analysis of backmapped atomistic data would significantly strengthen the mechanistic interpretation of VP4-membrane interactions.

      (3) VP4 Binding to Membrane

      Figure 2(H): The key takeaway from the exercise using different rigidities for the peptide was that different sections of the peptide, particularly the N-terminus, have reduced membrane contacts. However, the contribution from each membrane component is not very apparent because of the stacked transparent plots. Re-plotting with bars placed side by side, or with a line representation, would make this clearer.

      (4) Aggregation Stability in Bulk Versus Membrane Environments

      The manuscript states that the aggregation rate and stability of VP4 20-mers in bulk water are weaker than in the presence of a membrane, as shown in Figure S5. However, no clear or significant reduction in aggregation stability is apparent from the figure as currently presented. The authors should clarify which quantitative metrics support this claim and, if necessary, provide additional analysis to substantiate the reported difference.

      (5) Decoding the Role of MYR on the VP4 n-mer Aggregation

      The authors have suggested that the MYR tail plays a key role in the recruitment of VP4 peptides into the aggregate. This is based solely on visual evidence from the simulation. This can be tested directly by using a combination of MYR and non-MYR VP4 molecules, with MYR VP4 acting as membrane anchors. The change in aggregation rate or the number of clusters will give a more complete picture of this phenomenon. In the case of 20 non-MYR VP4 peptides, the aggregate forms within 2 µs, which is comparable to the complete aggregation in the case of MYR-VP4 6-mer. This further brings into question whether the faster aggregation for MYR cases is due to the proximity to the membrane or due to the lipid recruitment aspect of the MYR group.

      (6) Interpretation of Umbrella Sampling Results and Membrane Remodeling

      Figure 4 reports CG umbrella sampling results indicating a reduced translocation free-energy barrier for VP4 in aggregated (condensate) form, which is linked to membrane curvature and remodeling. Additional methodological details are required to support this interpretation:

      (a) What is the nature of the membrane used in the umbrella sampling simulations? Specifically, was the membrane initially flat or curved, and was the same membrane (with identical curvature and properties) used for the single, 6-mer, and 20-mer cases? Differences in membrane geometry would directly influence the translocation free-energy profiles.

      (b) Additional details regarding the peptide models used in umbrella sampling simulations should be provided, including peptide length, aggregation state definition, restraints applied (if any), and reference configurations, to improve clarity.

      (7) VP4 n-mer Condensate Dynamics

      The authors have performed an autocorrelation analysis of Rg of VP4 in the 6 and 20-mer condensates and found that the decay is slower in the 6-mer. This suggests a higher degree of rearrangement within the VP4 20-mer. This could be due to a faster relaxation time upon formation for the 6-mer compared to the 20-mer owing to its smaller size. It would be informative to look at whether these differences still hold when the 20-mer simulations are extended beyond 10 µs.
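
      One way to check whether the difference in decay persists along the trajectory is to compute the Rg autocorrelation on consecutive blocks. The sketch below is illustrative only: the function names and the 1/e threshold are assumptions, not the authors' analysis, and the input is assumed to be a non-constant, evenly sampled Rg time series.

```python
import numpy as np

def autocorrelation(series):
    """Normalized autocorrelation C(lag) of a 1D time series, with C(0) = 1."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = x.size
    # Full linear correlation; keep the non-negative lags only.
    acf = np.correlate(x, x, mode="full")[n - 1:]
    # Divide by the number of overlapping samples at each lag.
    acf = acf / np.arange(n, 0, -1)
    return acf / acf[0]

def decay_lag(acf, threshold=np.exp(-1)):
    """First lag at which the autocorrelation drops below the threshold, or None."""
    below = np.nonzero(acf < threshold)[0]
    return int(below[0]) if below.size else None

def block_decay_lags(series, n_blocks):
    """Decay lag computed separately on consecutive blocks of the trajectory."""
    blocks = np.array_split(np.asarray(series, dtype=float), n_blocks)
    return [decay_lag(autocorrelation(b)) for b in blocks]
```

      If the decay lags for the 20-mer keep drifting from block to block while those for the 6-mer are stable, the reported difference may reflect incomplete relaxation rather than a genuine difference in internal dynamics.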

      (8) Comparison Between Metadynamics and Backmapped Membrane-Bound Structures

      Figure 5 presents Well-Tempered Metadynamics results for VP4 in a membrane environment. To strengthen the conclusions regarding peptide binding and conformational behavior, it would be valuable to directly compare the peptide conformations and interaction characteristics observed in the Metadynamics simulations with those obtained from the backmapped structures corresponding to Figure 2.

      (9) Interpretation of the Z-Coordinate in Free-Energy Profiles

      Figure 5(a) shows the free-energy landscape of the VP4 peptide as a function of reaction coordinates. However, the corresponding Z-position of the peptide relative to the membrane is not clearly defined. The authors should clarify whether the reported Z-values correspond to peptide conformations at the membrane surface, within the hydrophobic core, or fully translocated across the membrane, as this is essential for proper interpretation of the free-energy minima.

      (10) Helicity in Bulk Water from Metadynamics Simulations

      Figure 5(b) shows a free-energy minimum at relatively high helicity (~0.6) even at a peptide-membrane distance of approximately 3.6 nm, which appears to correspond to a bulk-water-like environment. This observation contradicts the predominantly disordered peptide behavior reported in bulk water simulations (Figure 1). The authors should provide a mechanistic explanation for this inconsistency between the bulk AA simulations and the Metadynamics results.

      (11) Folding and Insertion Free Energy of VP4

      The free energy calculation for folding of VP4 using metadynamics in the POPC membrane and the 2D free energy calculated using umbrella sampling do not show the same picture. In the first case, deeper insertion into the membrane promotes higher helicity, which is not seen in the 2D free energy landscape. Assuming the same scale for the free energy in the two plots (the scale is not stated for the metadynamics simulations), we see a massive preference for a helicity fraction of >0.6. This preference is absent in both the aqueous and the membrane-embedded environments of the 2D free energy landscape. It would also be useful to mark the plane of the phosphate groups to demarcate the hydrophilic and hydrophobic sections of the membrane.

      Final Recommendation

      The manuscript presents interesting and potentially impactful findings on the conformational dynamics and membrane interactions of VP4. However, substantial clarification and additional analysis addressing the points above are required to ensure consistency, rigor, and alignment with existing literature. I recommend major revisions.

    2. Reviewer #2 (Public review):

      Summary:

      The authors Huang et al. studied how a small disordered VP4 protein present in the viral capsid of naked viruses, such as Coxsackievirus B3, enables the transfer of the viral genome into the host cell by breaching the host cell membrane. The authors show that post-translational myristoylation of VP4 plays a critical role in this process. Using computer simulations of VP4 and its interactions with the membrane, the authors show that myristoylated VP4 anchors to the membrane faster, aggregates faster to form dense phases via LLPS, and remodels the membrane, thereby lowering the energy barrier for the protein to insert into the membrane. The authors further showed, through simulations, that the myristoylated VP4 forms helices within the membrane with higher stability, which then form structured pores, disrupting the membrane and enabling the transfer of the viral genome into the host cell.

      Strengths:

      The strength of the manuscript is that different sets of unbiased and enhanced-sampling simulations using all-atom and coarse-grained models of the protein and membrane are performed to bridge multiple time and length scales involved in the transfer of the viral genome into the host cell. There is experimental support for most of the conclusions arrived at from the simulations.

      Weaknesses:

      The drawback is that experimental evidence was lacking to support the pore-formation proposal from the simulations.

    1. Reviewer #1 (Public review):

      Summary:

      The authors utilize genetic code expansion to tag TDP-43 and G3BP1, compare this protein tagging system (Anap) with antibodies, and evaluate protein trafficking and stress granule formation in response to sodium arsenite treatment. They find similar staining to antibodies in HeLa cells, mouse embryonic stem cells, and primary mouse cortical neurons. By incorporating the intrinsically fluorescent noncanonical amino acid Anap at carefully selected sites, the authors enable live-cell and neuronal visualization of protein localization, stress-induced redistribution, and dynamic behavior without the structural and functional compromises often associated with large fluorescent protein tags. The work provides a technical framework that will be useful for live imaging of tagged proteins.

      Strengths:

      A key strength is the demonstration of the specificity of the Anap fluorescence signal through appropriate controls and the agreement between Anap labeling and antibody-based detection across multiple cell types, including primary neurons. The ability to visualize stress-induced redistribution of both G3BP1 and TDP-43 in living cells highlights the practical value of this approach.

      The functional validation of TDP-43-Anap is compelling. The rescue of both cell viability and RNA splicing defects in TDP-43 knockout models provides evidence that Anap incorporation preserves core protein functions. This is important, as functional disruption is a central concern for any alternative tagging strategy applied to aggregation-prone or RNA-binding proteins.

      Weaknesses:

      While some inherent limitations of genetic code expansion remain (e.g., variable amber suppression efficiency and the inability to directly assess endogenous protein behavior), these are acknowledged and discussed appropriately. Importantly, these limitations do not undermine the central contributions of the study.

    2. Reviewer #2 (Public review):

      In this manuscript, Chen and colleagues describe a novel means of labeling two RNA binding proteins, G3BP1 and TDP-43, using genetic code expansion. Overexpressed constructs that incorporate the intrinsically-fluorescent non-canonical amino acid Anap redistribute to cytoplasmic granules upon application of external stressors such as sodium arsenite. Similar labeling and redistribution of overexpressed G3BP1 and TDP-43 was observed in cultures of mouse primary neurons.

      Genetic code expansion and non-canonical amino acid labeling have many advantages over traditional fusion proteins for tracking protein redistribution in living cells. The authors show that they are able to label exogenous G3BP1 and TDP-43 with the non-canonical amino acid Anap, and follow labeled proteins in living cells with and without stress.

      I suspect that this method could be incredibly valuable to many investigators studying the dynamics and interactions of proteins that are difficult to label or detect by conventional methods.

      Comment on revised version:

      The revised manuscript is significantly improved, with added controls and experiments to confirm expression and Anap labeling of G3BP1 and TDP-43.

    1. Reviewer #1 (Public review):

      Summary:

      This study by Damphousse, Calvin, and Redish investigates how the hippocampus represents competing future outcomes during approach-avoidance conflict. Using an ethologically relevant robotic predator foraging paradigm, the authors aimed to dissociate hippocampal activity associated with reactive defensive responses (escape) from that linked to anticipatory withdrawal decisions. The central finding is that dorsal hippocampal representations differentiate these two modes of defensive behavior within a single naturalistic assay. Specifically, the authors show that attack-triggered retreats and mid-track aborts differ in movement dynamics and hippocampal spatial decoding despite sharing a common behavioral endpoint, that hippocampal representations during pauses predict subsequent behavioral outcomes, and that these representational biases emerge before overt behavioral divergence. The main importance of the study lies in moving beyond viewing the hippocampus as merely encoding spatial location or threat salience, instead suggesting that hippocampal ensemble activity dynamically tracks and differentially weights threat-related, reward-related, and safety-oriented future states to bias behavior before overt action occurs.

      Strengths:

      The study has several notable strengths. First, the behavioral decomposition into retreats, mid-track aborts, and mid-track continues is rigorous and provides a highly interpretable analytical framework. Second, replication across two independent cohorts - despite differences in arena configuration, robot design, and extinction procedures - meaningfully strengthens confidence in the robustness of the findings. Third, the unified reanalysis pipeline across cohorts reflects strong analytical discipline, and the Bayesian decoding framework is well-suited to addressing the central representational questions. Fourth, the ethological relevance of the robotic predator paradigm is a major advantage, allowing the authors to examine a richer repertoire of defensive and decision-related behaviors than is possible in conventional fear-conditioning assays. Overall, the experiments are well designed, the data are clearly presented, and the findings make a valuable contribution to understanding how the hippocampus supports decision-making under threat.

      Weaknesses:

      The study is technically strong, but a few modest revisions would further enhance it.

      (1) First, the abstract mentions extinction and reinstatement effects, but neural analyses focus primarily on the attack phase. It would be helpful to clarify or adjust the abstract accordingly.

      (2) Second, some interpretive language ("guide," "bias") leans toward causal phrasing. Given the correlational data, using "predict" or "correlate with" would be more precise.

      (3) Third, given the relationship between running speed and hippocampal theta, considering speed-related contributions to decoding differences would be useful.

      (4) Fourth, reporting turnaround positions for mid-track abort and continue trials (Figure 7) would provide helpful context.

      (5) Fifth, a figure comparing stimulated vs. non-stimulated sessions in cohort 2 would support the claim that closed-loop stimulation had no measurable effect.

      (6) Finally, reporting effect sizes for key decoding comparisons would add clarity.
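
      Points (3) and (6) could be addressed with something as simple as the following. This is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are assumed to be per-event decoded probabilities (for example, of the attack zone) and per-event running speeds, and a linear speed dependence is assumed purely for illustration.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = a.size, b.size
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def residualize(values, covariate):
    """Remove the linear dependence of values on a covariate such as running speed."""
    values = np.asarray(values, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    slope, intercept = np.polyfit(covariate, values, 1)
    return values - (slope * covariate + intercept)
```

      Fitting `residualize` on the pooled events from both conditions and then computing `cohens_d` on the residuals of each condition would indicate how much of the decoding difference survives after accounting for speed.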

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript extends previous work from Calvin et al. and examines hippocampal representations during approach-avoidance conflict in a robotic predator foraging task. The paradigm itself is very interesting and addresses an important but relatively understudied question in the navigation and foraging literature: how the brain balances risk versus reward during goal-directed behavior. While hippocampal representations of positively valenced goals and future intentions have been extensively studied, much less is known about how these representations evolve during risk-reward tradeoffs involving threat.

      The authors use a relatively simple and interpretable decoding approach together with thoughtful behavioral comparisons to ask whether future behavioral outcomes can be read out from hippocampal activity before behavior diverges. The most compelling comparison is between mid-track aborts (MTAs) and mid-track continues (MTCs), where the animals initially exhibit very similar pause behavior but ultimately either abort or continue the trajectory. The authors show that decoded location during these pauses differs prior to the overt manifestation of the behavioral decision, suggesting that hippocampal representations may reflect evolving internal evaluation processes during approach-avoidance conflict.

      Strengths:

      A major strength of the work is the behavioral paradigm itself. This type of risk-reward conflict task is relatively uncommon in the hippocampal navigation literature and provides a rich framework for examining defensive decision-making during naturalistic foraging behavior.

      The decoding analyses are also relatively simple and easy to interpret. Rather than relying on highly complex modeling approaches, the authors use straightforward comparisons of decoded spatial representations across behavioral conditions, making the results accessible and conceptually clear.

      Another strength is the use of behavioral controls to isolate comparisons between related behaviors. In particular, the comparison between MTAs and MTCs is compelling because the animals exhibit similar pause states before the behavioral outcomes diverge. This provides a useful framework for asking whether hippocampal activity reflects future behavioral outcome before the decision is overtly expressed.

      Overall, the study asks an interesting question using a novel paradigm and provides evidence that hippocampal representations during approach-avoidance conflict may reflect future behavioral trajectory.

      Weaknesses:

      The main weakness is that many of the reported effects are relatively subtle and are not sufficiently controlled for differences in speed, trajectory structure, and other behavioral variables across conditions. While the subtraction plots (green versus purple decoding differences) appear visually striking, the actual effect sizes are fairly small, making it difficult to assess how robust or behaviorally meaningful these differences are.

      Relatedly, many of the most interesting questions in this task concern how behavior unfolds dynamically within a trial, yet much of the analysis averages across events and trajectories. As a result, potentially important aspects of the behavior may be obscured.

      In particular, the manuscript would benefit from richer characterization of the animals' actual movement trajectories and spatial strategies. Because the analyses rely heavily on linearized position, it is difficult to determine whether animals behave differently in two-dimensional space across conditions. For example, during continued approaches, do animals preferentially hug the wall opposite the robot? Do different behavioral conditions show distinct lateral occupancy or trajectory structure? These types of analyses would make the behavioral interpretation substantially more compelling.

      More generally, while the results are suggestive and interesting, the relatively small decoding differences and substantial behavioral confounds make it difficult to conclude that the observed effects reflect distinct internal evaluative or threat-related states.

    3. Reviewer #3 (Public review):

      Summary:

      The study reanalyzes data from a previously published cohort together with an additional cohort to investigate hippocampal activity during approach-avoidance conflict. Unlike many prior studies that isolate reward- or threat-based learning, this task requires animals to evaluate reward and threat concurrently. The central finding is that hippocampal representations differ between hesitant behaviors that lead to approach versus avoidance outcomes, with representations of the attack zone more likely during pauses preceding abort decisions. This is an important extension of prior work on hippocampal activity and deliberation, suggesting that the hippocampal content may help shape the eventual outcome.

      Strengths:

      All behavioral findings are replicated independently across cohorts, making the behavioral results highly convincing. The design is robust, and the task is especially valuable for studying approach-avoidance conflict. The behavioral paradigm is complex and rare, and neuronal recordings in such a paradigm are of great value.

      The major strength of the study is the comparison of neural activity during hesitant behavior leading to different outcomes, namely, pauses followed by the animal aborting the approach (mid-track aborts), and pauses followed by the animal committing to the approach (mid-track continues). Hippocampal activity differed between the two pauses: the attack zone was more likely to be represented during mid-track aborts. The same effect was observed on the journey before the pause: even before the animal hesitates, hippocampal activity before a pause that led to a mid-track abort was more likely to represent the attack zone than hippocampal activity before pauses that led to continued approach. This analysis suggests that hippocampal content before and during deliberative behavior is predictive of the animal's decision.

      Weaknesses:

      The interpretation of the retreat-related decoding results is less clear. The study compares two sets of retreating behavior: on the one hand, retreat after being attacked, and on the other hand, retreat after hesitation in the absence of an attack (a mid-track abort). Hippocampal activity represents the attack zone more after the animal is attacked. However, these two retreating behaviors originate from different spatial locations: retreats always start past the "attack threshold", while mid-track aborts always start before this threshold. Given that hippocampal decoding is strongly location-dependent, this difference in position makes the neural decoding results difficult to interpret. The increased representation may be due to differences in physical location, rather than the distinct processing of immediate threat and an anticipatory return state.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript by Sustar et al. takes a methodical approach to document the types of glutamate receptor subunits that reside in Drosophila muscles, examining developmental stages spanning from larvae to adults. Prior work thoroughly documented the subunits operating in Drosophila larval body wall muscles. Most subsequent research focused on the glutamate receptor heterotetramers found in the body wall, composed of GluRIIA/C/D/E or GluRIIB/C/D/E subunits, along with auxiliary subunits like isoforms of Neto.

      For the current work, the authors report that the larval muscle glutamate receptor composition is not universal for all Drosophila muscles. They examine the following muscle systems: larval body wall, adult abdomen, adult leg coxa, and adult indirect flight. They also briefly examine adult muscle structures associated with the proboscis, neck, and haltere. The authors find that the receptor subunits in the adult abdomen (mostly) match those in the larval body wall. This makes sense given that the adult abdominal muscles are derived from the larval body wall. Yet not much else matches the larval body wall. For example, all (or most) of the GluRII-type subunits are missing from the adult indirect flight muscles. Leg muscles have GluRII-type subunits, but they do not have all of them expressed prominently, and they are missing GluRIIB. Additionally, leg muscles express a glutamate-gated chloride channel, which could be a source of inhibitory glutamatergic transmission. Interestingly, when it comes to non-abdominal adult muscles, one general theme seems to be an active promoter (GAL4 driver) for the kainate-type glutamate receptor called Clumsy. The authors propose that Clumsy could be key to understanding how functional GluR complexes are assembled in adult insects.

      Strengths:

      (1) Documenting the types of glutamate receptors that operate in diverse insect muscle systems is important because it uncovers fundamental information.

      (2) Much of the prior research focus has been on how the body wall muscle tetramers assemble and operate. It is a strength to demonstrate the other receptor solutions used by adult NMJs.

      (3) The work uses GAL4 drivers and immunohistochemistry (when possible) in combination to draw conclusions.

      (4) The muscle anatomical analyses are of high quality. This allows the research group to reach refined conclusions.

      (5) The confocal-level images of synaptic active zones and their apposed glutamate receptor clusters are of high quality.

      Weaknesses:

      (1) There is a strawman argument that is used repeatedly to highlight the significance of the work. The argument implies that the field broadly assumes (or "tacitly" assumes) that the larval body wall glutamate receptor composition extrapolates to all muscles of the fly, including the adult. This reviewer cannot find evidence that this assumption or argument has been explicitly promulgated by others. More likely, others have not examined these muscles directly, and thus, they have not speculated one way or the other.

      (2) Related - to the extent that there has been any tacit assumption about GluRIIC/D/E-anchored receptors being ubiquitous among adult muscles, tacit doubt was raised by Rivlin et al., 2004 (cited by the authors but not as a source of doubt) and by RNAseq datasets like FlyAtlas from 2022 (replicated in Figures S11 and S12). To be clear, the current analysis is better than a bulk transcript analysis from adult tissues. But rather than "overturning" a field or being paradigm-shifting, the current data seem confirmatory of FlyAtlas - and confirmatory of Rivlin et al., 2004, which explicitly concluded that larval and adult NMJs were different.

      (3) One can draw expression-level conclusions from these data. But genetic tests (e.g., would clumsy losses of function impair leg muscles?) could help the authors and the field draw stronger conclusions about the roles of some of these glutamate receptor gene products. The current dataset falls short of definitively establishing the function of alternate glutamate receptor modules.

      (4) The confocal synaptic images are of high quality. They are good enough that one could analyze how well Brp directly apposes a specific glutamate receptor subunit for all the associated imaging data underlying Figures S1-S8. No such analysis is done, but understanding which components directly appose the site of release could lead to better conclusions.
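
      An apposition analysis of the kind suggested in point (4) could start from nearest-neighbor distances between puncta centroids. The following is only a sketch: it assumes that Brp and receptor puncta have already been segmented into centroid coordinates in a common physical unit, and both the function names and the distance cutoff are hypothetical.

```python
import numpy as np

def nearest_neighbor_distances(brp_xyz, receptor_xyz):
    """Distance from each Brp punctum to its nearest receptor punctum."""
    brp = np.asarray(brp_xyz, dtype=float)
    rec = np.asarray(receptor_xyz, dtype=float)
    # Pairwise differences with shape (n_brp, n_receptor, n_dims).
    diff = brp[:, None, :] - rec[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    return dist.min(axis=1)

def apposed_fraction(brp_xyz, receptor_xyz, max_distance):
    """Fraction of Brp puncta with a receptor punctum within max_distance."""
    d = nearest_neighbor_distances(brp_xyz, receptor_xyz)
    return float(np.mean(d <= max_distance))
```

      Comparing `apposed_fraction` across subunits and muscle types, with the cutoff stated explicitly, would turn the qualitative impression from the images into a number.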

      Overall Assessment and Discussion:

      The data in this study are of high quality, and the results support the main conclusion: adult muscle glutamate receptor clusters do not recapitulate the "canonical" larval body wall clusters. This is important, and the data stand on their own. That is the most important part. This reviewer does have suggestions on how to put the current work in proper context; the current draft appears to overstate the novelty of the findings. Additionally, some sentences need editing for accuracy. None of those concerns impeach the excellent foundational data.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents a broad survey of glutamate receptor composition at the neuromuscular junction in Drosophila across developmental stages and muscle types. The topic is clearly important, and the central observation, that adult muscles differ substantially from the canonical larval NMJ, is interesting and potentially impactful. The dataset is extensive and will likely be of value to the community. However, in my view, there are significant limitations in how the data are generated and interpreted, which at present reduce the strength of the conclusions.

      Strengths:

      The study addresses a relevant and timely question and provides a large and systematic dataset. The finding that adult muscles diverge from larval NMJ organization is compelling and challenges a widely held assumption in the field. The breadth of approaches, including genetic reporters, immunohistochemistry, endogenous tagging, and transcriptomic data, is, in principle, a strong aspect of the work and allows for a broad overview of receptor expression across tissues and developmental stages. Even in its current form, the manuscript provides useful descriptive information that will be of interest to the community.

      Weaknesses:

      A major concern is the reliance on a heterogeneous combination of detection methods (GAL4 reporters, antibody staining, endogenous tagging, and RNA), which are treated largely as equivalent lines of evidence. These approaches differ substantially in what they measure and in their sensitivity and specificity. While convergence across methods can in principle be convincing, here this convergence is often inferred from the shared absence of signal. This is problematic because all methods used are susceptible to false negatives for different reasons. As a result, the repeated conclusion that specific GluR subunits are "absent" from adult muscles, including those previously considered essential, is not fully justified by the data presented.

      This issue is not only theoretical. The manuscript itself seemingly contains examples where methods disagree, demonstrating that detection is incomplete and method-dependent. These discrepancies could be better integrated into the interpretation. Instead, negative results across methods are often taken as strong evidence for absence, which overstates the certainty of the findings.

      In addition, antibody validation appears to rely largely on prior work in larval tissue. Given the structural and biochemical differences in adult muscles, it is not clear that staining performance is equivalent, particularly in cases where the signal is weak or undetected. This further complicates the interpretation of negative results.

      More generally, the manuscript moves in several places from descriptive observations to functional or mechanistic implications that are not directly supported. The suggestion that adult muscles operate with fundamentally different receptor assemblies is intriguing, but remains speculative without functional validation. At a minimum, the distinction between observation and interpretation should be made more explicit.

      I thus think that the current conclusions need to be more carefully constrained. Ideally, the study would be strengthened by at least one functional experiment, such as electrophysiological recordings from adult NMJs or perturbation of candidate receptors like GluClα or Clumsy. This would help to anchor the expression data in synaptic function.

      In summary, this is an interesting and potentially important study, but the current manuscript somewhat overinterprets heterogeneous and partly indirect evidence. It will already be useful in its present form, but could be more convincing if the authors more rigorously account for methodological limitations and moderate their claims accordingly.

    3. Reviewer #3 (Public review):

      The Sustar et al. manuscript catalogs glutamate receptor composition across distinct Drosophila NMJs: larval and adult abdominal NMJs, as well as NMJs on adult leg and flight muscles. This work is important and probably overdue. The larval NMJ is the exemplar NMJ in this system, and the identity of "essential" and "alternative" subunits at this stage is assumed by many to hold across developmental stages and NMJ types. Here, the authors show that there is surprising diversification among NMJ types and that the notion of essential/alternative subunits only holds true at larval NMJs.

      The study will generate interest in the Clumsy GluR subunit, which has not been well-characterized at all, but is widely expressed at adult NMJs. They also find striking extrasynaptic expression of the glutamate-gated chloride channel GluClα in adult leg and flight muscles, raising questions about its role. The study is interesting, logical, and well-written. The figures are clear, and the discussion was particularly thoughtful. I have a couple of comments that the authors could consider.

      (1) They cite Rivlin et al. (2004) in the Introduction as the sole previous study to investigate the molecular composition of adult NMJs, but do not mention this work again. In the Discussion, it would be helpful to compare/contrast their findings with those of the earlier work.

      (2) Were these analyses done in adults of consistent ages? It seems possible that the GluR subunit composition could be different in very young adults or in aged flies. The age of the animals should be mentioned in the Methods.

      (3) The broad expression of GluCl:V5 in adult leg and flight muscles is surprisingly robust and appears to light up the edges of all muscle fibers. Would the authors comment on the controls that were done to ensure that this staining is real and specific to animals carrying that V5 endogenous tag?

      (4) The snRNAseq data in Figure S12 differ a bit from the IHC/GAL4 data summarized in the table in Figure 2. In particular, the data suggest that Ukar and Grik are widely expressed in adult muscles. Is there a reason not to include an "snRNAseq" column in Figure 2 alongside the data from GAL4 lines and IHC? To my mind, it is about as reliable as GAL4 lines that often capture only a subset of the full expression pattern. In this case, the snRNAseq data suggest that Ukar/Grik are likely at adult flight muscle NMJs, which might be important since that NMJ was negative for everything except Neto-beta by IHC.

    1. Joint Public Review:

      Summary:

      Kalburge et al. investigate a task in which human subjects make a decision based on the accumulation of noisy evidence. Tasks like this have been studied for decades, but always with the same essential ingredient: noisy moment-by-moment evidence has to be integrated internally by the subjects, and so is not observed by the experimenter.

      In this study, the authors depart from this scenario and make the evidence visible. Specifically, subjects see a pigeon moving stochastically on a screen, and they have to determine whether the net motion is to the right or to the left. This provides the experimenter direct access - on a trial-by-trial basis - to the bounds the subjects use to make their decision.

      The authors apply this paradigm across a range of tasks, each one differing in how the signal-to-noise ratio (SNR; defined to be the ratio of the drift rate of the pigeons to the standard deviation of the noise) changes over time and across trials. The tasks range from the standard case of constant SNR to the non-standard case where the SNR changes abruptly in the middle of the task.
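
      To make the SNR definition above concrete, here is a minimal sketch of a single trial, assuming a discrete-time random walk with Gaussian noise and a fixed symmetric bound. The function name `simulate_trial`, its parameters, and the discrete-time formulation are illustrative assumptions, not the authors' task code or analysis pipeline.

```python
import random


def simulate_trial(snr, bound, noise_sd=1.0, max_steps=10_000, rng=None):
    """Simulate one trial of a bounded random walk (hypothetical sketch).

    Each step adds drift plus Gaussian noise, with drift = snr * noise_sd,
    so that snr equals the ratio of drift rate to noise standard deviation.
    Returns (choice, decision_step, path); choice is +1 (right), -1 (left),
    or 0 if neither bound was reached within max_steps.
    """
    rng = rng or random.Random()
    drift = snr * noise_sd
    x = 0.0
    path = [x]
    for step in range(1, max_steps + 1):
        x += drift + rng.gauss(0.0, noise_sd)
        path.append(x)
        # Stop at the first step on which the position reaches either bound.
        if abs(x) >= bound:
            return (1 if x > 0 else -1), step, path
    return 0, max_steps, path
```

      Because the walker's position is visible at every step, the value of `path` at the decision step gives a per-trial reading of the bound, which is the property the paradigm exploits.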

      The authors determined, on a trial-by-trial basis, the bounds used by the subjects. Setting the bounds optimally when the SNR changes over time or across trials is a non-trivial problem; not surprisingly, then, the subjects were suboptimal. However, they weren't very suboptimal; instead, their behavior was "satisficing" (in the words of the authors), meaning their bounds were reasonably close to the optimal ones. Since the loss is relatively flat near the maximum, and finding the optimal bounds is hard, this is a sensible strategy.

      Strengths:

      The main strength of this work is the introduction of a new paradigm that supports a trial-by-trial measure of the decision bound. This allows direct measurement of the bound at decision time within individual trials. This, in turn, allows experimenters to determine whether the decision bound differs across decision time or fluctuates for the same decision time across trials. This is harder, although not impossible, to do with tasks in which decision bounds have to be estimated across multiple trials, especially when the SNR is changing.

      The authors use this paradigm to show that the decision bounds are mostly constant when the SNR is constant within and across trials. This has been shown indirectly before by fitting models with different parametric boundary shapes, but not directly by measuring the boundary separately for different decision times (but see Kira, Yang, and Shadlen, 2015). They also demonstrate that variability in these bound estimates arises from measurement noise rather than trial-by-trial variability in bound heights, something that could not have been done with previous paradigms.

      They furthermore replicate findings that subjects adjust their bounds, including weak collapse, to changing reward contingencies and SNRs, further validating their paradigm. And finally, the work demonstrates an apparent within-trial bound change if the SNR changes (predictably) mid-trial, as predicted by their previous work (Barendregt et al., 2022). This is -- to our knowledge -- the first confirmation of this prediction.

      Weaknesses:

      There are two non-technical weaknesses.

      First, comparison to optimal behavior was mainly qualitative; a quantitative comparison would greatly strengthen the work.

      Second (although not exactly a weakness), the work does not leverage the full potential of trial-by-trial estimates of the decision bound, which is a missed opportunity. To our understanding, the only finding that relied on trial-by-trial access to the bound was that the variability in the bound estimate was a major source of measurement noise. Their finding that the bound changes in response to reward contingencies and SNR, on the other hand, did not require such a trial-by-trial estimate. However, with this task (and not standard paradigms), the authors could determine how the bounds change during learning, which would give insight into the learning rules that participants use to adjust their bounds.

      There are also a few technical issues.

      (1) The authors argue that they don't observe a collapsing bound when the SNR varied across blocks (Figure 5). However, they only seem to perform this analysis on the difference in boundaries between trials with different SNRs (Figures 5B, D). Observing a zero difference implies that the boundary shape is the same across SNRs, but does not rule out a collapse.

      (2) The evidence for a within-trial boundary change for conditions with a within-trial SNR change could be stronger. The data shown in Figures 6C, D are very noisy, and there are no error bars. For individual participants, is the estimated change in bound larger than the variability in bound estimates before and after the SNR changepoint? Are there potentially other measures that could be used to make the point of a clear change in boundary within individual trials more convincing?

      (3) The work assumes that bound height estimates are biased due to the bounded accumulation nature of the decision process, and it corrects for these biases with a simulation-based correction (Methods and Figure 7). To our understanding, this correction assumes that the decision time is the first time that this boundary is crossed. However, the authors do not demonstrate that this is the strategy that participants use; they need to explicitly rule out the possibility that there are significant pigeon excursions across the boundary before the decision time.

      (4) The authors did not consider other stopping rules, such as a decision based on the last few trials. Showing that a stopping rule based purely on the bound fits the data better than other possible rules would strengthen the manuscript.
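
      The check requested in point (3) could be sketched as follows, assuming each trial is stored as a list of pigeon positions together with the index of the decision, and that a single bound estimate applies to all trials passed in. The names `first_crossing_index` and `excursion_fraction` and this data layout are hypothetical and are not taken from the manuscript.

```python
def first_crossing_index(path, bound):
    """Return the index of the first sample with |position| >= bound, or None."""
    for i, x in enumerate(path):
        if abs(x) >= bound:
            return i
    return None


def excursion_fraction(trials, bound):
    """Fraction of trials in which the bound was crossed before the decision.

    trials: list of (path, decision_index) pairs (assumed layout).
    A trial counts as an excursion if its first crossing occurs strictly
    before decision_index, i.e. the decision was not made at first passage.
    """
    if not trials:
        return 0.0
    n_excursions = 0
    for path, decision_index in trials:
        crossing = first_crossing_index(path, bound)
        if crossing is not None and crossing < decision_index:
            n_excursions += 1
    return n_excursions / len(trials)
```

      A fraction near zero would be consistent with the first-passage assumption behind the simulation-based correction, while a substantial fraction would suggest that participants sometimes let the pigeon cross the bound before committing.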

    1. Reviewer #1 (Public review):

      This manuscript presents compelling evidence from a wild chickadee population linking heritable spatial cognition to extra-pair paternity success, supporting sexual selection via good genes in a food-caching species. The integration of RFID cognition tests with ddRAD paternity assignment is methodologically strong and timely for behavioral ecology, though causal mechanisms and confounds warrant clarification.

      Overall, a major revision of the manuscript is recommended, addressing the points below.

      (1) Confirmation of manipulation and treatment effects. The central claim hinges on spatial cognition driving EP siring, but direct evidence that cognition predicts observed copulations (vs. post-copulatory mechanisms) is absent. While territories do not cluster by performance (Figure S4), quantify male aggression/movement data during fertile periods to rule out intrusion-based EPP. The authors should provide metrics like nearest-neighbor distances for EP sires or playback responses linking cognition to dominance, as in prior chickadee work. Without this, causal female preference remains correlational.

      (2) Female cognition-EPY link inconsistency. Poor female cognition predicts more EPY (first-20-trials: offspring-level χ²=6.21, P=0.013; nests: χ²=6.79, P=0.009), but not for full-task (P>0.5). The authors should discuss why (e.g., learning speed vs. memory stability) and add exploratory correlations (female errors vs. EPY proportion). They should soften claims in the Discussion section of "female-driven" without consistent support and should frame this as a hypothesis.

      (3) Cognitive task sensitivity and validity. Mean errors aggregate learning curves effectively, but single feeder-assignment (non-preferred) confounds neophobia/motivation with spatial ability. The authors should report trial-by-trial improvements (Figure S7 subset) or criterion-to-learn metrics. Justify excluding high-error birds (<3 mean); sensitivity analysis needed to check bias toward high performers.

      (4) Paternity assignment robustness. ddRAD-CERVUS with bimodal LODs (Figure S8) is solid, but unassigned EPY (social-genotyped but no sire) implies missing sires (~?% of EPY?). Include all alive males as candidates yearly? Test power simulations for LOD thresholds. 2019 exclusion justified, but multi-year SNP alignment could boost resolution.

      (5) Mechanistic speculation vs. data. Discussion invokes hippocampus genes (GWAS priors) and good genes, but no offspring cognition/survival data. Label as hypotheses; suggest tracking EPY recruitment. No brood size costs for EP sires is key, but monitor long-term nest investment (e.g., feeding rates).

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors ask whether spatial cognition is under sexual selection in mountain chickadees. To do so, the authors examined a large dataset that includes a) spatial cognition data for both males and females (obtained via use of a clever RFID-based feeder system) and b) social and extra-pair paternity nesting data. As predicted, males with higher spatial cognition sired more extra-pair offspring, and extra-pair sires had, on average, higher spatial cognition scores than the males they cuckolded. Interestingly, females with lower spatial cognition scores were more likely to seek extra-pair copulations, potentially to compensate for their own low spatial cognition. Surprisingly, there was no difference in spatial cognition scores between males that sired their own offspring and those that lost paternity at the nest. Also surprising was the fact that there were no differences in patterns of extra-pair paternity and spatial cognition between high- and low-elevation sites. The latter is particularly surprising in that spatial cognition should be under stronger selection at the high-elevation site. Overall, this is a fascinating study that demonstrates that spatial cognition - a trait under natural selection as it directly impacts winter foraging and survival behaviour - is also under sexual selection.

      Strengths:

      The authors have a robust dataset (n = 732 offspring sampled over 3 years), high-quality spatial cognition data collected with a procedure that has been well-honed over the years, and couple the data with solid statistical procedures that address many potential covariates and potentially confounding factors. In addition, the authors are careful in the discussion to elaborate on the many potential alternative explanations for the results and questions that are likely to arise in the minds of readers (e.g., how are females assessing male spatial ability?).

      Weaknesses:

      Overall, no major weaknesses were identified in this study. As always, there are editorial issues that I would encourage the authors to consider, including presentation of data/results and clarification on some statistical issues. Overall, however, this is an excellent study that will make an important contribution to our understanding of the evolution of cognition and targets of sexual selection.

    3. Reviewer #3 (Public review):

      Summary:

      The authors presented evidence that spatial cognition in this population is under sexual selection, with extra-pair males, primarily chosen by the females, having better spatial cognition than males they cuckolded and males with better spatial cognition having more extra-pair young.

      Strengths:

      This cognitive ecology study was conducted on a well-known long-term study population of free-ranging mountain chickadees. This strong base, alongside a thorough study design and extensive statistical analyses, enabled the authors to address research questions that few other labs can address, making this a potentially powerful study of broad general interest.

      Weaknesses:

      Throughout the manuscript, there is a focus on the "mean number of location errors per trial over the first 20 trials". Performance changes across trials, so why weren't learning vs peak performance analyzed separately? Similarly, the authors sometimes describe results in the context of the entire task and sometimes in the context of the first 20 trials - why is one prioritised over the other, and why is the emphasis not always consistent? Are the results across the two generally the same? A more thorough explanation addressing all these points is necessary.

      Lines 429-432: Why was a categorical (i.e., chi-square test) and not a numerical comparison implemented? A numerical statistical test would capture more of the variation (i.e., the number of years separating the social and EPY males).

    1. Reviewer #1 (Public review):

      Summary:

      This is important and significant work because it helps describe the complexity of interactions between system components where two herbivores interact with vegetation. Whereas other studies have shown that the larger ungulate (yaks, Bos grunniens, in this case) can facilitate the abundance and population growth of the smaller (the semi-fossorial lagomorph, Ochotona curzoniae, plateau pika hereafter), this study flips the tables and shows that, at least under some conditions, moderate densities of the plateau pika facilitate the nutritional condition of yaks.

      The study was not designed to investigate the reasons that pikas clip Stellera chamaejasme. That said, based on other studies and general knowledge of the ecology of these pikas, it is likely that they clip (although do not eat) this plant because its relatively large size hinders predator detection. This species of pika does better where vegetation height is low than where it is higher.

      Strengths:

      Notably, the strong inference the authors can claim for their results is supported by the careful experimental design. A weaker paper would have simply noted correlations between pika burrow density and yak feeding efficiency without experimental removal. This paper, to its credit, not only used experimental removals but also documented the various intermediary results that support the ultimate conclusions. The statistical approaches used appear to be appropriate. (Readers are encouraged to read the full Materials and Methods, which are available in the Supplementary Materials section.)

      Weaknesses:

      Although the study was well designed and executed, and its conclusions appear strongly supported, readers interested in the management implications of the Qinghai-Tibetan Plateau should be mindful of its limitations. First, the study site, at approximately 3,200 m elevation, was relatively low by Qinghai-Tibetan Plateau standards. Stellera chamaejasme becomes less common at elevations > 4,000 m, where a majority of livestock grazing occurs. Thus, it would be instructive to learn, through follow-up studies, whether similar facilitation occurs where unpalatable (and mildly poisonous) species in such genera as Astragalus, Oxytropis, and Thermopsis replace S. chamaejasme as the problematic plant for pastoralists. Second, the authors make no mention of wild ungulates, so it is unclear what, if any, role they may have played in this system. At least one study in Qinghai Province, albeit at a slightly higher elevation, showed that not only pikas, but also Tibetan gazelles (Procapra picticaudata), which were commonly observed on grazed pastures, grazed more frequently on some dicots avoided by domestic sheep than did the livestock themselves (Harris et al. 2015). It would also be instructive to learn if similar facilitation as observed here applied to the other principal livestock species in the area, domestic sheep (which are often herded together with smaller numbers of domestic goats). Finally, as suggested by this study, the interactions between all components of the system are complex and interactive. If pika facilitation of yak nutrition at the densities documented results in herders increasing yak density, might the increased herbivory from the domestic animals provide the conditions for the pika population to increase beyond the densities observed here, and thus toward the levels where facilitation yields to competition?

      Citation:

      Harris RB, Wang WY, Badinqiuying, Smith AT, Bedunah DJ (2015) Herbivory and Competition of Tibetan Steppe Vegetation in Winter Pasture: Effects of Livestock Exclosure and Plateau Pika Reduction. PLoS ONE 10(7): e0132897. doi:10.1371/journal.pone.0132897

    2. Reviewer #2 (Public review):

      Summary:

      This study uses a combination of field sampling and manipulative experiments to test for facilitative impacts of pikas on yaks via suppression of a poisonous forb. The authors found that, when Stellera forbs were present, yak weight increases over the growing season were greater in the presence of pikas compared to in their absence. This occurred because, although pikas do not consume Stellera, they clip it and use it in nest/burrow construction, thereby decreasing its relative abundance in the plant community. Thus, overall, the study contributes to our understanding of how herbivores of different size classes indirectly affect each other via the use of shared resources.

      Strengths:

      It is well known that large herbivores on grasslands impact smaller animals, but the reciprocal interaction is rarely tested. Thus, this study asks a valuable question, and the experiment is well-designed to test it. The authors also do a good job of demonstrating the potential conservation impacts of their research.

      Weaknesses:

      What the authors tested is really cool, but their claims go far beyond what they can say based on their experimental design. For example, the authors claim to show that pika impacts on yaks display density-dependent transitions from competition to facilitation. However, their experiment only looked at the presence (at moderate densities) and absence of pikas, and they only tested for facilitation, not competition.

      The paper would also benefit from changes to the framing in the introduction and discussion. For example, the authors pitch the work as a test of the stress-gradient hypothesis. However, there is no abiotic stress gradient in the study, which is an essential component of the SGH. They also pitch the work in terms of density dependence, but there is no significant variation in population densities beyond the presence-absence binary. The paper would be stronger if they focused their framing around the literature on facilitative interactions across mammals of different size classes, especially indirect facilitation via use of shared resources, which is what this paper is really about.

      Finally, the paper has significant weaknesses in the experimental and statistical methodology. Most importantly, there are inconsistencies in what is visualized in the figures compared to the model results. For example, the results section in several places notes a lack of significant interaction terms in the model but shows interactions in the p-values on the figures. The authors also plot smoothed lines rather than their model results and then draw interpretations from those lines that cannot be tested in the models that they used. There are also missing details that are important for model interpretation, including the distributions used and the sample sizes. Another major concern with experimental design is in the forage nutrient analyses. The authors picked plants along a grazing trail, then measured nutrient content without standardizing based on plant species, so any differences across treatments could be because of what they happened to grab rather than overall forage quality.

    1. Reviewer #1 (Public review):

      The multi-species approach of testing the model in macaque and mouse is excellent, as it improves the chances that the observed findings are a general property of mammalian visual cortex. It would be useful, however, to delineate any notable differences between these species, which are to be expected given their lifestyles.

      The overall performance of the model appears to be excellent in V1, with over 80% performance, but falls substantially in V4. It would be important to consider the implications of this finding; for example, in the context of studying temporal lobe structures that are central to recognizing objects. Would one expect that model performance decreases further here, and what measures could be taken to avoid this? Or is this type of model better restricted to V1 or even LGN?

      While the manuscript delineates novel axes of inhibitory interactions, it remains unclear what exactly these axes are and how they arise. What are the steps that need to be taken to make progress along these lines?

      Comments on revised version.

      The authors have adequately addressed the points I raised in my review during the revision.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed all the comments raised in the previous round of review. The revised manuscript includes new labeling experiments revealing boundary compression at the cardiac poles consistent with the authors' predicted dynamic model of heart tube formation.]

      Summary:

      The study by Raiola et al. conducted a quantitative analysis of tissue deformation during the formation of the primitive heart tube from the cardiac crescent in mouse embryos. Using the tools developed to analyze growth, anisotropy, strain, and cell fate from time-lapse imaging data of mouse embryos, the authors elucidated the compartmentalization of tissue deformation during heart tube formation and ventricular expansion. This paper describes how each region of the cardiac tissue changes to form the heart tube and ventricular chamber, contributing to our understanding of the earliest stages of cardiac development.

      In order to understand tissue deformation in cardiac formation, it is commendable that the authors effectively utilized time-lapse imaging data, a data pipeline, and in silico fate mapping. The study clarifies the compartmentalization of tissue deformation by integrating growth, anisotropy, and strain patterns in each region of the heart.

    2. Reviewer #2 (Public review):

      The authors address an important challenge in developmental biology: the quantitative description of tissue deformation during organogenesis. They have developed a new pipeline to quantify early heart tube morphogenesis in the mouse, with cellular resolution. They adopt an elegant approach by integrating multiple 3D time-lapse datasets into a dynamic atlas of cardiac morphogenesis in order to compute spatio-temporal deformation patterns. The main findings highlight a strong compartmentalization of cell behaviors, with tissue growth and anisotropy exhibiting complementary and spatially segregated patterns. Using these data, the authors developed an in-silico fate mapping tool to interrogate cell displacement within the myocardium. This virtual model provides new mechanistic insights into how the bilateral cardiac primordia converge and transform into a three-dimensional heart tube. The authors identify "belt-like" constraints at the arterial and venous poles that prevent tissue expansion and thus shape the ventricular barrel morphology.

      The computational framework is highly innovative and impressive, providing an unprecedented 3D model of tissue deformation during heart morphogenesis. It also opens avenues for testing hypotheses regarding tissue growth and the forces that cause cell motion.

      Overall, this carefully performed study provides a new model for exploring tissue deformation during organogenesis and will be of broad interest to computational and developmental biologists.

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript by Raiola and colleagues entitled "Quantitative computerized analysis demonstrates strongly compartmentalized tissue deformation patterns underlying mammalian heart tube formation" takes a highly quantitative approach to interrogating the earliest stages of cardiogenesis (12 hours, from early cardiac crescent to early heart tube) in a new and innovative way. The paper presents a new computational framework to help identify both regional and temporal patterns of tissue deformation at cellular resolution. The method is applied to live embryo imaging data (newly generated and from the group's previous pioneering work). In the initial setup, the new model was applied directly to raw time-lapse data, and the results were compared to actual cell tracks identified manually, showing close correlations of the model with the manual tracking. Next, they integrated spatial and temporal information from different embryos to generate a new model for tissue movement, driven by parameters such as tissue growth and anisotropy. Key findings from their model suggest that there are distinct compartments of tissue deformation patterns as the bilateral cardiac crescent develops into the linear heart tube, and that the ventricular chamber forms by a defined expansion pattern, as a 'hemi-barrel shape', with the arterial and venous poles (IFT and OFT) acting as the harnessing belts constraining the expansion of the chamber further. Lastly, the model is tested for its ability to predict future residence of cardiac crescent cells in the heart tube, which it seems to be able to do successfully based on fate tracking validation experiments.

      The manuscript provides an exceptionally careful analysis of a critical stage during heart development - that of the earliest stages of morphogenesis, when the heart forms its first tube and chamber structures. While numerous studies have interrogated this stage of heart development, few studies have performed time-lapse imaging, and, to my knowledge, no other report has performed such an in-depth quantitative analysis and modeling of this complex process. The computational model applied to normal heart development of the myocardium (labelled by Nkx2-5) has revealed multiple new and interesting concepts, such as the distinct compartments of tissue deformation patterns and the growth trajectories of the emerging ventricle. The fact that the model operates at cellular resolution and over a nearly continuous time period of approximately 12 hours allows for unprecedented depth of the analysis in a largely unbiased manner. Going forward, one can imagine such models revealing additional information on these processes, performing analyses of subpopulations that form the heart, and maybe most importantly, applying the model to various perturbation models (genetic or otherwise). The manuscript is very well written, and the data display is accessible and transparent.

      No major weaknesses are noted with the study. It would have been very exciting to see the model applied to any kind of perturbation, for example, a left-right defect model, or a model with compromised cardiac progenitor populations. However, the amount of live imaging required for such analyses renders this out of scope for the current study.

    1. Reviewer #1 (Public review):

      [Editors' note: This version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The manuscript by Rayan et al. aims to elucidate the role of RNA as a context-dependent modulator of liquid-liquid phase separation (LLPS), aggregation, and bioactivity of the amyloidogenic peptides PSMα3 and LL-37, motivated by their structural and functional similarities.

      Strengths:

      The authors combine extensive biophysical characterization with cell-based assays to investigate how RNA differentially regulates peptide aggregation states and associated cytotoxic and antimicrobial functions.

    2. Reviewer #2 (Public review):

      In this paper, Rayan et al. report that RNA influences the cytotoxic activity of the staphylococcal secreted peptide cytolysin PSMalpha3 against human cells and E. coli by impacting its aggregation. The authors used sophisticated methods of structural analysis and describe the associated liquid-liquid phase separation. They also compare this with the influence of RNA on the aggregation and activity of LL-37, which differs from its influence on PSMalpha3.

      Major comments on the previous version:

      (1) The premise, as stated in the introduction and elsewhere, that PSMalpha3 amyloids are biologically functional, is highly debatable and has never been conclusively substantiated. The property that matters most for the present study, cytotoxicity, is generally attributed to PSM monomers, not amyloids. The likely erroneous notion that PSM amyloids are the predominant cytotoxic form is derived from an earlier study by the authors that has described a specific amyloid structure of aggregated PSMalpha3. Other authors have later produced evidence that, quite unsurprisingly, indicated that aggregation into amyloids decreases, rather than increases, PSM cytotoxicity. Unfortunately, yet other groups have in the meantime published in-vitro studies on "functional amyloids" by PSMs without critically challenging the concept of PSM amyloid "functionality". Of note, the authors' own data in the present study that show strongly decreased cytotoxicity of PSMalpha3 after prolonged incubation are in agreement with monomer-associated cytotoxicity as they can be easily explained by the removal of biologically active monomers from the solution.

      In their revision and in the rebuttal, the authors have further described their concept regarding what they call "functionality" of PSMalpha3 amyloids. They now admit that monomers are the active cytolytic form, like other researchers have stressed, whereas amyloids are not. This represents a considerable difference to earlier papers in which they ascribed functionality, i.e. cytolytic capacity, to PSMalpha3 amyloids, a claim that has raised considerable controversy. Now, they use the term "functional" to describe that PSMalpha3 amyloids, while not cytolytic, can be reversed to a cytolytic monomeric state, calling them a "dynamic reservoir". There is no evidence that such a reservoir is necessary for the cytolytic activity of the monomers to be established; also, there is no evidence that in a biological system, such an amyloid reservoir exists. To continue calling PSMalpha3 amyloids "functional" based on this - considerably changed - concept of the authors appears inappropriate, given the finally admitted absence of cytolytic activity of the PSM amyloids in addition to the continuing complete lack of evidence of any biological relevance of PSM amyloid formation.

      (2) That RNA may interfere with PSM aggregation and influence activity is not very surprising, given that PSM attachment to nucleic acids - while not studied in as much detail as here - has been described. Importantly, it does not become clear whether this effect has biologically significant consequences beyond influencing, again not surprisingly, cytotoxicity in vitro. The authors do show in nice microscopic analyses that labeled PSMalpha3 attaches to nuclei when incubated with HeLa cells. However, given that the cells are killed rapidly by membrane perturbation by the applied PSM concentrations, it remains unclear and untested whether the attachment to nucleic acids in dying cells makes any contribution to PSM-induced cell death or has any other biological significance.

    1. Reviewer #1 (Public review):

      Summary:

      Here the authors address the organization of reach-related activity in layer 2/3 across a broad swath of anterodorsal neocortex that included large subregions of M1, M2, and S1. In mice performing a novel variant water-reaching task, the authors measured activity using two-photon fluorescence imaging of a GECI expressed in excitatory projection neurons. The authors found a substantial diversity of response patterns using a number of metrics they developed for characterizing the PETHs of neurons across reach conditions (target locations). By mapping single-neuron properties across cortex, the authors found substantial spatial variation, only some of which aligned with traditional boundaries between cortical regions. Using Gaussian mixture models, the authors found evidence of distinct response types in each region, with several types prominent in multiple cortical regions. Aggregating across regions, four primary subpopulations were apparent, each distinct in their average response properties. Strikingly, each subpopulation was observed in multiple regions, but subpopulation members from different regions exhibited largely similar response properties.

      Strengths:

      The work addresses a fundamental question in the field that has not previously been addressed at cellular resolution across such a broad cortical extent. I see this as truly foundational work that will support future investigation of how the rodent brain drives and controls reaching.

      The quantification is thoughtful and rigorous. It is great that the authors provide explanation for and intuition behind their response metrics, rather than burying everything in the Methods.

      The Discussion and general contextualization of the Results are thorough, thoughtful, and strong. It is great that the authors avoid the common over-interpretation of classical observations regarding cortical organization that is endemic in the field.

      All things considered, this is the best paper regarding spatial structure in the motor system I have ever read. The breadth of cellular resolution activity measurement, the rigor of the quantification, and the clear and open-minded interrogation of the data collectively have produced a very special piece of work.

      Weaknesses:

      There are two important issues left unaddressed that the authors plan to address in their future work. The first is the relation between observed neural activity patterns and movement kinematics, and in particular how much the activity variation across target locations may relate to the kinematic differences across these different conditions, as opposed to true higher-order movement features like reach direction. The second issue is how to interpret the results in relation to existing ideas about behavioral organization in motor/premotor cortex.

      Comments on revised version:

      The authors have done an excellent job addressing my previous concerns. I have no additional concerns with the manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      The functional parcellation of cortical areas is a critical question in neuroscience. This is particularly true in frontal areas in mice. While sensory areas are relatively well characterized by their tuning to sensory stimuli, the situation is much less clear for motor areas. This has become even more ambiguous since recent studies using large-scale neuronal recordings consistently report mixed sensory and motor-related activity throughout the brain, and motor mapping studies have shown that movements evoked by cortical stimulation are by no means limited to motor areas alone. Here, the authors use a correlation approach combining large-scale functional imaging at cellular resolution with movement tracking in mice executing a reaching task. Across multiple recording sessions in the same animals, the authors have imaged a large portion of the sensorimotor cortex at cellular resolution in mice performing a reaching task, recording the activity of nearly 40,000 neurons. By aligning the calcium signal of each neuron to three task events (the Go cue triggering the reach, the onset of paw lift, and the contact between the paw and the target) for different target positions, the authors identified different response patterns distributed differently across cortical areas. They defined a set of features that describe the neurons' response pattern, representing the temporal dynamics and tuning properties for the different target positions. These features were used to construct cortical maps, and the authors show that, interestingly, gradient maps obtained from the first derivative of the feature maps reveal sharp discontinuities at the boundaries between anatomically defined cortical areas. Using dimensionality reduction of the neuronal response features, the authors found that, despite clear differences in their average response properties, individual neurons from the same cortical areas do not form distinct clusters in the reduced-dimensional space. In fact, most areas contain heterogeneous neuronal populations, and most neuronal populations are present in multiple areas, albeit in different proportions. Interestingly, the authors identified four neuronal subpopulations based on the distance between the components of the Gaussian mixture model used to model the distribution of neurons within each area. One of these subpopulations is almost exclusively represented in the anterior M2 cortex, while another is broadly distributed across the different areas.

      Strengths:

      This article is based on an impressive dataset of nearly 40,000 neurons covering a large portion of the sensorimotor cortex and on innovative analytical approaches. This study is likely the first to clearly demonstrate boundaries between cortical areas defined based on the responses of individual neurons. This innovative approach to functional mapping of cortical areas potentially opens up new perspectives for higher-resolution mapping of frontal cortical areas, using a broader repertoire of sensory and motor evoked responses.

      Weaknesses:

      One limitation of this study - inherent in most cell imaging studies - is that it only takes into account the activity of neurons in superficial cortical layers. One might think that taking into account neuronal activity across the different layers would allow for an even finer functional cortical segmentation.

      Comments on revised version:

      The authors have answered all my questions and this new version has largely improved in clarity.

    1. Reviewer #1 (Public review):

      Summary:

      Simoens and colleagues use a continuous estimation task to disentangle learning rate adjustments on shorter and longer timescales. They show that participants rapidly decrease learning rates within a block of trials in a given "location", but that they also adjust learning rates for the very first trial based on information accrued gradually about the statistics of each location, which can be viewed as a form of metalearning. The authors show that the metalearned learning rates are represented in patterns of neural activity in the orbitofrontal cortex, and that prediction errors are represented in a constellation of brain regions including ventral striatum, where they are modulated by expectations about error magnitude to some degree. The work opens the door to future work focusing on how exactly these signals contribute to adaptive behavior.

      Strengths:

      The authors build on an interesting task design allowing them to distinguish moment-to-moment adjustments in learning rate from slower adjustments in learning rate corresponding to slowly gained knowledge about the statistics of specific "locations". Behavior and computational modeling clearly demonstrate that individuals adjust to environmental statistics in a sort of metalearning. fMRI data reveal representations of interest including those related to adjusted learning rates and their impact on the degree of prediction error encoding in the striatum.

      Weaknesses:

      It was nice to see that the authors could distinguish differences between the OFC signals that they observed and those in the visual regions based on changes through the session. However, the linkage between these brain activations and a functional role in generating behavior remains somewhat unclear, opening the door for alternative interpretations.

      Comments on revised version:

      I appreciate the authors' responses, and they have largely addressed my concerns. I understand the concerns about power with regard to the individual differences/behavioral analyses included in the rebuttal. However, my personal view, which is perhaps a matter of taste, is that the paper would benefit from a description of these results - along with a clear description of why the authors are hesitant to draw a strong interpretation from the negative result.

    2. Reviewer #2 (Public review):

      Summary:

      Across two experiments, this work presents a novel spatial predictive inference paradigm that facilitates the investigation of meta-learning across multiple environments with distinct statistics, as well as more local learning from sequences of observations within an environment. The authors present behavioral data indicating that people can indeed learn to distinguish between noise levels and calibrate their learning rates accordingly across environments, even on initial trials when revisiting an environment. They complement their behavioral results with computational modeling, further bolstering claims of both local and global adaptation. Additional fMRI results support the role of OFC in this meta-learning process, with central OFC activity reflecting similarity between environments. This similarity emerges over time with task experience. Holistically, this paradigm and these data add to our understanding of how humans dynamically adapt their behavior on different timescales.

      Strengths:

      The novel paradigm represents a clever and creative expansion of spatial predictive inference tasks. The cover story was well chosen to facilitate an intuitive understanding of both the differences between environments, and the estimation of the mean within environments.

      Additionally, the authors present complementary results from two experiments, which strengthens the behavioral findings. This is especially effective as the initial experiment's results were a bit noisy, and the modifications within the second experiment increased both power and the specificity/accuracy of participant predictions. Taken together, the behavioral results provide convincing evidence that participants did distinguish environments based on their underlying statistics and adapted their initial behavior accordingly.

      Beyond this, the combination of behavioral results, computational modeling, and neuroimaging enhances the impact of the work. It paints a fuller picture of whether and how humans meta-learn the global statistics of environments, and this is an important direction for the field of adaptive learning.

      Weaknesses:

      Throughout much of the paper, the authors refer to the distinctions between environments primarily as differences in "initial learning rates" or "environment-specific learning rates." The optimal initial learning rate did indeed differ across environments -- the result of differences in underlying task statistics. These differences in task statistics result in distinct optimal initial learning rates and also vary with aspects of spatial position (e.g. vertical position in the example figure). The authors convincingly show that OFC activity increasingly reflects these variables throughout task experience. Given that these variables vary together, future work will be needed to distinguish whether particular variables drive these dynamics, or whether together they combine to evoke the representational differences.

      The current work is also quite suggestive of meaningful individual differences in both local and global adaptive learning, in line with other prior work on predictive inference. This is perhaps underexplored in this data set, but certainly leaves the topic ripe for follow-up going forward.

      Finally, more information on all clusters that survived multiple comparisons correction would be useful, even in the absence of a priori hypotheses. For instance, there is commentary in the discussion section on the ACC, but this is not mentioned in the results, and it is unclear whether there were other undescribed clusters that survived correction.

    1. Reviewer #1 (Public review):

      Summary:

      Cai et al. investigated the role of hippocampal ripples, and of ripples coupled between the hippocampus and the neocortex, in visual short-term memory (VSTM) using a match-to-sample task with similar lures. The main findings are that hippocampal, but not neocortical, ripples ramp up during the maintenance period, peaking shortly before the memory response is given. This ramping-up effect was stronger for correct compared to incorrect trials. Furthermore, the authors show that stimulus category could be better decoded during coupled hippocampo-neocortical ripples compared to uncoupled ripples. These results provide compelling novel evidence for a role of ripples in supporting human visual short-term memory.

      Strengths:

      (1) State-of-the-art intracranial EEG in 13 patients during a well-designed visual short-term memory task, with simultaneous hippocampal and neocortical recordings.

      (2) Thorough analysis pipeline with validation to detect ripple events, and distinguish them from spurious ripple activity (i.e., as induced by IEDs).

      (3) Use of multivariate classifiers to resolve the neural representation of the stimuli.

      Weaknesses:

      It is difficult to find clear weaknesses in this paper, as the analyses are thorough, the results are clear, and the writing is excellent. However, some more sanity checks on the validity of ripples could have been conducted (i.e., making sure that ripple events have multiple peaks in the unfiltered raw signal at the ripple frequency). Also, the time window for coupled ripples appears to be a bit long, which makes it questionable to what degree these ripples are coupled (i.e., the time window is ~5 times longer than the duration of a ripple event). Lastly, the ramping-up effect could have been more clearly depicted in the figures, but that's a fairly minor point.
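
      The sanity check suggested above (multiple peaks in the unfiltered signal at the ripple frequency) could be approximated along the following lines. This is only a minimal sketch, not the authors' pipeline: the function names, the 80-120 Hz band, the three-peak threshold, and the 1000 Hz sampling rate are all illustrative assumptions, and real recordings would likely need light smoothing or a peak-prominence criterion before counting peaks.

```python
import math

def count_local_maxima(samples):
    """Count local maxima in a sequence of raw (unfiltered) samples."""
    return sum(
        1
        for i in range(1, len(samples) - 1)
        if samples[i - 1] < samples[i] >= samples[i + 1]
    )

def has_multiple_cycles(raw_segment, fs, band=(80.0, 120.0), min_peaks=3):
    """Hypothetical check: enough peaks, at a rate inside the assumed ripple band.

    raw_segment: unfiltered samples spanning one candidate event
    fs: sampling rate in Hz
    band, min_peaks: illustrative thresholds, not taken from the manuscript
    """
    duration = len(raw_segment) / fs          # event duration in seconds
    n_peaks = count_local_maxima(raw_segment)
    peak_rate = n_peaks / duration            # peaks per second
    return n_peaks >= min_peaks and band[0] <= peak_rate <= band[1]

# Synthetic illustration (assumed sampling rate of 1000 Hz, 50 ms events).
fs = 1000.0
ripple_like = [math.sin(2 * math.pi * 100.0 * n / fs) for n in range(50)]
single_bump = [math.exp(-((n - 25) / 5.0) ** 2) for n in range(50)]

print(has_multiple_cycles(ripple_like, fs))   # oscillatory event
print(has_multiple_cycles(single_bump, fs))   # single sharp transient
```

      A single sharp deflection, as an IED might produce, contributes one peak and fails the check, whereas a genuine oscillation at the ripple frequency contributes one peak per cycle.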

    2. Reviewer #2 (Public review):

      Summary:

      Liu et al. record intracranial EEG from the hippocampus and lateral temporal lobe in thirteen neurosurgical patients while they perform a delayed match-to-sample visual short-term memory task. The central question is whether hippocampal sharp-wave ripples (brief high-frequency oscillations well established in the long-term memory consolidation literature) also contribute to the active maintenance of visual representations over a short delay. The authors report three main findings: hippocampal ripple rates progressively ramp up across the 7-second maintenance period, hippocampal ripples temporally co-occur with ripples in the lateral temporal lobe, and these coupled events coincide with above-chance category-level decoding of the memorized stimulus in the lateral temporal lobe. The findings are interpreted within the dynamic coding framework of working memory, which predicts discrete reactivation bursts rather than sustained firing during maintenance. The question is timely, and the use of intracranial recordings affords a level of temporal and spatial resolution unavailable to non-invasive methods.

      Strengths:

      The study addresses a genuinely important and underexplored question: whether a neural mechanism best characterized in the context of offline memory consolidation is also engaged during active online maintenance. The use of intracranial recordings in humans is well suited to this question, providing the millisecond temporal resolution and regional specificity needed to detect transient high-frequency events. The dissociation from long-term memory, tested by splitting remembered trials according to whether the item was later recalled in a cued-recall test, directly addresses what would otherwise be a significant confound, and the finding that ripple dynamics during maintenance are unrelated to subsequent long-term memory performance adds specificity to the interpretation. The coupled ripple analysis is methodologically grounded, and the finding that coupled but not isolated ripples coincide with elevated memory decoding is mechanistically informative. The multivariate decoding approach applied to lateral temporal lobe spectral power provides a meaningful index of memory reactivation that goes beyond simple univariate rate measures. The control analysis and the alternative ripple detection method provide useful robustness checks. The public availability of preprocessed data and analysis code on OSF is commendable.

      Weaknesses:

      (1) Theoretical motivation for examining ripples in visual short-term memory.

      A fundamental question that the paper does not adequately address is why hippocampal ripples, a mechanism strongly associated with offline memory consolidation during sleep, where they coordinate the transfer of hippocampal representations to cortex through temporally compressed replay, should be recruited for the online maintenance of visual information over a seconds-long delay. The Introduction acknowledges this gap but does not close it. The dynamic coding framework is used to motivate the ramping-up prediction, but this framework is agnostic about the specific neural mechanism responsible for reactivation bursts. In particular, the literature cited by the authors predicts high-frequency population activity or gamma bursts, but not specifically hippocampal ripples. The reasoning that "ripples share key properties with postulated reactivation bursts" risks being circular: it amounts to saying that ripples could be the relevant mechanism because the relevant mechanism has properties that ripples also have. A stronger theoretical motivation would require either evidence that the replay or reactivation computations that ripples support during offline states are also engaged during active short-term maintenance, or a mechanistic account of how the circuit processes underlying ripple generation are recruited differently across these two contexts.

      This concern is compounded by what the authors present as one of their main controls. The finding that ripple dynamics during maintenance are not associated with subsequent long-term memory performance is treated as a reassurance that the observed effects are specific to short-term memory. But if ripples are canonically a long-term memory consolidation mechanism, the observation that they are engaged by a short-term memory task while appearing disengaged from concurrent long-term memory encoding is itself a finding that demands explanation. Resolving this tension is important for the paper's contribution to be correctly interpreted by the field.

      (2) Ripple detection and specificity.

      Even granting that ripples could in principle contribute to short-term memory maintenance, the study does not establish that the detected events are physiological sharp-wave ripples rather than broadband high-frequency activity. The detection band (70-180 Hz) substantially overlaps with the high-gamma range, which is a well-established proxy for local neural population activity and coding, and is broader than the 80-120 Hz band used by several of the cited papers, including Vaz et al. (2019), Ngo et al. (2020), Chen et al. (2021), Staresina et al. (2023), and Kunz et al. (2024). Without demonstrating that detected events have the hallmark features of physiological sharp-wave ripples, namely a clear narrowband spectral peak and characteristic waveform morphology, it is difficult to conclude that the observed effects reflect a ripple-specific mechanism rather than a more general high-frequency population activity phenomenon. The reported mean rate of 0.29 Hz is somewhat higher than rates reported in some recent work, such as Chen et al. (2021, ref 74) and Kunz et al. (2024, ref 15). It is worth noting that van Schalkwijk and Helfrich (2026, Nature Communications) demonstrated that a large proportion of awake ripple detections in the human medial temporal lobe reflect false positives arising from aperiodic 1/f noise, with task-related modulations of this noise floor producing spurious detections. The authors present an 80-120 Hz control analysis as a robustness check, but this inverts the appropriate logic: if 80-120 Hz is the more validated band, as the cited literature suggests, it should serve as the primary analysis rather than a supplementary one.

      (3) Internal inconsistency with the dynamic coding framework.

      The authors invoke the dynamic coding framework, which predicts that reactivation bursts should ramp up toward the end of the retention interval in the region where memory representations are actively maintained. The hippocampal ramping-up result is presented as confirming this prediction. However, the lateral temporal lobe, the region where above-chance category decoding is found and memory reactivation is attributed, shows no corresponding ramp-up. The authors acknowledge this asymmetry but do not offer a mechanistically satisfying explanation, and the suggestion that the effect might exist in unsampled subregions cannot be evaluated with the current data. This leaves the framework's core prediction unconfirmed in the region that is claimed to maintain the representations.

      (4) Coupled ripples, directionality of hippocampal-lateral temporal coupling, and the ramping-up paradox.

      The conclusion that coupled hippocampal-lateral temporal ripples coordinate memory reactivation creates a logical tension that the paper does not resolve. If hippocampal ripples drive lateral temporal reactivation only when co-occurring with lateral temporal ripples, and hippocampal ripples ramp up in a memory-predictive fashion, then the absence of lateral temporal ripple ramping up implies that the hippocampal ramp-up is not primarily expressed through the coupled ripple mechanism, undermining the coherence of the two main findings. The coupled ripple analysis further quantifies only temporal co-occurrence and provides no evidence about the direction of influence. Without demonstrating that hippocampal ripples systematically precede lateral temporal ripples (i.e., the expected signature of hippocampus-to-cortex information flow), the central claim that hippocampal ripples drive lateral temporal reactivation remains an interpretive assumption. Directly testing whether lateral temporal ripples specifically coupled to hippocampal ripples show a ramping temporal profile during maintenance (even if overall lateral temporal ripple rates do not) is necessary to establish whether the lateral temporal lobe engages in hippocampally-gated reactivation bursts in the manner the framework predicts. Additionally, reporting the distribution of peak lags between hippocampal and lateral temporal ripple peaks, and testing whether hippocampal ripples systematically precede lateral temporal ripples, is similarly necessary to support the directional interpretation.
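
      The peak-lag analysis requested in point (4) could take roughly the following form. This is a hedged sketch under stated assumptions, not the authors' method: the function names, the 100 ms matching window, and the toy peak times are hypothetical, and the sign test treats events as independent, which real ripples within a patient are not (a per-participant or mixed-model test would be more appropriate).

```python
from bisect import bisect_left
from math import comb

def nearest_lags(hpc_peaks, ltl_peaks, max_lag):
    """Lag (s) from each hippocampal ripple peak to the nearest lateral
    temporal peak, keeping only pairs within +/- max_lag.
    Positive lag: the lateral temporal peak follows the hippocampal peak."""
    ltl = sorted(ltl_peaks)
    lags = []
    for t in hpc_peaks:
        i = bisect_left(ltl, t)
        candidates = ltl[max(i - 1, 0):i + 1]   # neighbours on either side of t
        if not candidates:
            continue
        nearest = min(candidates, key=lambda s: abs(s - t))
        lag = nearest - t
        if abs(lag) <= max_lag:
            lags.append(lag)
    return lags

def sign_test_two_sided(lags):
    """Exact two-sided sign test on whether positive and negative lags are
    equally likely; zero lags are dropped."""
    pos = sum(1 for x in lags if x > 0)
    neg = sum(1 for x in lags if x < 0)
    n = pos + neg
    if n == 0:
        return pos, neg, 1.0
    k = min(pos, neg)
    p = min(1.0, 2 * sum(comb(n, j) for j in range(k + 1)) / 2 ** n)
    return pos, neg, p

# Toy peak times in seconds (made up purely for illustration).
hpc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ltl = [1.02, 2.03, 3.01, 4.05, 5.02, 6.04, 9.0]

lags = nearest_lags(hpc, ltl, max_lag=0.1)
pos, neg, p = sign_test_two_sided(lags)
print(len(lags), pos, neg, p)
```

      A lag distribution centred on zero would be consistent with mere co-occurrence, whereas a consistent positive shift would be the expected signature of hippocampus-to-cortex flow.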

      (5) Trial-level analysis clarity.

      The paper reports that ripples occurred in 54%, 79%, and 27% of trials during encoding, maintenance, and retrieval, respectively, but does not state whether subsequent analyses were conducted on trials thresholded by ripple occurrence. Given that occurrence rates vary substantially across stages and conditions, this inclusion criterion has implications for interpreting rate differences and should be stated explicitly.

      (6) Statistical model specification.

      The methods describe the ramping-up analysis using both a "logistic" link function and a "Poisson link function" in different places, with the dependent variable described inconsistently as ripple occurrence and ripple count. These are not equivalent, and the distinction matters for interpreting the reported coefficients. Additionally, the regional dissociation in Figure 3 appears to be assessed by fitting separate models to each region and comparing results informally. This does not constitute a direct test of whether slopes differ between regions and risks the well-known error of inferring a difference based on one p-value being significant while another is not. A direct region × time interaction test would more cleanly support the claimed dissociation.
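
      The two model specifications contrasted in point (6), and the interaction test it recommends, can be written out explicitly. The notation below is assumed for illustration and does not come from the manuscript: $y_{ijr}$ is the ripple count for participant $i$ in time bin $j$ and region $r$, $t_j$ is time into the maintenance period, $R_r$ is a 0/1 region indicator, and $u_i$, $v_i$ are participant-level random intercepts.

```latex
\begin{align}
% Ripple count as the outcome: Poisson model with a log link
\log \mathbb{E}[y_{ijr}] &= \beta_0 + \beta_1 t_j + \beta_2 R_r + \beta_3\, t_j R_r + u_i \\
% Ripple occurrence as the outcome: logistic model with a logit link
\operatorname{logit} \Pr(y_{ijr} > 0) &= \gamma_0 + \gamma_1 t_j + \gamma_2 R_r + \gamma_3\, t_j R_r + v_i
\end{align}
```

      In the first model, $\beta_1$ is a log rate ratio per unit time in the reference region; in the second, $\gamma_1$ is a log odds ratio, so the two coefficients are not interchangeable. In either model, the region × time dissociation corresponds to a direct test of the interaction term ($\beta_3 = 0$ or $\gamma_3 = 0$) rather than a comparison of two separately fitted slopes.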

    3. Reviewer #3 (Public review):

      Summary:

      Liu, He, et al. present results suggesting hippocampal ripples support short-term working memory. The basic finding that hippocampal ripples increase during a 7 s working memory maintenance period is intriguing and previously not shown as far as I know, but a lack of control analyses within the task, across brain regions, or as compared to alternative oscillatory signals makes the overall evidence weak. The authors need to support this signal more thoroughly via several analyses (suggested below) to strengthen their finding. The paper moves on to a hippocampal-cortical ripple coupling analysis that needs further methodological details and corrected statistics to make a meaningful contribution. As is, the ripple coupling results don't seem to necessarily relate to the hippocampal ripples found in the maintenance period, making the manuscript somewhat incoherent and of low impact in its current form.

      Major issues:

      (1) The framing sets up "visual short term memory" (VSTM) and "long term memory" (LTM) as two different things. A long line of research with humans possessing MTL/hippocampus damage shows the hippocampal memory system contributes to working memory only when the task is difficult enough to warrant its recruitment (see Hannula et al. 2006 J. of Neuroscience, Pertzov et al. 2013 Brain, or particularly Jeneson et al. 2012 Learning & Memory and J. of Neuroscience). This theory therefore suggests that the hippocampus contributes to working memory via LTM mechanisms, as opposed to it possessing two different roles (VSTM and LTM). While the authors might disagree with this framing, at a minimum, they should describe this line of work. As is, it's difficult to know how their task fits into this literature since it's a cross between a pattern separation probe (identify repeats from lures), working memory (7 s delays), and subsequent cued associate recognition. Addressing why they used this combination of task features would help frame its place in the literature.

      (2) The basic idea of looking for hippocampal ripples as a marker for working memory maintenance is new, with no prior literature (that I know of in rodents or in the handful of human intracranial ripple papers) to build on. That said, I suspect hippocampal ripples act as a proxy for hippocampal activation, providing a possible explanation for the hippocampal ripple increase shown during the Maintenance period. The effect they show is well supported by the mixed effects modeling (MEM), making it a potentially meaningful finding, but considering the novelty, it's rather important that control analyses rule out alternative possibilities. I suggest two important ones and a third related to the lack of parametric manipulations in the next paragraph. First, the authors frame the paper by suggesting hippocampal ripples share features with beta/gamma burst theories of working memory maintenance. In that case, the obvious question is why use a ripple detector instead of measuring gamma (or beta) activity as in this previous work? Some work has suggested hippocampal ripples act differently than high-frequency activity (see Sakon et al. 2024 J. of Neuroscience), so an analysis contrasting ripples and gamma seems rather important. Second, and relatedly, the authors only compare the hippocampus and lateral temporal cortex (LTC), likely because these tend to be sites with strong coverage in epilepsy cases. That's ok, but typically there is also reasonable coverage in other MTL areas like entorhinal cortex and amygdala, which would serve as important controls to show what they're measuring likely relates to sharp-wave ripples (a hippocampal phenomenon) and not something more generic like gamma or HFA (as shown in Sakon et al. 2024, Howard et al. 2003 Cerebral Cortex, Axmacher et al. 2007 reference 26, Meltzer et al. 2008 Cerebral Cortex, etc.).

      (3) Related to the last point, since there are no parametric manipulations (e.g., different delay durations, different set sizes, varying lure difficulties) there's no way to assess whether hippocampal ripples increase with stronger loads, which would be important for determining the hippocampal dependence of their task in the first place. Do the authors have any justification for this task as an assessment of hippocampal working memory? I could imagine using a top vs. bottom tercile of lure discrimination difficulty (as assessed across all participants or control non-patients) to compare hippocampal activity. This would only apply to trials after the first one in which each pair is used, since only then would the patient be aware of the difficulty of the upcoming comparison. Or maybe something could be done by comparing VSTM performance after splitting patients based on how they performed on the LTM test.

      (4) Also related to the VSTM vs. LTM framing, the authors use an "LTM" cued category recognition task--presumably done at the end of the repeat/lure recognition task--as a way to argue that the hippocampal ripple effects they see relate to VSTM and not LTM. The LTM task is disappointingly underdescribed, where even in the methods (lines 588-592) I cannot figure out when this task was probed, how many trials were done in comparison to the VSTM task, etc. Considering they use the LTM task to support their VSTM interpretation, it's rather crucial to understand precisely what they did. As is, the comparison they do present relies on a statistical error, where they compare p-values (n.b. https://www.nature.com/articles/nn.2886) instead of performing a direct interaction test (lines 177-180). Specifically, if they want to say their signal relates more to VSTM subsequent memory rather than LTM subsequent memory, they need to run a model of the form: ripple_rates ~ remembered + test_type + remembered*test_type (where test_type is either their VSTM or LTM task).
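
      As a minimal sketch of the interaction test requested above: in a saturated 2x2 design with treatment coding and LTM as the reference level, the remembered*test_type coefficient equals a difference of differences between cell means. The function name, the pooled-variance standard error, and the fixed-effects-only treatment are assumptions made for illustration; a real analysis would presumably keep the random effects (e.g., patient and channel) used in the authors' MEMs.

```python
from math import sqrt
from statistics import mean


def interaction_contrast(cells):
    """Difference-of-differences for a 2x2 design.

    `cells` maps (remembered, test_type) -> list of ripple rates, with
    remembered in {0, 1} and test_type in {"VSTM", "LTM"}.
    Returns (contrast, standard_error, t_statistic).
    Simplifying assumption: fixed effects only, equal residual variance.
    """
    m = {key: mean(values) for key, values in cells.items()}

    # Subsequent-memory effect within each test type.
    effect_vstm = m[(1, "VSTM")] - m[(0, "VSTM")]
    effect_ltm = m[(1, "LTM")] - m[(0, "LTM")]

    # The interaction asks whether those two effects differ.
    contrast = effect_vstm - effect_ltm

    # Pooled residual variance across the four cells (4 fitted means).
    n_total = sum(len(values) for values in cells.values())
    ss_resid = sum((x - m[key]) ** 2 for key, values in cells.items() for x in values)
    resid_var = ss_resid / (n_total - 4)

    # Standard error of a +1/-1/-1/+1 contrast of independent cell means.
    se = sqrt(resid_var * sum(1 / len(values) for values in cells.values()))
    return contrast, se, contrast / se
```

      The point of the sketch is that the claim "more related to VSTM than LTM" rests on this single contrast and its uncertainty, not on one simple effect being significant while the other is not.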

      (5) As noted, the increase in hippocampal ripples during maintenance seems substantial, and the MEM confirms a significant increase over time. That said, the presentation of the data is atypical, with an example raster from one channel followed by average time courses of ALL participants below it. Why not show full raster plots for all participants? Ripples are so sparse that all the data in the task can be visualized in a single raster easily. A swarm plot indicating inter-patient variability in the maintenance signal also seems crucial. As is, there is no way to assess how much of the signal depends on a small subset of channels or patients.

      (6) To compare ripple rates across task phases, they average over the bounds of each phase (lines 657-660) and input these into their MEMs. This approach makes sense for quantifying what we see in the ripple plots (Figure 2), except for Encoding, where they average over the entire 3 s window, even though there is clear tuning only from ~0-1 s. Using the tuned region and not the entire window is standard and would be more appropriate for the comparisons to maintenance, retrieval, etc. (e.g., lines 147-148 don't check out when looking at the figure); otherwise you are averaging over a seeming ripple inhibition from 1-2 s. They already perform a cluster-based permutation test, so the window it identifies, or something a bit wider, would be appropriate.
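
      To illustrate the windowing point in comment (6), here is a minimal sketch assuming ripple onset times are stored per trial in seconds relative to image onset. The function name and data layout are hypothetical and not taken from the manuscript.

```python
def mean_rate(trials, start, stop):
    """Mean ripple rate in Hz within [start, stop) across trials.

    `trials` is a list of lists of ripple onset times (s, relative to
    image onset); this layout is an assumption for illustration.
    """
    counts = [sum(start <= t < stop for t in trial) for trial in trials]
    return sum(counts) / (len(trials) * (stop - start))
```

      Comparing `mean_rate(trials, 0.0, 1.0)` with `mean_rate(trials, 0.0, 3.0)` on the same trials shows directly how much an early increase is diluted when a later dip or silent period is averaged in.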

      (7) The authors pivot to a hippocampal-cortical ripple coupling analysis to build the argument that the hippocampal ripples shown in Figure 2 support memory maintenance in the cortex. They use a window of -500 to 500 ms from hippocampal ripples to assess coupling. This is quite wide, since it doesn't seem plausible that a cortical ripple 500 ms from a hippocampal ripple means they synchronize. They cite two papers to justify the analysis, both of which use ±500 ms windows, but for spindle-ripple coupling, not ripple-ripple, so are miscited. Later in the paper, they switch to ±50 ms for another coupling analysis, raising the question of why they used ±500 ms in the previous analysis to begin with. If they want to claim cortical ripples are tuned by hippocampal ripples all the way up to 500 ms away, they should show the rasters (as in Figure 4a) and timecourse ripple rates, but going beyond ±500 ms to show that ripples in the ±50-500 ms range are above, say 500-1000 ms to justify their window selection. I will point out that there IS previous work that used ±500 ms to measure cortical-cortical ripple coupling (Dickey et al 2022 PNAS, which should be cited regardless, as I believe the first hippocampal-cortical ripple paper showing memory effects), although the figures in that paper suggest anything beyond ±250 ms returns to baseline (see Figure 2A-B).
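
      The window check suggested in comment (7) could be sketched as below, assuming ripple times are available as lists of timestamps in seconds. The function name, the brute-force pairing, and the normalization (cortical ripples per hippocampal ripple per second of lag window) are illustrative choices, not the authors' method.

```python
def coupling_rate_by_lag(hpc_times, ctx_times, bands):
    """Cortical ripple rate around hippocampal ripples, per |lag| band.

    hpc_times, ctx_times: ripple times in seconds (assumed layout).
    bands: list of (lo, hi) absolute-lag bounds in seconds, lo <= |lag| < hi.
    Returns {(lo, hi): rate}, where rate is cortical ripples per
    hippocampal ripple per second of lag window (both sides combined).
    """
    rates = {}
    for lo, hi in bands:
        count = 0
        for t in hpc_times:
            for c in ctx_times:
                if lo <= abs(c - t) < hi:
                    count += 1
        # Each band covers (hi - lo) seconds on each side of the ripple.
        window_seconds = 2 * (hi - lo) * len(hpc_times)
        rates[(lo, hi)] = count / window_seconds
    return rates
```

      Calling this with bands such as (0, 0.05), (0.05, 0.5), and (0.5, 1.0) gives rates that are comparable across bands of different widths, so one can ask whether the 50-500 ms range actually sits above the 500-1000 ms range.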

      (8) Lines 239-243 again compare p-values instead of performing an interaction test.

      (9) I don't understand what "Further analysis based on the identified cluster" means (line 271). I see in Figure 5c that their broadband classifier identified a window of optimal decoding, but did they use only activity in this cluster to train the subsequent classifier (Figure 5d)? If so, this is not described in the methods. And if it is done that way, I don't think the logic makes sense. As mentioned in comment 6, the ripples during encoding tune to 0-1 s after image presentation. So it doesn't make sense to use a 1.85-2.25 s window for ripple-locked decoding; they should just be using the 0-1 s window (or whatever their cluster-based permutation test shows in Figure 2b). Otherwise, it would appear they are studying two different phenomena.

      (10) As is, the results in Figure 5d need to be redone. First, the results described on lines 271-275 once again suffer from comparing p-values. They need to run an interaction model if they want to claim Maintenance shows stronger ripple-locked decoding than Encoding (it almost certainly will not, since Encoding appears to show some evidence of decoding (p=0.118)). Second, even if they do change the framing to say Encoding and Maintenance show significant decoding, is it meaningful if Retrieval fails to? If you cannot decode the same information at the time of retrieval as is theoretically being held in working memory during the delay, the coupled ripple reactivation story wouldn't appear to make sense. They do show significant Retrieval decoding in Figure 5a-b, but since I don't really understand how they settled on the "identified cluster" in Figure 5c, I'm not sure what to make of the difference between these decoders.

      (11) Finally, as mentioned in the summary, the analyses in Figures 2-3 seem disjointed from those in Figures 4-5. Part of this has to do with the switch to a broadband classifier, then a switch back to coupled ripples, and then, as I already mentioned, decoding results with time windows that don't align with the hippocampal ripple effects they showed earlier. Further, since the main point of Figures 2-3 is to establish a ramp in hippocampal ripples across maintenance, shouldn't they be trying to show how the decoding changes over the course of the Maintenance period? It would also help the interpretation of Figure 5 to see how the coupled ripples change over time in Figure 4 (as they showed them in Figure 2).

      Minor issues:

      (1) Instead of citing a software package like Emmeans, the statistical test being performed should be explained.

      (2) Decoding % accuracy in the heatmaps in Figure 5 and supplementary would be more intuitive, particularly since Figure 5b uses accuracy anyway.

      (3) Figure 2b is misleading with an unnecessary change in the y-axis for retrieval.

      (4) In Figure 2d, a significant cluster is mentioned, but not drawn onto the figure as in Figure 2b.

    1. Reviewer #1 (Public review):

      Summary:

      This study combines representational similarity analysis (RSA) with 7T layer-specific fMRI and EEG to examine how neural representations in specific cortical layers of EVC and LOC correspond to the temporal dynamics of visual processing. The authors interpret these correspondences as reflecting feedforward and feedback processes, based on their relative timing and their similarity to representations in different layers of a deep neural network (DNN).

      Strengths:

      The combination of RSA with laminar fMRI is a promising approach for dissociating the functional roles and dynamics of different cortical layers within the same functional region, and it holds considerable potential for elucidating computational mechanisms both within and between levels of the visual hierarchy. However, several issues should be addressed before the authors' conclusions can be fully supported.

      Weaknesses:

      (1) The authors report that the representation in the LOC superficial layer resembles EEG-derived neural representations at ~400 ms post-stimulus, and that this similarity is best explained by representations in the higher layers of the DNN. From these two observations, they conclude that activity in the LOC superficial layer is driven by feedback signals. However, neither line of evidence directly dissociates feedforward from feedback contributions.

      Specifically, late-stage representations in LOC could instead reflect the outcome of local recurrent computation, given that the superficial layer also serves as an output layer of the local cortical circuit. Moreover, the correlation with the DNN peaks at higher layers rather than being dominated by them, and feature tuning in higher DNN layers does not necessarily map onto higher-order cortical regions such as PFC.

      While a feedback contribution to the LOC superficial layer is consistent with theoretical predictions and known cortical anatomy, the current evidence is indirect. I would recommend that the authors either tone down this conclusion or, at a minimum, explicitly clarify the strength and limitations of the evidence in the Discussion.

      (2) I could not find information regarding the fMRI slice orientation or whether temporal regions beyond LOC were covered. The reported FOV (192 × 192 mm) seems quite large if only EVC and LOC were targeted. Did the authors acquire data from other object-selective regions in the temporal cortex, and if so, did they analyze these?

      It would strengthen the feedback interpretation considerably if the RDM of the LOC superficial layer could be shown to resemble RDMs from more anterior temporal regions, which would be consistent with feedback originating from higher-order object-processing areas.

      (3) Related to the previous point, LOC is a relatively large region, and based on the figures, it appears that the LOC ROI may contain two subregions. It would be helpful for the authors to show the location and extent of the LOC ROI in example participants.

      If the ROI does indeed span two subregions, do these subregions share the same laminar profile and temporal dynamics?

      (4) The authors report no feedback-related information in EVC, which contrasts with a number of prior fMRI studies that have demonstrated object-related feedback signals in EVC. One plausible explanation for this discrepancy is task relevance: in the present study, participants performed only a fixation color-change task, whereas in previous work they were required to attend to object features or identity (e.g., Morgan et al., 2019, J Neurosci; Kok et al., 2016, Curr Biol; Mohsenzadeh et al., 2018, eLife; Hou et al., 2026, eLife). Task demands on object processing may substantially modulate the strength of feedback signals to EVC, and this possibility warrants discussion.

      (5) A substantial body of work has used specialized paradigms to dissociate feedforward and feedback signals in EVC (e.g., Williams et al., 2008, Nat Neurosci; Fan et al., 2016, PNAS; Hou et al., 2026, eLife). These studies are directly relevant to the current work but are not cited.

      (6) Multidimensional scaling (MDS) visualizations of the RDMs (as in, e.g., Mohsenzadeh et al., 2018) are not included in the manuscript. These visualizations are important for interpreting the representational format across different layers of LOC and EVC, and I would encourage the authors to include them.

    2. Reviewer #2 (Public review):

      Summary:

      Carricarte and colleagues set out to identify and functionally characterize feedforward (FF) and feedback (FB) information flow during object perception in humans, a question that has been difficult to address non-invasively because FF and FB signals overlap rapidly in time and across regions. The authors capitalize on the canonical cortical microcircuit (FF terminations primarily in middle layers, FB terminations primarily in superficial and deep layers) to spatially separate these signals using sub-millimeter (0.9 mm isotropic) GE-BOLD fMRI at 7T in early visual cortex (EVC) and lateral occipital complex (LOC). They combine these layer-resolved fMRI patterns with millisecond-resolution EEG (from a previously published dataset using the same 24 images) via representational similarity analysis-based EEG-fMRI fusion, and use a Vision Transformer (DeiT) trained on ImageNet to characterize the feature complexity of the resulting spatiotemporal signatures.

      The authors first review their approach at the macroscale, replicating the expected EVC-then-LOC temporal hierarchy and the EVC-low/LOC-high feature complexity gradient. They then apply the same framework at the mesoscale of cortical layers, reporting: (a) early middle-layer signals in both EVC (~100 ms) and LOC (~160 ms) consistent with FF processing; (b) a later superficial-layer signal in LOC (~400 ms) interpreted as FB; (c) a layer-uniform feature-complexity profile in EVC (peaking at low-mid DNN layers across all depths); and (d) a feature-complexity dissociation in LOC, where middle-layer signals correspond to mid-to-high DNN layers and superficial-layer signals to high DNN layers. They argue that this complexity shift, combined with the timing difference, indicates interareal FB into LOC.

      Strengths:

      (1) The combination of layer-fMRI at 7T, EEG, and DNN-based representational analysis is well motivated through RSA. Each modality compensates for a known limitation of the others (fMRI: poor temporal resolution; EEG: poor spatial resolution; DNN: surrogate for representational format), and the RSA framework provides a principled common currency. Relatedly, the two-step macroscale-then-mesoscale design, in which the macroscale fusion replicates established findings before the same approach is applied at the layer level, is a sound and welcome scientific strategy that strengthens confidence in the combined-modality inferences.

      (2) The authors include multiple complementary controls: partialing out lower layers to mitigate vascular draining, voxel-count matching across layers, an alternative DNN (AlexNet), an alternative time-window definition based on between-layer differences, and time-resolved commonality analyses. The convergence across these analyses is reassuring.

      (3) Methodological transparency: The authors are forthright about partial-volume effects, foveal-confluence aggregation, and the indirect nature of the temporal estimates derived from EEG-fMRI fusion.

      Weaknesses:

      The central interpretive claim, that the late (~400 ms), superficial-layer LOC signal indexes interareal feedback that increases representational complexity, is intriguing, but in my view it is not yet fully supported by the evidence presented, for the following reasons.

      (1) Eye movements as a possible confound for late signals. Stimuli were presented for 1 second, and fixation was enforced only behaviorally via a color-change task on a central cross. No eye-tracking is reported for either the fMRI or EEG datasets. While this approach is not uncommon, the absence of gaze monitoring introduces ambiguity when the goal is to decouple feedforward and feedback contributions at fine temporal resolution in EEG recordings. Under these conditions, multiple image-driven saccades within a trial are plausible, and saccade patterns are likely to be systematically image-specific, given the small (n = 24) and heterogeneous naturalistic stimulus set. Critically, the temporal window over which RDM correlations are interpreted as feedback coincides with the period during which observers typically make 2-4 fixations (average fixation durations of ~250-330 ms; Rayner, 1998; Henderson, 2003), meaning the late EEG-fMRI fusion peaks fall in a window where image-locked saccadic activity and successive foveation-driven feedforward responses would be expected to accumulate. Late peaks could therefore reflect cumulative feedforward responses across successive foveations rather than top-down feedback. The manuscript would be strengthened by providing eye-tracking data (if available), control analyses leveraging post-hoc indicators, or a discussion citing prior evidence that EEG/fMRI response profiles in this paradigm are robust to such eye movements.

      (2) Decoding accuracy along the visual hierarchy raises questions about whether LOC is adequately engaged. Pairwise decoding accuracy is substantially higher in EVC than in LOC (Figure 1D), and the noise ceiling for LOC RDMs is markedly lower than for EVC across all layers (Supplementary Figure 4D-F). This pattern inverts the canonical hierarchical gradient of progressively stronger object decoding along the ventral visual stream, as well as the analogous gradient observed in DNN late layers that underlies the commonality analyses. As written, it is unclear how the manuscript reconciles this with its emphasis on LOC's role in higher-order, feedback-modulated representations with greater tolerance or increased complexity, unless decoding accuracies should be understood as image-level discrimination rather than at the level of object-category discrimination. A parsimonious alternative is that the 24-image set is too small or too coarse to reveal category-level representations in LOC robustly, such that LOC RDMs may be driven by lower-level or background/contextual variance and noise. This concern has direct bearing on the mesoscale commonality analyses supporting the "feedback transmits high-complexity features" conclusion. I would encourage the authors to (a) report split-half reliability of LOC RDMs alongside the commonality analyses, and either (b) acknowledge that the feature-complexity inferences are conditional on LOC RDMs faithfully capturing object structure rather than residual contextual/low-level variance, or (c) discuss how replication with a richer stimulus set might bear on the feedback-content interpretation.
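
      One way the split-half reliability in suggestion (a) could be computed is sketched below, assuming each half of the data yields a square, symmetric RDM stored as nested lists. The use of Spearman correlation on the upper triangle is a common RSA convention assumed here, not something stated in the manuscript, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def upper_triangle(rdm):
    """Off-diagonal upper-triangle entries of a square RDM, row by row."""
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def split_half_reliability(rdm_a, rdm_b):
    """Spearman correlation between the two halves' RDM upper triangles."""
    return pearson(ranks(upper_triangle(rdm_a)), ranks(upper_triangle(rdm_b)))
```

      Reporting this value per layer and per ROI, next to the commonality coefficients, would let readers judge how much of the LOC structure is reproducible in the first place.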

      (3) The interareal feedback interpretation could be more robustly defended against intra-areal alternatives. In EVC, the authors carefully consider non-feedback explanations for layer-specific dynamics, including lateral connections modulating gain and superficial GE-BOLD bias, and conclude these are sufficient. The same skepticism is not extended to LOC, where the corresponding superficial-layer signal is interpreted as interareal feedback, with speculative sourcing to DLPFC. Slow (unmyelinated) horizontal/lateral propagation in superficial cortical layers (e.g., Davis et al., 2024) can, in principle, produce delayed superficial-layer signals on the timescale observed here without any interareal contribution. This asymmetry is compounded by the treatment of the absence of sustained EVC activity following the middle-layer peak, which is dismissed as a "limitation of the spatial and temporal sensitivity of our measurements" (lines 388-390). If feedback to EVC truly cannot be resolved with this method, the corresponding feedback claim in LOC, imaged with the same protocol, warrants comparable caution. The manuscript would benefit from either presenting positive evidence that distinguishes interareal feedback from intra-areal recurrence (e.g., frequency-band signatures, source-resolved EEG, or coupling with frontal regions), or qualifying the conclusion to "delayed superficial-layer activity consistent with either interareal feedback or intra-areal recurrence."

      (4) The predictive coding framing is invoked but not well-grounded. The Discussion (lines 349-357) includes a theoretical implication of predictive coding. Predictive coding makes content-specific claims: feedback carries predictions, and feedforward carries error signals relative to those predictions. Dissociating these requires manipulations of expectation, congruence, or predictability, none of which are present in the current design. The observed layer-wise timing differences do not provide evidence for rejecting non-predictive accounts. I would suggest either removing this framing or explicitly noting that the present data neither support nor refute predictive coding.

    3. Reviewer #3 (Public review):

      Summary.

      Carricarte and colleagues use 0.9mm 7T fMRI in EVC and LOC, fused with previously collected EEG using the same stimulus set, in order to dissect feedforward and feedback contributions to human object processing through their layer-specific termination patterns. They report a feedforward signal in middle layers of EVC (~100ms) and LOC (~160ms), and a later signal in superficial LOC (~400ms) that they interpret as interareal feedback. Using commonality analysis with a Vision Transformer, they argue that this late signal carries higher-complexity features than the earlier signal, and conclude that feedback actively increases representational complexity in LOC.

      Strengths.

      The empirical work is methodologically ambitious. Sub-millimeter 7T coverage of both EVC and LOC, combined with layer-resolved EEG-fMRI fusion, represents a substantial technical achievement. The authors first reproduce established macroscale EEG-fMRI fusion patterns at 7T before extending the approach to the layer level. The figures throughout are beautifully designed and convey complex analyses with clarity. The empirical core of the paper - that LOC contains layer-distinct dynamics at distinct times, with the late signal carrying representational structure that differs in some way from the early signal - is supported by the data, though with caveats imposed by the LOC noise ceiling.

      Weaknesses.

      The authors' interpretation of these data (interareal feedback that reflects feature-complexity, related to the functional role of these signals) is not adequately supported and requires either reframing or substantial additional evidence.

      Feedback vs. recurrence. The late superficial-LOC signal is interpreted as interareal feedback, but the data are equally consistent with within-area recurrence, lateral connections, or sustained feedforward dynamics. A reader expecting evidence of higher-area signals returning to early-time middle layers - a signature of interareal feedback - finds none in either region.

      "Functional role" overclaim. The paper repeatedly claims to characterize the "functional role" of feedforward and feedback, but contains no behavioral linkage, no perturbation, and no analysis relating signals to perceptual outcomes; the fMRI task is explicitly orthogonal to object processing. What is demonstrated is spatiotemporal dynamics and representational format - both valuable, neither equivalent to functional role.

      DNN analysis. The DNN analyses use several non-standard modeling choices that introduce more uncertainty than clarity. In the main analyses, the authors only use four sampling points from a single model (DeiT-small): transformer blocks 1, 7, and 12, plus the classification head. Then, the authors make their headline claims about complexity by comparing block 12 and the classification head; within the model, this is a distinction between an embedding layer and a supervised category readout, not a feature-complexity gradient. As such, the authors' interpretation conflates semantic layers with representational "complexity." A more convincing use of this modeling strategy would be to demonstrate these effects in multiple models that might disentangle these factors (e.g., supervised (ResNet/ViT), self-supervised (DINOv2), and vision-language (CLIP) models), then to visualize these brain-model relationships across all layers. Alternatively, there are many suitable model-free analyses that could demonstrate the unique representational information within LOC without introducing any model-related concerns.

      Reliability of LOC layer-resolved RDMs. The lower-bound noise ceiling for LOC mesoscale RDMs is approximately 0.05 across layers, with deep-LOC reliability essentially at zero. The central layer-resolved dissociation rests on RDMs that individual subjects barely reproduce; consequently, the deep LOC layer is dropped from the commonality analysis (Figure 4C shows only middle/superficial layers, while Figure 4B shows all three for EVC) because the data cannot support it. This is not damning, but it is consequential, and not sufficiently addressed in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      Hüppe and colleagues characterized the network of neurons in the central nervous system of Antarctic krill that contained pigment-dispersing hormone (PDH), an important output factor in the circadian clock of insects. These neurons in the brain are putative clock neurons since a subset also expressed the clock genes period and cryptochrome 2. As one of the ocean's major contributors to biomass, krill is an ecologically important marine species that experiences challenging daily and seasonal environmental fluctuations in its high-latitude habitat. A comprehensive study of krill's internal clock may help to understand the extent of its resilience to the rapidly changing climate.

      The authors used antibody staining against PDH across the whole central nervous system and additional in situ hybridization for cry2 and per mRNA, with a focus on the supraesophageal ganglion. There, they identified the major neuropils in the eye stalks and central brain of Antarctic krill. The resulting staining pattern aligns with the identified circadian clock network in insects and PDH-expressing networks in other crustaceans, making these neurons highly likely candidates for krill clock neurons.

      Strengths:

      (1) This study provides the first clues about the circadian clock architecture in a non-model organism in chronobiology, Antarctic krill, with a clear 3D reconstruction of the putative clock network.

      (2) The authors effectively place their results within the extensive body of literature on arthropod circadian clock networks to argue that the neurons they describe are likely the circadian clock in krill.

      Weaknesses:

      (1) The data presented here are not sufficient to support the claim that the described network is the circadian clock because functional evidence is missing.

      (2) Additionally, the study falls short of identifying any elements of the positive limb of the canonical circadian clock transcriptional-translational feedback loop, e.g., clk or cyc, in the PDH-expressing neurons.

      (3) No sample sizes are reported, making it difficult for readers to assess the generalizability of the presented data.

    2. Reviewer #2 (Public review):

      Summary:

      This study advances our understanding of the neuronal basis of the circadian clock in pancrustaceans. It extends our knowledge on the pigment-dispersing hormone system and provides links to information on the expression of core clock components, cryptochrome 2, and period. The data are sound and well-documented.

      Comments:

      The neuronal components of the arthropod circadian clock system have been analysed extensively in insects. Much less information on this system is available for malacostracan crustaceans. However, considering that malacostracan crustaceans and insects go back to a common pancrustacean ancestor and considering that we know that the brain architecture in these two groups shares many commonalities (see, e.g., extensive reviews by N. J. Strausfeld), we have to expect that crustaceans and insects share many of the characteristics of the circadian system. This is the case, e.g., for the network of pigment-dispersing hormone-positive neurons. The authors cite these studies, although late in the paper (discussion, line 339ff), and I suggest moving this info into the introduction: "339 ff: The arborization pattern of the PDH-network has been described in various malacostracan crustaceans, including Carcinus maenas (Alexander et al., 2020; Mangerich & Keller, 1988; Mangerich et al., 1987), Cancer productus (Hsu et al., 2008), Orconectes limosus (de Kleijn et al., 1993; Mangerich & Keller, 1988; Mangerich et al., 1987), Homarus americanus (Harzsch et al., 2009), Cherax destructor, Procambarus clarkii (Sullivan et al., 2009), and Procambarus virginalis (Luna et al., 2010)."

      The strength of this paper is that it extends our knowledge on the PDH system and brings together neuroanatomical information on PDH-positive neurons with information on the expression of core clock components, cryptochrome 2, and period. That way, it advances our understanding of the neuronal basis of the circadian clock in pancrustaceans. The data are sound and well documented, and the authors are to be applauded for the superb dissection presented in Figure 1.

      Below, please find some essential suggestions on how to further improve the paper.

      (1) Framing of the study:

      I know that krill is a key element of the Southern Ocean's food webs, but my sense is that discussing the current findings in a context of resilience of this species to global ocean change means largely overselling this study:

      - Lines 47, 48: "and the resilience of this key species in a rapidly changing Southern Ocean."

      - Lines 70 ff: "Hence, understanding the mechanisms of adaptation, including biological clocks, is crucial for predicting how species, populations, and whole ecosystems will respond to climate change."

      - 154 ff: "The Southern Ocean environment experiences rapid change (Abram et al., 2025; Meredith et al., 2019; Thomalla et al., 2023). To assess krill's resilience to environmental changes, understanding the mechanisms that govern daily and seasonal timing in krill is essential."

      - 325 ff: "The rhythmic adaptation of krill to its high-latitude environment is key to its success in the Southern Ocean, which in turn represents a cornerstone for the well-being of the whole krill centred ecosystem. To predict krill's resilience to rapid environmental changes, it is essential to understand the mechanisms that govern daily and seasonal timing in krill."

      - 597 ff: "A detailed mechanistic understanding of the flexibility of clock-based processes is therefore essential to predict krill resilience in a changing Southern Ocean."

      My understanding is that duration of day length is one of the most predictable environmental drivers, and - despite the seasonal changes of day length - nevertheless a very stable one compared to fluctuations of environmental drivers such as temperature or salinity (see, e.g. this recent review on environmental driver fluctuations on nervous system functioning in crustaceans: Stein W, Harzsch S (2021) The Neurobiology of Ocean Change - insights from decapod crustaceans. Zoology: 125887. https://www.sciencedirect.com/science/article/pii/S094420062030146X).

      I do not see how global ocean change may significantly change day length, and what this study has to do with understanding this species' resilience against ocean change. I suggest that you explain in more detail why day length will change in the future or strongly tone down this aspect. Statements such as Line 76 ff: "Due to their disproportionate importance for ecosystem function, understanding the resilience of ecological key species is essential in assessing the fate of ecosystems in the future." are completely out of focus here and, again, oversell the current study.

      (2) Uncited essential studies of crustacean neuroanatomy, missing connection to contemporary crustacean neurobiology:

      - Line 157: "despite the ecological importance of E. superba, only very little is known about its neurobiology".

      - Line 329: "However, so far, little was known about the neurobiology of krill in general."

      I agree that this species' brain is understudied, but this makes it even more important to cite the little information that IS available. Please consider this essential reading for any crustacean neurobiologist: "Sandeman, D.C., Scholtz, G., Sandeman, R.E., 1993. Brain evolution in decapod crustacea. J. Exp. Zool. 265, 112-133." to find information on the basic brain anatomy in E. superba.

      The manuscript in many places seems to reinvent the wheel and raises the impression that our knowledge of crustacean brain morphology is close to zero. The authors in places seem to operate in a vacuum, and I find it disturbing that in a study on the crustacean brain, very few references are provided to studies on crustacean brain anatomy, such as the following essential book chapter: "Schmidt, M., 2016. Malacostraca. In: Schmidt-Rhaesa, A., Harzsch, S., Purschke, G. (Eds.), Structure & Evolution of Invertebrate Nervous Systems. Oxford University Press, Oxford, pp. 529-582. https://www.researchgate.net/publication/315366157"

      In terms of brain anatomy, I would like to know if the authors have a hypothesis on whether and how their target species' brain structure may be similar or different to the brains of other "shrimps" as described, e. g., in the following studies. If so, please elaborate in the introduction:

      Krieger J, Hörnig MK, Sandeman RE, Sandeman DC, Harzsch S (2020), Masters of communication: The brain of the banded cleaner shrimp Stenopus hispidus (Olivier, 1811) with an emphasis on sensory processing areas. Journal of Comparative Neurology 528(9): 1561-1587.

      Meth R, Wittfoth C, Harzsch S (2017) Brain architecture of the Pacific White Shrimp Penaeus vannamei Boone, 1931 (Malacostraca, Dendrobranchiata): correspondence of brain structure and sensory input? Cell and Tissue Research 369(2): 255-271.

      (3) Lacking rigor and command of crustacean brain nomenclature

      I suggest that for their brain nomenclature, the authors should rigorously stick to that laid out by Sandeman et al. 1992 (not yet cited in the ms): Sandeman, D.C., Sandeman, R.E., Derby, C.D., Schmidt, M., 1992. Morphology of the brain of crayfish, crabs, and spiny lobsters: a common nomenclature for homologous structures. Biol. Bull. 183, 304-326.

      More specifically, in lines 41, 163, 199, 204, 207, and throughout the paper, the authors use the terms "Optic lobes" or "optic lobe neuropils". To the best of my knowledge, "optic lobe" is not a term used in crustacean neuroanatomy at all (as opposed to insects). Lamina, medulla, and lobula are collectively referred to as "visual neuropils" (see Krieger, J., Hörnig, M. K., Sandeman, R. E., Sandeman, D. C., & Harzsch, S. (2020). Masters of communication: The brain of the banded cleaner shrimp Stenopus hispidus (Olivier, 1811) with an emphasis on sensory processing areas. Journal of Comparative Neurology, 528(9), 1561-1587. https://doi.org/10.1002/CNE.24831). The medulla terminalis and mushroom bodies are referred to as "lateral protocerebrum". All afore-mentioned neuropils are summarized as "eyestalk neuropils" (compare nomenclature in Schmidt 2016 as referenced above).

      Line 170, 172, 175 ff, and Figure 1. "abdomen", "abdominal ganglia": Contra the book chapter by Siegel 2016 "Introducing Antarctic Krill Euphausia superba Dana, 1850", his Fig. 1.2, the "tail" of crustaceans in most books on crustacean anatomy is not called "abdomen" but instead "pleon"; hence the name "pleopods" for the appendages of the pleon (instead of "abdomipods"). What is more, I suggest using the terms "pleon ganglia" instead of "abdominal ganglia", following the terminology suggested in "Harzsch S, Sandeman D, Chaigneau J (2012) Morphology and development of the central nervous system. In: Forest J and von Vaupel Klein JC (Eds.). Treatise on Zoology - Anatomy, Taxonomy, Biology. The Crustacea Vol. 3. Brill, Leiden pp. 9-236."

      Line 174: "thoracic ganglia". In Figure 1, there is a labelling mistake as these ganglia are named "thoracaic ganglia".

      Line 176, and throughout the paper: "supraesophageal ganglion". Following the standard nomenclature for crustaceans (see, e. g., Schmidt, M., 2016. Malacostraca. In: Schmidt-Rhaesa, A., Harzsch, S., Purschke, G. (Eds.), Structure & Evolution of Invertebrate Nervous Systems. Oxford University Press, Oxford, pp. 529-582. https://www.researchgate.net/publication/315366157), this structure (as in insects) is typically called a "brain". For terminology, also consult the following nomenclature paper: "Richter, S., Loesel, R., Purschke, G., Schmidt-Rhaesa, A., Scholtz, G., Stach, T., Vogt, L., Wanninger, A., Brenneis, G., Döring, C., Faller, S., Fritsch, M., Grobe, P., Heuer, C. M., Kaul, S., Møller, O. S., Müller, C. H. G., Rieger, V., Rothe, B. H., Stegner, M., Harzsch, S. (2010). Invertebrate neurophylogeny: Suggested terms and definitions for a neuroanatomical glossary. Frontiers in Zoology, 7. https://doi.org/10.1186/1742-9994-7-29".

      Line 212, and throughout the paper - hemiellipsoid body: please refer to Harzsch & Krieger 2021 and the numerous studies by Strausfeld on crustaceans cited therein. Strausfeld has provided compelling evidence that the crustacean hemiellipsoid body is equivalent to the insect mushroom body, so this term should be replaced. Harzsch, S., & Krieger, J. (2021). Genealogical relationships of mushroom bodies, hemiellipsoid bodies, and their afferent pathways in the brains of Pancrustacea: Recent progress and open questions. Arthropod Structure & Development, 65, 101100. https://doi.org/10.1016/J.ASD.2021.101100.

      Legend, figure 2, and others, and throughout the paper: "The olfactory neuropiles comprise the lateral antennal neuropile (LAN, ochre), the olfactory lobes (OL, yellow), and the antennal neuropile (AnN, green)." This is a strange terminological mix that you should urgently revise according to the standard terminology by Sandeman et al. 1992 (as referenced above). The LAN is the lateral antenna 1 neuropil. The AnN is the antenna 2 neuropil. The AnN is NOT deutocerebral but tritocerebral.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Scheib et al. identify distinct calcium dynamics in the somata and tuft dendrites of layer 5 pyramidal cells in mice performing a licking task. Animals are trained to lick water ports on the left or right following an acoustic cue, and can adjust their targeting when the ports are displaced. For tongue premotor cortical neurons projecting to the ventromedial thalamus, calcium transients in tuft dendrites are tightly locked to the direction-instructive cue, while somatic calcium signals are more broadly dispersed and more frequently synchronized with tongue motion and port contact. Finally, when the targets are shifted, tufts exhibit a sparse but large corrective signal on an improperly-targeted first lick, and the changes in population activity in the tufts and somata differ after adaptation to the new port locations.

      Strengths:

      In my opinion, this is a very strong manuscript which reports several novel and significant observations, contains high-quality data and (for the most part) reasonable analyses, and is clear and well-written. Most prior studies of cortical sensorimotor processing have measured the output of neurons using extracellular recording - an approach which obscures potentially important signaling differences between neuronal compartments. This study leverages cutting-edge imaging techniques in mice to document large, time-dependent differences between calcium signals at cortical somata and tuft dendrites. This phenomenon could have major implications at the cellular level for synaptic plasticity, and at the systems and behavioral levels for motor adaptation. As described below, I have only one major technical concern (which should be addressable with additional analysis), along with several relatively minor suggestions for improving the manuscript.

      Weaknesses:

      At a conceptual level, the authors may wish to elaborate a bit on what sensorimotor computation they think the circuit is implementing, and how their results help explain this implementation. Several possibilities are raised: tuft activation could "prime" the pyramidal cells in advance of movement initiation (line 319ff), or could track errors to engage plasticity (line 351ff) and solve the credit assignment problem (line 362ff). It might be helpful to make one of these proposals more concrete with a computational model, but this is not strictly necessary.

      My only major technical concern relates to the analyses in Figures 4F-H, 5G-I, and 6H-K (cf. equations 2-5). Typically, one identifies population-level factors by projecting neural activity onto fixed dimensions of interest; this makes it possible to see how activity evolves over time along interpretable coordinates. Here, however, the coding directions are redefined at each time point, so the "choice" activity at time t is actually a different signal from the "choice" activity at t+1. This procedure is a bit like comparing the activity of one neuron at one time point with the activity of a different neuron at a later time point. It also makes the physiological interpretation more complicated: if the dimensions are fixed, one can see how a downstream neuron could "read out" the signal by computing a weighted sum of the activity of upstream neurons, but it is harder to see how this could happen if the weights are always rotating.
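
      To make the distinction concrete, here is a minimal sketch on simulated data. It does not reproduce the authors' equations 2-5; the array shapes, the injected "choice" signal, and all variable names are assumptions made for illustration. It contrasts a projection onto one fixed coding direction with a projection onto a direction that is re-estimated at every time bin.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: trials x neurons x time bins for two trial types.
n_trials, n_neurons, n_time = 40, 30, 20
left = rng.normal(size=(n_trials, n_neurons, n_time))
right = rng.normal(size=(n_trials, n_neurons, n_time))

# Inject a choice signal along one fixed population axis in the later bins.
axis = rng.normal(size=n_neurons)
axis /= np.linalg.norm(axis)
right[:, :, 10:] += 2.0 * axis[None, :, None]


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


# Trial-averaged difference between conditions: neurons x time.
diff = right.mean(axis=0) - left.mean(axis=0)

# (a) Fixed coding direction: one weight vector from a chosen window,
#     applied unchanged to every time bin.
cd_fixed = unit(diff[:, 10:].mean(axis=1))
proj_fixed = cd_fixed @ diff  # shape: (n_time,)

# (b) Time-varying coding direction: a new weight vector at each bin.
cd_t = np.stack([unit(diff[:, t]) for t in range(n_time)], axis=1)
proj_varying = np.einsum("nt,nt->t", cd_t, diff)

# Cosine similarity between consecutive time-varying directions:
# values far from 1 mean the readout weights rotate from bin to bin.
rotation = np.einsum("nt,nt->t", cd_t[:, :-1], cd_t[:, 1:])
```

      In this sketch the time-varying projection equals the norm of the condition difference at each bin, so it is positive even in bins that contain only noise, whereas the fixed projection stays near zero until the injected signal appears. A real analysis would additionally estimate the direction and the projection on separate trials.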

      A few comments on the behavioral task and results. After the port shift, the error rate is quite high, and doesn't diminish much between the early and late epochs (approximately 42% and 38% error rate, respectively; Figure 1I). That is, mice do not seem to fully master the task. Clearly, animals do alter their aim, but even this does not seem to change much between early and late periods (Figure 1J). I recommend that the authors show the behavioral data at a finer level of granularity (e.g., by plotting the change in exit trajectory on all individual trials across sessions, with a loess fit) to allow an assessment of the adaptation rate and when adaptation saturates. It would also be more conventional to refer to the behavioral changes as "motor adaptation," instead of "skill learning." (The latter would be appropriate if the port offset were randomized across trials, and animals received two separate cues for direction and offset, but I suspect this task would be too difficult for mice to learn.)

      This is perhaps a semantic point, but it might not be entirely accurate to refer to the activity evoked by the directional cue as "sensory." Typically, a "sensory" response should encode some feature of a stimulus - in this case, the frequency of a tone. Here, it seems likely that the cue-aligned activity reflects the instructed lick direction, rather than the auditory information per se. (Presumably, these premotor neurons do not have well-behaved auditory tuning curves.) By comparison, in macaques performing center-out reach tasks, activity in dorsal premotor cortex rapidly ramps up following a visual cue instructing the direction of an upcoming reach, but one usually wouldn't refer to this activity as "visual" or "sensory" (though this is sometimes done). I suggest the authors either use "Instruction" or similar (e.g., in Figure 4F), or clarify in the text whether they think the activity is a genuine auditory response or something else.

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to compare functional encoding in the tuft dendrites and somata of a specific cortical cell type during motor planning and learning.

      Strengths:

      The investigation of a specific projection type (L5 ET) is a strength that aids reproducibility and interpretation. The elegant approach to increasing the depth of field of dendritic imaging is another strength. The data analyses are largely clear in their methods, scope, and interpretation. The writing is extremely clear and appropriately referenced, with an excellent Introduction, in particular.

      Weaknesses:

      It is not obvious whether the selected labeling strategy avoids labeling Layer 6 CT neurons, which would contaminate dendritic recordings. The images provided suggest enrichment in L5, but a discussion of this important potential caveat is warranted, especially since within-cell comparisons of apical dendrites to somata were not performed.

      The application of DeepInterpolation to dendritic data appears to be novel, and little detail or vetting is provided. The reader is left guessing: Was the model retrained or fine-tuned on dendritic data? How does the denoising affect the resulting segmentation and activity traces? Is denoising necessary for this workflow?

      The activity patterns of the recorded cells appear to lack the characteristic ramping during the delay epoch previously reported in both calcium imaging and electrophysiology studies. Given that a major contribution to the significance of the work is to constrain models of ALM function, a discussion of how the data aligns with previous measurements in the same circuit would improve the work.

      It would be very informative to compare differences in signals between dendrites and somata of the same cells. Consistently tracing dendrites to their respective somata would assuage worries of potential contamination from dendrites of deeper cells and enable more direct comparisons of signal transformations between dendrites and somata. It would be good to understand the relationship between dendritic calcium signals and backpropagating action potentials in this task. The authors detect less frequent calcium events in tufts versus somata; is this due to selective backpropagation of action potentials? The dynamics of this process were recently investigated by Adam Cohen's group in vivo and in vitro, and measurements in the present settings could be compared to such work.

      The Coding Direction analyses presented in this work, while consistent with previous literature on population codes in ALM, are at odds with the nature of the measurements here. The changes in representation that occur between the dendrites and soma of an individual cell are probably best thought of in terms of the dynamics of signals themselves within individual neurons, rather than in the information encoded across a population.

      This work is largely observational, describing signals that might reflect computational transformations and/or instruct plasticity, but those possibilities have not yet been deeply investigated. The manuscript does a good job of laying out these as future directions.

    3. Reviewer #3 (Public review):

      Summary:

      This article by Scheib et al. investigates how layer 5 extratelencephalic (ET) neurons in the frontal cortex encode sensorimotor information during motor learning, focusing on differences between their apical tuft dendrites and somas. The authors alternated recordings among these ET neuronal compartments in the mouse anterior lateral motor cortex (ALM) during a cued directional licking task with a target port shift. They found that while tuft dendrites predominantly encode sensory cues, with a subset selectively active during corrective actions, somatic activity was more strongly associated with action timing. Additionally, learning induced divergent plasticity: tuft dendrites increased their selectivity but decreased response gain, maintaining stable net selectivity, whereas somas showed increased net selectivity early in learning. Together, these findings reveal distinct sensorimotor representations and learning-related plasticity in dendritic and somatic compartments, providing insight into how compartment-specific activity in the frontal cortex may contribute to motor skill acquisition.

      Strengths:

      The authors developed an innovative imaging approach and a comprehensive data analysis pipeline to address a knowledge gap in the literature. By alternating imaging of dendritic tufts and somas in the same animals, they compare compartment-specific activity during motor learning and identify distinct encoding of task variables and learning-related plasticity across these compartments. Interestingly, a subset of dendritic tufts shows activity associated with corrective actions. The findings are discussed in the context of current theories of dendritic computation, credit assignment, and motor learning, providing a useful foundation for future mechanistic studies.

      Weaknesses:

      No major weaknesses were identified.

    1. Reviewer #1 (Public review):

      Summary:

      The non-social task was a classic risky decision-making task with a binary choice between an option with a sure gain and a risky option with a probabilistic gain or loss. In the social task, the sure option was an individual gain (as in the non-social option) and the probabilities in the risky option, which were shown to participants, were framed as probabilities of other previous participants (i.e., "partners") to cooperate or not; a probabilistic gain (when the partner cooperated) also led to a gain of the partner, while a probabilistic loss meant that the partner would receive the amount lost by the participant. This loss was framed as "betrayal." The authors show differences in how probabilities and amounts (of gains/losses) affected choices, RTs, and ERPs (P3 and LPP).

      Strengths:

      Since participants faced decisions with the same individual payoffs in a non-social and a social condition, this setup made it possible to use identical standard analyses for choices, RTs, and ERPs as well as (almost) identical economic models for the two conditions.

      Weaknesses:

      (1) The task does not include many components that are usually considered central for cooperation or "betrayal" and this is not discussed appropriately. At the same time, the "emotional aspects" of the operationalized "betrayal" are not directly assessed.

      a) The standard economic game for cooperation is the prisoner's dilemma, in which participants make independent choices at the same time without getting any explicit information on the cooperation probability of their partner before they make their decisions. Furthermore, most of the time the interactions are repeated. The trust game, another frequently used economic game, also includes a back and forth of transfers between the partners. So, here, I am not so convinced by the operationalization of a low cooperation probability, which is shown before the decision, as "betrayal." The authors should motivate and explain their rationale more clearly in reference to such other tasks.

      b) The setup of the task, especially the fake interaction with the fake partners, should be made clearer in the main text (before reporting the results). I would argue for including the task picture in the main text.

      c) In general, I am in favour of taking participants' choice behaviour as the main outcome measure. But given the strong implications of "emotional costs" made by the authors, I would have expected some ratings of "betrayal" on a trial-by-trial basis. I would at least include this as a shortcoming.

      d) Also, given the framing of the study, I would have expected some exploratory analyses regarding individual differences with respect to, e.g., social value orientation, etc. I would at least include this as an outlook.

      (2) The standard statistical analyses could be improved.

      a) It is good that the authors have rather long sections using standard regression analyses. But they are a bit lengthy, and the modelling should be more prominent.

      b) In a couple of places, the authors say something like "this is significant, but that is not." Here, it has been made very clear that the interaction term needs to be looked at. As far as I can see, this has not always been done.

      c) For this binary choice, the difference in expected value (EV) between the sure and the risky options is one crucial comparison. But the authors never take that into account. The sign of this difference does not depend on the amount, which the authors dub "principal." That is, the sure option simply has an EV of x, i.e., the amount. The risky option has EV = p⋅2x + (1-p)⋅0.5x, with p being the probability of gain/cooperation. Setting the two equal gives 1.5p + 0.5 = 1, so the two options have the same EV at p=1/3, independent of x. This should be made clear.

      d) Relatedly, RTs should depend on the differences in EV (and not so much on p or on x per se). This can be seen by the more or less quadratic relationship between p and RTs (Fig 1A), with a peak around a p of 1/3.

      e) RTs are often log-transformed. It should be briefly mentioned why this was not done here.
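
      The expected-value argument in points c) and d) can be checked with a short sketch. The payoff structure (sure amount x; risky option paying 2x with probability p and 0.5x otherwise) is taken from point c); the function names and the example amounts are illustrative assumptions.

```python
from fractions import Fraction


def ev_sure(x):
    """Expected value of the sure option: the amount itself."""
    return x


def ev_risky(p, x):
    """Expected value of the risky option: 2x with probability p, else 0.5x."""
    return p * 2 * x + (1 - p) * Fraction(1, 2) * x


def ev_difference(p, x):
    """Risky minus sure; algebraically x * (1.5 * p - 0.5)."""
    return ev_risky(p, x) - ev_sure(x)


# The indifference point is p = 1/3 for any amount x.
p_star = Fraction(1, 3)
indifferent = all(ev_difference(p_star, x) == 0 for x in (10, 20, 50))

# The magnitude of the difference scales with x, but its sign depends only on p.
below = ev_difference(Fraction(1, 5), 20)  # p < 1/3: sure option has higher EV
above = ev_difference(Fraction(1, 2), 20)  # p > 1/3: risky option has higher EV
```

      Exact fractions are used so that the equality at p=1/3 is not blurred by floating-point rounding.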

      (3) The modelling evidence is relatively weak. This is my main point.

      a) (Cumulative) prospect theory should be introduced.

      b) The models seem overly complicated with many free parameters. I would have expected some simpler versions and more comparisons between models that differ in just one parameter.

      - e.g., it is really nice that the authors used a probability weighting function. BTW: Please describe this more clearly in the introduction and in the results. But for this limited range of probabilities, this might be too much.

      - e.g., why directly assume two different exponents in the utility function for gains and losses, and in addition a loss aversion parameter lambda? Only lambda would be a better starting point here.

      c) The differences in AIC (Figure 2A) seem rather minuscule, and the distribution of winning models is not very peaked. I am not convinced that Model 3 is the winning model.

      d) Crucially, and related to the previous points, judging from Fig 2C, the "betrayal" parameter kappa seems to be zero for about half of the participants. The authors should look into this.

      - Would a model just like model 3 but without kappa (i.e., kappa set to zero) perform better? Is this just model 2?

      - How is kappa set in the non-social condition?

      - This massive skew, to say the least, is never discussed.

      - A correlation is definitely not warranted.

      (4) The ERP results seem to me rather superficial. But I am not an EEG expert.

      a) The authors do not seem to look at the outcome phase, which could be interesting for differences in reward/loss processing in the two task versions.

      b) Again, differences in EV seem to be more important from a conceptual point than probabilities or amounts; see my comment 2d.

      c) Also, the authors report ERPs for the two task types separately but do not seem to run proper comparisons between them, see my comment 2b.

      (5) Preregistration: It should be made very clear early on that this study was not preregistered.

      (6) Quality checks: The authors should check if some participants are outliers in terms of the number of missed trials, always choosing the same option, etc. It is notoriously difficult to find good post hoc reasons for excluding participants (one reason why replications and preregistrations are important). In any case, the data quality should be checked and described a bit more.

    2. Reviewer #2 (Public review):

      Summary:

      This paper investigates risk and cooperation decisions by integrating computational modeling with event-related potential (ERP) measures. Participants completed two tasks involving financial risk and cooperation under possible betrayal. The comparison between social and non-social decision-making is interesting and potentially valuable. However, the conceptual framing, theoretical grounding, and modeling rationale require substantial clarification.

      Strengths:

      (1) The paper introduces comparable tasks to probe social vs. non-social decision making.

      (2) The authors use a model to identify a psychological distinction and test its validity using neural data.

      Weaknesses:

      (1) Conceptual framing and theoretical clarity

      The primary theoretical contribution of the paper is currently unclear. Specifically, it is not clear what key difference the authors hypothesize between risk and cooperation conditions. This distinction should be grounded in prior literature.

      The manuscript states: "Indeed, mutual cooperation maximizes social welfare, whereas betrayal benefits the trustee but comes at the trustor's expense in the Trust Game (Joyce et al., 1995)." However, the authors do not discuss the substantial literature on the Trust Game, which is used here but not explicitly acknowledged.

      • The original Trust Game framework and behavior in one-shot settings (e.g., Berg et al., 1995).

      • The persistence of cooperation even when defection is economically optimal (e.g., Berg et al., 1995; Fehr & Fischbacher, 2003).

      • The influence of trustworthiness of the partner on cooperation decisions has been previously studied (Ma et al., 2022).

      • Differences between social and non-social decision-making contexts have also been reported with matched tasks (Liu et al., 2024).

      (2) Distinction between constructs (risk, loss aversion, betrayal aversion)

      The introduction introduces multiple related constructs-risk aversion, loss aversion, and betrayal aversion-but does not clearly differentiate them. A theoretically grounded distinction is needed.

      In particular:

      • The manuscript introduces multiple related constructs, or maybe the terms are used interchangeably? The distinction between risk aversion, loss aversion, defection aversion, and betrayal aversion should be clearly defined.

      • Betrayal aversion versus loss aversion is introduced but not clearly differentiated. Importantly, it should be clarified that this distinction is not experimentally manipulated but instead inferred through computational modeling. This point is currently not made explicit, which leads to confusion in the introduction.

      • The computational model should be introduced clearly in the introduction. Without explaining how these constructs are operationalized in the model, the framework is difficult to follow.

      The statement "In the risk task, losses were solely impersonal" is also unclear. It seems the authors may mean "personal or non-social" rather than "impersonal," as rewards are always personally relevant.

      (3) Hypotheses and preregistration

      The manuscript would benefit from more theoretical rationale for hypotheses. For example:

      • What is the basis for hypothesizing that financial loss aversion and betrayal aversion independently affect cooperation choices?

      • Why should these constructs be separable and modeled independently?

      • Additionally, the absence of preregistration is a limitation that should be acknowledged more explicitly.

      • Given the flexibility of the modeling approach and number of parameters, this is particularly important.

      • For instance, the rationale for focusing on decision times is also not clearly explained and should be better motivated.

      (4) Computational modeling

      There are several concerns regarding the modeling approach:

      • The choice of model comparison metric should be justified. Why is AIC used rather than BIC, which penalizes model complexity more strongly? This is particularly relevant given the inclusion of additional parameters to capture processes not directly measured by the task.

      • Full model recovery analyses are missing. A full model recovery is necessary to demonstrate that competing models produce distinguishable behavioral patterns. This needs to be shown in order to justify the specificity of the winning model.

      • How correlated are the parameters across participants, particularly loss and betrayal parameters?

      • More broadly, it is unclear how well loss aversion and betrayal aversion can be differentiated based on behavior alone. If these constructs are separable, they should predict distinct aspects of behavior.
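
      To illustrate why the choice between AIC and BIC matters, here is a minimal sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The log-likelihoods, parameter counts, and trial number below are hypothetical values chosen for illustration and are not taken from the manuscript.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for k free parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k free parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits for one participant with n = 200 trials.
n = 200
simple = {"log_lik": -120.0, "k": 3}   # fewer parameters, slightly worse fit
complex_ = {"log_lik": -117.5, "k": 5}  # two extra parameters, slightly better fit

aic_simple = aic(simple["log_lik"], simple["k"])
aic_complex = aic(complex_["log_lik"], complex_["k"])
bic_simple = bic(simple["log_lik"], simple["k"], n)
bic_complex = bic(complex_["log_lik"], complex_["k"], n)

# Lower is better for both criteria.
aic_prefers_complex = aic_complex < aic_simple
bic_prefers_simple = bic_simple < bic_complex
```

      With these assumed numbers, AIC favors the more complex model by a small margin while BIC favors the simpler one, because the BIC penalty per parameter, ln(n), exceeds the AIC penalty of 2 once n is larger than about 7.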

      (5) ERP analyses

      The ERP results (e.g., P300 and LPP) seem to suggest that betrayal aversion is similarly relevant in both time periods.

      • Do neural signals differentially reflect betrayal aversion versus loss aversion earlier and later on?

      • Are there significant interaction effects between betrayal and loss aversion for each ERP component?

    3. Reviewer #3 (Public review):

      Summary:

      In this study, the authors aim to address two questions. First, do people avoid cooperation primarily because of betrayal aversion beyond loss aversion? Second, can the effects of betrayal aversion and loss aversion be dissociated at the behavioral and neural levels? To address these questions, the authors compared individuals' choices of taking risks in a nonsocial risk task with those in a social cooperation task, with the two tasks matched in success probability and principal amount. They fitted computational models that include betrayal-aversion and loss-aversion terms and related the model parameters to ERP measures. Based on these analyses, the authors concluded that betrayal aversion has a stronger effect on cooperation than loss aversion and that betrayal is encoded earlier than loss in the brain. This is an important research question, and the attempt to combine computational modeling with ERP analysis is valuable. However, the current data analyses may not be able to support all the conclusions the authors made. For instance, the claims concerning the dissociation between betrayal aversion and loss aversion are not yet sufficiently supported by the evidence.

      Strengths:

      (1) The research question is theoretically important. Distinguishing betrayal aversion from loss aversion is important for research on trust, cooperation, and risky decision-making.

      (2) The approach of integrating behavioral measures, self-report ratings, computational modeling, and ERP data is valuable and gives the study significance.

      (3) The behavioral findings are broadly consistent. Participants reported stronger emotional responses in the cooperation task and were less willing to accept risk in the cooperation condition. These findings are generally in line with previous work on betrayal aversion and provide a reasonable manipulation check for the contrast between social and nonsocial risk.

      Weaknesses:

      (1) The manuscript states that the two tasks are matched in probability and principal amount, but the cooperation task additionally introduces partner outcomes, betrayal, and prosocial components. The Methods section states that, in the cooperation task, if both players cooperate, the principal is doubled and then split equally; if the partner betrays, half of the participant's principal is transferred to the partner. The model also includes an expected-other-reward term, namely, V_other=ω[p⋅2X+(1-p)⋅1.5X]. This raises an interpretive concern: if the two tasks differ not only in whether the source of uncertainty is social, but also in partner outcome, intentionality, and potential inequity structure, then the fitted "betrayal aversion" parameter may in fact reflect multiple motives rather than betrayal aversion alone. In the current experimental design, the "betrayal aversion" parameter may not be uniquely interpretable as a pure betrayal-specific construct, and the current evidence is insufficient to support such a specific interpretation.

      (2) Participants were informed that the cooperation probabilities were derived from previous real participants, whereas in fact these probabilities were randomly generated. In addition, six participants explicitly expressed doubts about the authenticity of the social interaction, yet the authors retained these participants with only the brief statement that this "did not affect the results." For such a critical manipulation, this explanation is too brief. I recommend that the authors report robustness analyses excluding skeptical participants. Since six participants reportedly doubted the authenticity of the social interaction, and some participants also performed poorly on the catch trials, it would be important to show whether the main behavioral, modeling, and ERP findings remain after excluding these participants. This is especially important because the manuscript's central interpretation depends on the assumption that the cooperation task was genuinely experienced as social.

      (3) The descriptions of the sample size are inconsistent across sections. The Participants section states that, after excluding one participant for misunderstanding the instructions, the final sample consisted of 49 participants; however, the behavioral results section later states that only 42 participants were included in the final analyses due to recording problems. This discrepancy is important because readers need to know clearly which sample was used for the behavioral analyses, which for the model fitting, and which for the ERP analyses; whether these analyses were conducted on the same participants; and whether the exclusion criteria were consistent across analyses. The manuscript needs a more transparent description of sample size and exclusion criteria.

      (4) The authors need to do more thorough analyses to validate their models. In addition to AIC and parameter recovery, I would encourage the authors to include other model comparison metrics where possible, such as BIC and exceedance probability, as well as model-recovery analyses. The authors should also do model-based simulation analyses to show that the winning model can capture the contextual effects observed in real data.

      (5) The authors should explain the rationale for the choice of ERP time windows and component selection in more detail. The current ERP analyses are time-locked to principal onset, and P3/LPP are extracted from fixed time windows. The authors should explain why this is the most appropriate time-locking point for examining betrayal- and loss-related computations, and why alternative time-locking points, such as probability-cue onset or other key task events, were not used. More importantly, the time windows of P3 and LPP are defined arbitrarily in the current analyses. The authors need to apply a more principled approach to define ERP components. In Figure 3, the P3 and LPP look like parts of the same ERP component.

      (6) The manuscript has several internal inconsistencies in terminology, figure references, and result descriptions. These issues weaken the clarity of the arguments and reduce the readability of the manuscript.

      (7) The authors partially achieved their aims. The study does provide evidence that social risk and nonsocial risk are not treated equivalently, and it also offers a computational framework that is informative for the field. This is an important topic, and the overall approach is promising.
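
      As a minimal illustration of the model-comparison metrics requested in point (4), AIC and BIC can both be computed from a model's maximized log-likelihood. The sketch below is not the authors' pipeline: the model names, log-likelihoods, parameter counts, and trial number are hypothetical placeholders chosen only to show the arithmetic.

```python
import math


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k*ln(n) - 2*ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_lik


# Hypothetical maximized log-likelihoods and parameter counts for one participant.
fits = {
    "loss_only": {"log_lik": -130.0, "n_params": 3},
    "loss_plus_betrayal": {"log_lik": -126.0, "n_params": 5},
}
N_TRIALS = 200  # hypothetical number of choices per participant

scores = {
    name: {
        "aic": aic(f["log_lik"], f["n_params"]),
        "bic": bic(f["log_lik"], f["n_params"], N_TRIALS),
    }
    for name, f in fits.items()
}

# Lower is better for both criteria.
best_by_aic = min(scores, key=lambda m: scores[m]["aic"])
best_by_bic = min(scores, key=lambda m: scores[m]["bic"])
print(best_by_aic, best_by_bic)
```

      With these made-up numbers the two criteria disagree: AIC favors the richer model and BIC the simpler one. This is the kind of case in which reporting AIC alone would not settle the comparison.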

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Demeshkina and Ferré-D'Amaré showed that extrachromosomal circular DNA (eccDNA) and chromatin-associated proteins are present in stress granules, based on proteomic and sequencing analyses. Using HCR-FISH combined with imaging, the authors showed the colocalization of eccDNA with stress granule proteins. Furthermore, they found that CRISPR machinery targeting the eccDNA component of stress granules disrupts stress granule assembly, and that this effect is largely independent of Cas9 endonuclease activity. Notably, expression of cytoplasmic chromatin factors restores stress granule formation in the presence of CRISPR machinery in yeasts. This also rescues the growth defect caused by hypoxic stress, which correlates with impaired stress granule formation. Together, this manuscript provides insight into the presence of eccDNA in cytoplasmic membraneless organelles, specifically stress granules, and suggests a functional role for eccDNA within these structures under stress conditions.

      Strengths:

      The authors used a panel of ribonucleases to demonstrate that stress granule cores isolated from yeast and HEK293 cells are resistant to plasmid-safe DNase, an enzyme that does not degrade circular double-stranded DNA. To further support the presence of extrachromosomal circular DNA (eccDNA) in stress granules, they performed Circle-Seq on stress granule cores. The gel electrophoresis and sequencing experiments complement each other well, providing consistent evidence for eccDNA within these granules. Overall, this study provides insight into potential cytoplasmic roles for eccDNA, an area that remains largely unexplored.

      Weaknesses:

      (1) Figure 1F suggests that stress granule cores are susceptible to DNase I but not to plasmid-safe DNase (psDNase). However, its smearing pattern in the psDNase condition appears similar to that in the DNase I treatment shown in Figure 1E, although psDNase produces more discrete bands. The authors should comment on these differences between Figures 1E and 1F, or consider revising Figure 1F to improve consistency with Figures 1E and 1D.

      (2) The authors should clearly define "colocalization". Does it refer to complete spatial overlap between two signals (i.e., VCP and T30), or partial overlap (i.e., AHNAK DNA and G3BP)? Figure 3 and the associated text are descriptive. Quantitative analysis would strengthen the conclusions. For example, the authors could analyze the fraction of molecules localized to stress granules or provide Pearson's correlation coefficient or similar measurements.

      (3) The authors used a CRISPR-based approach to target the Ty1 LTR retrotransposon, an abundant stress granule eccDNA, and they observed a loss of stress granule formation. However, this phenotype may be specific to Ty1 eccDNA rather than representative of all eccDNA species present in granules. In particular, the title "Cytoplasmic circular DNA is a key constituent of stress granules" implies a broader role. To support this claim, the authors should consider approaches that more globally deplete eccDNA rather than targeting a single eccDNA.

      (4) The authors should provide additional experimental evidence to support the claim that eccDNA is packaged in a chromatin-like state. The rescue of stress granule formation by ectopic expression of modified chromatin-associated proteins (CHD1NES and GCN5NES) following CRISPR treatment does not necessarily demonstrate that eccDNA is packaged like chromatin under basal conditions.
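
      To make the quantification suggested in point (2) concrete, Pearson's correlation coefficient between two imaging channels can be computed pixel by pixel. The following is a minimal sketch on synthetic data, not an analysis of the authors' images: the channel names and intensity values are hypothetical, and a real analysis would first restrict the pixels to a cell or granule mask and subtract background.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length intensity lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return sxy / math.sqrt(sxx * syy)


# Synthetic, flattened 64 x 64 "images" (hypothetical intensities in [0, 1)).
rng = random.Random(0)
granule_marker = [rng.random() for _ in range(64 * 64)]
# A DNA signal that mostly follows the granule marker, plus independent noise.
dna_colocalized = [0.8 * g + 0.2 * rng.random() for g in granule_marker]
# A DNA signal that is unrelated to the granule marker.
dna_independent = [rng.random() for _ in range(64 * 64)]

r_colocalized = pearson_r(granule_marker, dna_colocalized)
r_independent = pearson_r(granule_marker, dna_independent)
print(f"colocalized r = {r_colocalized:.2f}, independent r = {r_independent:.2f}")
```

      Reporting such a coefficient per cell, together with the fraction of signal inside granules, would let readers distinguish complete from partial overlap.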

    2. Reviewer #2 (Public review):

      Summary:

      The authors report the presence of extrachromosomal circular DNAs (eccDNAs) within the core of stress granules purified from both yeast and mammalian cells.

      Strengths:

      This study is important for understanding the molecular mechanisms underlying stress granules containing eccDNAs and is likely to have a major impact on future research. A major strength of the study is the extensive experimental validation performed in yeast cells. In particular, cytoplasmic CRISPR-mediated targeting of eccDNAs suppresses stress granule formation and impairs recovery from hypoxic stress in yeast cells.

      Weaknesses:

      The conclusions would be further strengthened by validating the functional findings in an additional model system, such as mammalian cells.

      Comments:

      (1) Section: "Stress granule cores contain eccDNA"

      a) The presence of eccDNAs would be more convincingly demonstrated using an orthogonal validation approach, such as DNA FISH targeting MYC and Centromere 8 (CEN8) on metaphase spreads from HEK293T cells (as performed in PMID: 34819668).

      b) The study would also benefit from assessing the presence of eccDNAs in the extracellular medium. For example, DNA could be extracted from conditioned media and analyzed by PCR using primers spanning eccDNA breakpoint junctions (as performed in PMID: 40074906; PMID: 36123406).

      (2) Section: "eccDNA-CRISPR abrogates stress granules"

      These findings should be further validated under additional stress conditions, such as drug-induced stress (e.g., methotrexate) or nutrient deprivation in the cell medium. In addition, the same set of experiments should be performed in HEK293T cells to support the broader relevance of the observations.

    1. Reviewer #1 (Public review):

      In this manuscript, the authors study optimal chemotactic navigation of bacteria in disordered environments. Most previous work has studied bacterial chemotaxis in free liquid, but navigation in obstructed environments is gaining more attention. Here, the authors first used the classic swim plate assay to select E. coli for chemotaxis in soft agar at two agar concentrations. In the higher concentration, they observed that the population's migration speed increased and the mean run duration decreased over selection cycles. Importantly, the growth rate did not change, so the change in migration speed was due to improved chemotaxis. Then, using a strain in which they could control the mean run duration with an inducible promoter, they measured population migration speed as a function of mean run duration, observing a peak. In liquid, theory predicts a peak when the run duration is comparable to the time scale of rotational diffusion. Here, the peak is at a much shorter run duration, and the optimal run duration decreased with agar concentration. A key feature in previous studies of bacterial motion in obstructed environments has been the dynamics of cell trapping and escape via tumbling. By directly visualizing the flagella in single cells, the authors found that the majority of trap events in semisolid agar did not end with a tumble. This is important because it means that the peak in the migration speed has a different origin from the peak typically seen in the diffusion coefficient, which is due to a balance between longer runs and less time spent trapped. Instead, using a minimal theoretical model, the authors argue that the peak in the migration speed is due to a balance between longer runs, which improve chemotaxis, and having those runs terminate with a tumble rather than a trap event, because runs that end with trapping do not result in up-gradient bias. Qualitatively similar behavior is seen in simulations of a more complex model of chemotaxis.

      Overall, we find the results to be significant and the evidence to be strong. We have some comments, which the authors need to address to improve/clarify their work:

      (1) The authors' model predicts that, because cells spontaneously escape traps without tumbling, the diffusion coefficient should depend monotonically on mean run length even though the chemotaxis coefficient is non-monotonic. It would strengthen the paper if the authors could show this to be true in experiments. Part of the reason for this comment is that the flagella labeling experiments were done in agar that was rapidly cooled in a freezer and then thawed, whereas the migration experiments were performed in agar cooled at room temperature. Our (anecdotal) understanding is that the cooling rate dramatically affects the properties of the agar mesh. Verifying that diffusivity is monotonic in mean run length would therefore show that cells' spontaneous escape from traps is not an artifact of the cooling protocol.

      (2) Two agar densities were used in their study (0.2%, 0.3%). As shown in Figure 1, while cells in the 0.3% agar showed significant improvements during the directed evolutionary experiments, the cells in 0.2% agar didn't. Correspondingly, the evolved average run time did not show significant changes in the 0.2% agar, but it decreased in the 0.3% agar. What is the reason for this difference? Does it mean the cells are already optimized for the 0.2% agar medium?

      (3) Related to the previous comment, the comparison between Figure 1 and Figure 2 should be made clearer. In Figure 2, a peak performance at an intermediate run time is shown, with the optimal run time decreasing with the agar density. Qualitatively, this result, i.e., the existence of the peak performance, gives the evolution experiments shown in Figure 1 a nice explanation. However, quantitatively, the run times shown in Figures 1 and 2 are quite different. For example, for the 0.3% agar case, the run time decreases from ~0.6 s in cycle 1 to ~0.4 s in cycle 40. However, in Figure 2, the optimal run time is ~0.9 s, which means that the migration speed would decrease if the run time decreased from 0.6 s to 0.4 s. We understand this may only be considered as a qualitative result. However, it does raise the question of what the molecular mechanisms are that drive the directed evolution, which the authors should address.

      (4) In Figure 3B, the distributions of speed in different media (liquid versus agar) for cells with bundled and split flagella are shown. While the distribution for the bundled flagella shows nicely the emergence of the trapped state (peak near zero speed), the distribution for the split flagella shows a significant shift of the distribution. Does this mean the agar medium also changes the tumble state significantly? In fact, we are puzzled by the observation that in bulk liquid, the run speed distribution for cells with split flagella seems to be quite similar to that of cells with bundled flagella, which might indicate problems in determining run speed.

      (5) Finally, none of the points plotted have error bars. Error bars would allow the readers to evaluate i) whether the changes in mean run speed during selection are significantly resolved and ii) whether the peaks in the migration speeds are significantly resolved.
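
      As an illustration of the error bars requested in point (5), the sketch below computes a standard error of the mean and a percentile bootstrap confidence interval from replicate measurements. The replicate values and their units are hypothetical, and the percentile bootstrap is only one reasonable choice; it is not a description of how the authors' data should necessarily be treated.

```python
import math
import random
import statistics


def mean_and_sem(values):
    """Return the sample mean and the standard error of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(len(values))


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - lo_idx - 1
    return boot_means[lo_idx], boot_means[hi_idx]


# Hypothetical migration speeds from five replicate plates (arbitrary units).
speeds = [4.1, 3.8, 4.5, 4.0, 4.3]

mean, sem = mean_and_sem(speeds)
ci_low, ci_high = bootstrap_ci(speeds)
print(f"mean = {mean:.2f}, SEM = {sem:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

      Plotting either quantity at every point would let readers judge whether the changes during selection and the peaks in migration speed are resolved.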

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript by Bai and colleagues investigates how Escherichia coli navigates and explores agar gels through chemotaxis and what parameters of bacterial swimming are tuned under selection pressure for rapid migration (i.e., reaching the edge of the agar plate quickly). Prior studies have examined related questions to a substantial degree. Examples include "Migration of Chemotactic Bacteria in Soft Agar: Role of Gel Concentration" (https://pmc.ncbi.nlm.nih.gov/articles/PMC3145277) and numerous other studies in this area (e.g., "Migration of bacteria in semi-solid agar" https://www.pnas.org/doi/10.1073/pnas.86.18.6973). From such studies has emerged the paradigm/model that reorientation (i.e., tumbling) is essential when bacteria navigate agar, which is considered a model for "complex" environments, because run-only bacteria become trapped in the agar matrix and are unable to migrate far. This new manuscript provides some evidence that this paradigm may be overly simplified or incomplete. As I understand it, the authors propose that migration is influenced to a greater extent by bias in the chemotactic run, where runs up attractant gradients are longer. The authors incorporate these data into a new model for chemotactic navigation and claim that this work establishes a general principle for how bacteria optimize active transport through complex environments.

      I will first note to the editor and authors that I am not qualified to assess the detailed mathematics of the model, and my review therefore focuses on the biology and phenotypes described. Nevertheless, in my view, this manuscript, in its current form, has several important limitations. For each point, I provide suggestions for additional experiments that could strengthen the rigor of the work and clarify the claims.

      Strengths:

      A strength of this work is the use of microscopy and automated methods to characterize an extremely large number of bacterial cells, which strengthens the authors' claims. However, substantially greater detail on these approaches is needed for the analysis to be reproducible and to allow verification that the analyses were performed correctly.

      Weaknesses:

      Major concerns

      (1) Claims are overly broad, and the experimental system is too artificial to support general conclusions about bacteria, chemotaxis, or evolution.

      E. coli MG1655 is a longstanding model organism in the chemotaxis field, and agar chemotaxis assays are also widely used. However, the authors make very broad claims about how phenotypic changes observed during selection in 0.2% or 0.3% agar relate to bacterial chemotaxis and evolution more generally. In essence, the experimental foundation on which the authors build a complex theoretical framework is limited to a domesticated laboratory strain of E. coli and a highly artificial environment consisting of agar in a Petri dish. Although E. coli is well studied, its motility and taxis behaviors are not necessarily representative of bacteria across nature. In addition, natural environments are dynamic, and bacteria rarely experience stable gradients for extended periods, such as the 24-hour time-frame used here. The authors have also only focused on responses to attractant gradients with undefined complex growth media, and not assessed if this is also true for repellent gradients. This is important to consider because E. coli also generates repellent gradients (indole) that are not considered here. E. coli also generates AI-2, sensed as an attractant, that would be an opposing force for migration. For these reasons, it is not clear that the data and theory presented here generalize to diverse bacterial species, to natural environments, or to chemotaxis broadly.

      The authors should acknowledge that further work is needed to generalise their findings by testing additional organisms, such as non-laboratory E. coli isolates, other enteric bacteria, and species with fundamentally different motility systems (e.g., Campylobacter jejuni). Further work could also expand beyond agar by examining chemotaxis in a biological matrix such as mucin, as well as testing responses to defined attractants and repellents.

      (2) No genetic component is identified, so claims about evolution are not supported.

      Evolution requires heritable genetic changes that produce phenotypes advantageous under a given selection pressure. The authors state that bacteria were selected for rapid migration and that this selection produced progressively more efficient migrators. However, no sequencing analyses of the evolved isolates were performed, no genetic changes were identified, and no mechanism underlying this phenotypic shift was described. Without identifying genetic alterations, they cannot substantiate the claim that evolution occurred. Whole-genome sequencing of the evolved isolates is necessary to determine whether specific mutations underlie the observed phenotypes.

      (3) The predictive power of the model is not tested.

      The authors develop a model with post-dictive capability, meaning the model reproduces behaviors similar to those observed in the data used to construct it. However, the manuscript does not demonstrate that the model has predictive power. Demonstrating predictive performance would substantially increase the value of the model. For example, the authors could perform an additional round of selection and predict the resulting bacterial behavior under a condition not used during model construction (such as a different agar concentration or predicting the behavior of different bacteria). Otherwise, the authors should tone down the claims.

      (4) Limited novelty and impact of the environmental difference studied.

      A central point of the manuscript is the difference between evolution in 0.2% versus 0.3% agar and how this difference relates to the proposed model. However, this represents a relatively minor change in the environment experienced by the bacteria. Developing an extensive theoretical framework and proposing that bacterial evolution is highly sensitive to these parameters based on this narrow experimental system may be premature. This would be addressed by the suggested broadening of experiments described above.

      (5) The manuscript is too brief, and some data and methods are insufficiently described, particularly related to the machine learning analysis.

      The manuscript addresses a complex topic, yet the main text, methods, and figures are very brief, which need not be the case. As a result, it is often difficult to understand exactly what was done and how the data support the authors' claims. More detailed descriptions of the experimental approaches and analyses are necessary.

      One example is the machine learning approach used for cell tracking. This method is only briefly described, and no validation data are presented that would allow readers to evaluate whether the approach performs accurately. If the method is robust, it would be a powerful analytical tool, but the current description does not provide sufficient information to evaluate the reliability of the results. This issue is particularly important because the authors conclude that tumbles account for less than 3% of escape events, which contrasts with previous paradigms. Automated tracking methods can be susceptible to artifacts, and therefore, rigorous validation of the tracking pipeline, supported by appropriate figures and benchmark data, is essential.

    3. Reviewer #3 (Public review):

      The manuscript by Bai et al presents a study of the effect of trapping on the efficiency of chemotactic spreading. While the overall impression of the study is positive, there are multiple drawbacks that accumulate and together make the statement of the paper not fully justifiable. Below, I provide some detailed comments in chronological order, and indicate those of particular importance.

      (1) On the first page of the Introduction, the authors use the following wording: "...how bacteria optimise their intrinsic motility parameters to maximise navigation efficiency". However, it is not shown or known whether they do. In the experiments, the authors fetch the bacteria at the far front and artificially select the ones with shorter run times. The ones at the front could be the effect of heterogeneity of the population rather than an adaptation. Moreover, the authors claim that the selective pressure is via trapping. But this can be due to a multitude of other factors that change with agar concentration, availability of nutrients, osmotic properties of water, etc.

      (2) At the beginning of the Results section, the authors claim that for both agar concentrations, they observe a progressive increase in chemotactic navigation. I do not see how the data for 0.2% agar would correspond to that. Migration speed remains flat.

      (3) (Important). The authors claim that the mean run speed remained constant. But this is definitely not true, as seen in the plots. The speed is increasing for both agar conditions. And here it is important to note that the chemotactic drift velocity is proportional to the square of run speed (which is not the case for the formulas in this paper, see comment below). Thus, even small changes in v_0 can result in a significant increase in the drift velocity.

      (4) (Important). Tumble bias is also significantly increasing at the 0.3% agar concentration. While it is not clear from the paper what exactly the tumble bias is, if it is related to the persistence of the turning angle, this also has a linear effect on the chemotactic drift velocity.

      (5) (Important). When performing aTc dependence testing, the authors didn't report how other observables of swimming behaviour are changing.

      (6) (Very important). I'm not sure that by interfering with CheZ expression, one does not affect the whole chemotactic circuit, for example, by changing G (in terms of the model), so that the optimality occurs not due to the agar concentration/traps but due to the perturbations in the circuit. Also, the effect of the different agar concentrations seems to be much more minor compared to the overall induced changes in spreading speed.

      (7) (Very important). I was very confused by the authors' statement that only 3% of traps are exited due to a tumble. I don't think this is possible (in a way consistent with the suggested model). Mean free run times (Figure 1C) go down to 0.4 s. The duration of tumbles is 0.3 s (Figure S2c), but the duration of traps is longer than that of tumbles (and a bit shorter than that of runs). So how can it be that a running cell gets into a trap and experiences a tumble in only 3% of cases? What would be the distribution of run durations if one combines pre-trap run time + trap time + post-trap run time: would they still have a mean below 1 s? It really looks like the authors are not able to detect tumbles when bacteria are trapped. Or is there an active mechanism suppressing tumbles when in the trap?

      (8) It is not clear what it means that post-tumble angles were uniformly distributed. Does this refer to only trap-associated tumbles? It is known that in freely swimming E. coli the tumbling angles are not isotropic but have a preference for the forward direction. Is it different in agar conditions?

      (9) (Very important). The authors assume an oversimplified model for the chemotactic drift based on biased random walks. As a result, the answer for chemotactic drift velocity has the wrong scaling with run speed. In the linear theory of chemotaxis by de Gennes, the scaling is v_0^2, while the authors use a linear relationship. Thus, the assumption of the simplified model is incorrect. The exact effect of the traps (where no tumbling is happening, and the directional memory is conserved) needs to be properly calculated, for example, in the same de Gennes framework. And I can't say off the top of my head what the result would be, because the calculation is, in fact, not trivial. Thus, the model used is oversimplified, and the fact that it shows a non-monotonic relationship with tau_f is of little predictive power.
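
      The origin of the v_0^2 scaling in point (9) can be seen from a simplified first-order argument. This sketch is not de Gennes's full calculation: it assumes instantaneous sensing (no memory kernel), no rotational diffusion, negligible tumble duration, isotropic reorientation in three dimensions, and no traps. The sensitivity \alpha is a hypothetical lumped parameter introduced here only for illustration.

```latex
% A cell running at angle \theta to the gradient senses
% dc/dt = v_0 \cos\theta \, |\nabla c|, which modulates its mean run duration:
\tau(\theta) \approx \tau_0 \left( 1 + \alpha \, v_0 \cos\theta \, |\nabla c| \right)

% Mean up-gradient displacement per run, averaged over isotropic directions
% (\langle \cos\theta \rangle = 0, \ \langle \cos^2\theta \rangle = 1/3):
\langle \Delta x \rangle
  = \left\langle v_0 \cos\theta \, \tau(\theta) \right\rangle
  = \tfrac{1}{3} \, \alpha \, \tau_0 \, v_0^{2} \, |\nabla c|

% Dividing by the mean run duration \langle \tau \rangle = \tau_0
% gives the drift velocity:
v_d \approx \frac{\langle \Delta x \rangle}{\tau_0}
  = \tfrac{1}{3} \, \alpha \, v_0^{2} \, |\nabla c|
```

      One factor of v_0 comes from the displacement during a run and the other from the rate at which the cell samples the gradient, which is why a drift velocity that is linear in v_0 would be missing one of the two.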

      Taken together, you see that all the key points that are used in the chain of the argument about the optimality are not rock solid and allow for alternative explanations. I think all those either need to be tested explicitly or at least clearly discussed, and the respective conclusions of the paper need to be rephrased. In my view, this work needs major revision.

    1. Reviewer #1 (Public review):

      Summary:

      Eroglu and Hobert demonstrate that injecting CRISPR guides and repair constructs to target three genes at a time, tagging each with a different fluorescent protein, and selecting which gene to tag with which fluorophore based on genes' expression levels, can improve efficiency of gene tagging.

      Strengths:

      This manuscript demonstrates that three genes can be targeted efficiently with three different fluorophores. It also presents some practical considerations, like using the fluorophore least complicated by agar/worm autofluorescence for genes with low expression levels, and cost calculations if the same methods were used on all genes.

      Weaknesses:

      Eroglu has demonstrated in a previous publication that single-stranded DNA injection can increase the efficiency of CRISPR in C. elegans, while inserting two fluorescent proteins and a co-CRISPR marker into three loci, and Paix et al 2015 demonstrated simultaneous insertion of two fluorescent tags. The current work is a valuable and incremental advance. In general, I applaud the authors' willingness to strategize about how whole proteome tagging might be accomplished. I predict that the advance here will be one of many small advances that will get the field to that goal. The title oversells the advance presented, in my view, since it seems like one among many key advances, and the first sentence of the Discussion seems a more apt summary of the key advance here.

      Some injections targeted genes on the same chromosome together, which will create unnecessary issues when performing the crosses that will be useful for some future experiments. This made me wonder if injecting 3 together really is helpful vs targeting each gene separately, since only 5 worms need to be injected. It cuts time down by 2/3, but perhaps avoiding targeting the same chromosome with two tags would be useful.

      The limited utility of current blue fluorescent proteins makes me wonder if they are worth using at this stage, before there are better blue fluorescent proteins, or better yet, far-red ones, to avoid issues with live imaging under phototoxic UV or near-UV illumination.

    2. Reviewer #2 (Public review):

      Overall, we found the responses to be quite recalcitrant.

      We have one remaining composite concern about the comparison between observed expression patterns with the new strains versus published data.

      First, the authors only report patterns for one stage, while it should not be too much effort to image the different life stages. However, since this is a revision, we are not formally requesting they do this.

      Second, in the now provided Table (thank you), 'observed expression' (last column) is lacking for 9 of the 30 proteins, and for 6 of these the procedure was not successful. Why not report patterns for the other three? It is also confusing because on page 5, the authors say that "overall, 24 of 30 tags ...all of which were visible with fluorescence stereomicroscopy" - are we missing something? Also, they then said that they "obtained 6/9 of the originally failed tags"; why are the corresponding patterns not included in Table 1, and why are 9 proteins still labeled as "no" in the "success?" column?

      Third, we strongly feel that the response to our comments about expression patterns is not adequate. On page 5 the authors say that "all proteins were expected to be ubiquitously expressed" and that "scRNA-seq indicated that transcript abundance was ubiquitous and without strong tissue-specific enrichment with few exceptions". However, in their rebuttal, the authors now argue for tissue-specific expression for proteins with paralogs, turning around their own argument! Moreover, their Table indicates that many genes show tissue-enriched expression by RNA-seq while many of their tagged proteins exhibit ubiquitous expression.

      Overall, this indicates that both the overall accomplishment of generating tagged protein strains and analyzing their expression is oversold.

    3. Reviewer #3 (Public review):

      Summary:

      The authors argue that establishing the expression pattern and sub-cellular localisation of an animal's proteome will highlight hypotheses for further study. This claim is probably accepted by many in the community. This manuscript seeks to confirm the feasibility of establishing such a resource, by using current transgenic methods to knock in DNA encoding different colored fluorescent tags into C. elegans genes.

      Strengths:

      The authors make the points above. For example, they provide evidence that the C. elegans germline harbors two populations of mitochondria that differ qualitatively in the proteins they express. They also confirm that labelling the whole proteome is an achievable goal with relatively limited resources and time.

      Weaknesses:

      The work is somewhat incremental in that it uses existing transgenic technology. Cell biology in C. elegans is challenging because of the small size of many of its cells, notably neurons. This can make establishing the sub-cellular localisation of a fluorescently tagged protein, or co-localizing it with another protein, tricky. The authors point out in their introduction that advances in light microscopy such as diSPIM, STED and ISM (a close relative of SIM), have increased the resolution of light microscopy. They also point out that recent advances in expansion microscopy can similarly help overcome the resolution limit. However, they do not use these technologies to characterize their transgenic strains.

    4. Reviewer #4 (Public review):

      Summary:

      Tagging the entire proteome of a metazoan would be a landmark achievement, providing a powerful complement and extension to existing "omic" catalogs in model systems. Here, Eroglu and Hobert argue that efficiently tagging multiple loci in a single "batch" would make the community-based achievement of this goal realistic. They provide rigorous evidence that such an approach is indeed feasible, exploring issues related to efficiency, design and screening strategies, disruption of gene function, and the potential for endogenously tagged alleles to reveal unexpected aspects of protein expression and localization. While the work has some minor gaps that are important to rigorously assess the feasibility of the proposed effort, the detailed and valuable insights that emerge should provide impetus to the community to coordinate efforts to make this ambitious goal a reality.

      Strengths:

      The work has numerous strengths. The authors provide compelling evidence that:

      - three distinct loci can be efficiently targeted with three distinct fluorescent tags in a single injection.

      - thoughtful targeting design can reduce the likelihood of disruption of function by the tag.

      - systematic design principles based on expression level and predicted localization/function can be used to optimize tagging strategies.

      - the resulting tags can provide unexpected insight into patterns of protein production and subcellular localization.

      Not all of these advances are novel in themselves, but taken together, they represent an important technical and conceptual advance. The most important strength comes from the exceptionally high value of the goal itself, in that the work has the potential to spur a community-wide effort toward achieving the ambitious goal of proteome-wide tagging.

      Weaknesses:

      The work's shortcomings are minor.

      - One concern has to do with the feasibility of the proposed screening strategies. The experimental design cleverly coinjects tags for three loci in different gene expression 'zones'; this expression level determines which tag will be used. As the authors allude to, among genes with the same overall FPKM value there is an important distinction between those that are expressed broadly and those focally expressed in a specific tissue. The proposed strategy claims that there are a sufficient number of highly expressed genes "to be used as visible markers" for recovering successfully edited animals. It would be useful for the authors to discuss the issue of broad vs focused expression among this set of genes a bit more thoroughly, with an eye toward the issue of how likely it is that these genes could indeed consistently be used as visible markers, particularly for those at the low end of this limit.

      - What fraction of the proteome (on a per-gene basis) is secreted proteins? How difficult will it be to screen these for successful tags? Are there specific tags that would be more optimal for secreted proteins? (The authors mention the use of an SL2 or T2A cassette to label the cells in which these proteins are expressed but note that there are technical challenges associated with doing this at scale.)

      - For secreted and/or weakly expressed genes, it would be useful for the authors to estimate for what fraction of these would successful insertions need to be screened by PCR, and what resources (time and money) this would likely entail.

      - For how many genes would a single tag not capture all predicted isoforms?

      - Finally, some readers might object to the authors' assertion in the abstract that this work is "a first step in this direction" (presumably referring to designing a strategy for whole-proteome tagging). There is no concern that the authors are disregarding the extensive work of other groups, as they explicitly mention the contributions of other groups to the foundation that enables the present work. However, the spirit of the abstract could be misinterpreted by a well-intentioned reader.

    1. Reviewer #1 (Public review):

      The manuscript has been improved in response to the reviews. Although overinterpretation has been partially reduced compared to the previous version, the main concerns about the manuscript remain. The experiments have been conducted according to rigorous standards and the limitations of the results have been discussed to provide a comprehensive interpretation. However, this still represents an incomplete study in which the conclusions are insufficiently supported by the data provided.

    2. Reviewer #2 (Public review):

      Summary:

      This paper starts with a large-scale yeast two-hybrid (Y2H) screen using Set1 (full-length and smaller parts) and other Set1C/COMPASS subunits as bait. Hundreds of possible interactions are identified, but only a small number are given any follow-up. While it's useful to document all the possible interactions, the unfocused and preliminary nature of the results makes the paper feel scattered and incomplete.

      Strengths:

      The Y2H screen was very comprehensive, producing lots of interesting possible leads for further experiments.

      Weaknesses:

      Most interactions were not further tested, and even in the case of those that were, the experiments are often inconclusive or incomplete.

    3. Reviewer #3 (Public review):

      The SET1C/COMPASS complex is the histone H3K4 methyltransferase in Saccharomyces cerevisiae, where it plays pivotal roles in transcriptional regulation, DNA repair, and chromatin dynamics. While its canonical function in histone methylation is well-established, its full interactome remains poorly defined. Moreover, whether SET1C methylates non-histone substrates has been an open question.

      In this study, Luciano et al. employ systematic yeast two-hybrid (Y2H) screening to uncover novel interactors and functions of SET1C. Their findings reveal potential functional connections to RNA biogenesis, chromatin remodeling, and non-histone methylation.

      The authors performed multiple Y2H screens using Set1 (full-length, N-terminal, and C-terminal fragments) and each of its seven subunits as baits. They identified high-confidence interactors that link SET1C to diverse cellular processes, including chromatin regulation (e.g., the SWI/SNF complex via Snf2), DNA replication (e.g., Mcm2, Orc6), RNA biogenesis (e.g., spliceosome components Prp8 and Prp22; polyadenylation factors Pta1 and Ref2), tRNA processing (e.g., Trm1, Trm732), and nuclear import/export (e.g., importins Kap104 and Kap123). Some of these interactions were further validated by immunoprecipitation or in vitro assays.

      Given the interaction of Set1 with Slx5 and Wss1 (proteins involved in SUMO-dependent processes), the authors investigated and convincingly demonstrated that Set1 is sumoylated. This modification may influence the function and regulation of the SET1C complex.

      Finally, the authors provide evidence that SET1C methylates Snf2, the catalytic subunit of the SWI/SNF chromatin remodeling complex.

      One of the interactors, Nrm1, contains a domain resembling the H3K4-methylated sequence, which is also present in other proteins. Whether this H3K4-like domain is required for methylation remains to be demonstrated.

      Strengths:

      This study offers valuable insights into the interactome of SET1C, suggesting potential links between the complex and a wide range of cellular processes. It also provides information on the possible regulation of Set1 by sumoylation. Finally, the finding that Snf2 is methylated in a Set1-dependent manner could significantly expand the known targets and functions of SET1C.

      Weaknesses:

      Many of the Y2H interactions remain to be validated and have to be considered as a starting point for further studies. Their functional significance remains to be explored. Several conclusions based on these Y2H data are speculative.

    1. Reviewer #1 (Public review):

      Summary:

      The authors identify and investigate a specific population of PVNOT neurons (oxytocin neurons of the paraventricular hypothalamus) that seem to be involved in both behavioral and autonomic thermoregulation. These cells are activated by social thermoregulatory behaviors, but can influence thermoregulation in both social and non-social contexts, specifically during transitions and when mice are at low core body temperature (Tb).

      Strengths:

      The manuscript has many strengths.

      This is a novel study, with a clear question that is addressed using an array of well-designed experiments employing integrative methods. Most of the Figures are well developed, and the analysis is generally rigorous and well detailed. The authors are clearly very experienced in this field, and indeed their scholarly introduction and discussion sections are to their credit.

      The link between thermoregulation and the oxytocin system is well established, as is the link between social behavior and the same broad system. However, the link between these three things is novel, if it can be well substantiated. I am not persuaded that was achieved here, but I do think this manuscript has many novel and useful offerings.

      The authors use a cooling floor and only go down to 10 degrees Celsius. This is fine, but I would like to see the effects using ambient temperature also. This is not a crucial issue, as it is not necessary for the authors' interpretations, but it could improve measurement sensitivity.

      Through an elegant behavioral experiment in Fig. 1, the authors identify c-Fos patterns in the PVN that are activated by active social huddling, and they show that at the RNA level these cells overlap with oxytocin, indicating that they are oxytocin producing cells. But this is not well discussed or indeed quantified.

      The authors engage in deep analysis of fiber photometry experiments, first by observing PVNOT neuron overall activity during a variety of different behaviors in the context of three different temperatures. Activity was associated with nesting, quiescence, and both types of huddling (when social opportunities exist). Social situations did not strongly affect this, nor did temperature conditions. These analyses indicate that the PVNOT neurons are involved in mediating specific behavioral outputs.

      With more detailed analysis, the authors investigated how PVNOT neuronal activity relates to behavioral state transitions. They found that the probability of peak PVNOT neural activity strongly predicts the offset of quiescence or quiescent huddling and therefore can be argued to signal an increase in physical activity, and as such increased metabolism. However, the opposite pattern was observed for huddling and nesting (onset being associated with PVNOT activity), again arguing for increased thermogenesis as a function.

      What is particularly compelling is that these peaks of activity tend to occur during low Tb, again arguing for the function in increasing body warmth.

      The authors then employ an impressive set-up where they image brown adipose tissue (BAT) in tandem with DeepLabCut (DLC) based animal tracking. Crucially, BAT activity and surface temperature correlated with the calcium peak of PVNOT neurons.

      Lastly, optogenetic activation of PVNOT neurons increased Tb when it was in the lower range, but not when in the higher range. It also affected BAT and rump temperature, again at low Tb. However, there is no real effect on behavior, except a trend in activity.

      The authors do some interesting tracing work at the end, though this is not functionally explored. That's not a criticism as it does seem like this would be a whole follow-up study.

      Comments on revised version.

      As discussed before, the authors employ a wide range of techniques (FOS IHC, FP for fine scale PVN OXT population dynamics, behavioural analysis, core and surface temperature tracking, physiological recordings to assess AAV specificity, optogenetic activation of PVN OXT neurons, and projection tracing) to address a clear question. The outcomes of these techniques seem to drive the same conclusion that PVN OXT neurons signal transitions from rest to arousal (behavioural and thermogenic) in a state-dependent manner:

      - FOS data identifies PVN OXT population activity following behavioural onset

      - Ca activity in these cells peaks at behavioural and thermogenic state transitions

      - Rump temperature and BAT activity increase at state transition points

      - Optogenetic stimulation of these cells recapitulates the thermogenic effects seen during physiological state transitions (in low body temperature animals) with a trending increase in physical activity

      Despite the inconclusive IHC results when validating the specificity of their AAV, the virgin female/ lactation experiment is convincing that they are specifically targeting PVN OXT neurons. The rationale for this experiment is clearer in the revised manuscript.

      Generally, in terms of the revised manuscript, the authors give strong responses to reviewer comments, either incorporating feedback, or giving clear explanations for the choices they made in the original manuscript. The revised manuscript is clearer about the question the authors aim to address, the reasons for their choice of experiments, and the limitations of the techniques used.

      Criticisms:

      I appreciate and agree with the authors' point that this manuscript is more fundamental than simply the social basis of oxytocin neuron function. This point is well made by their data, and in the revised text. However, I still believe more behavioural analysis would be welcome to any reader.

      They partly justify the lack of behavioural analysis in Figure 6 with the problem of "animal merging" on the SGBS images. However, in Figure 6C, they confirm that, in solo conditions, the SGBS readings are consistent with core body temperature readings. So why not stick to core body temperature, opto stimulate and analyse the social behaviour with DLC (with normal video recordings)?

      The lactation validation still seems out of place in manuscript order. It is a very valuable validation, but it feels more like supplementary data for Figure 1. I feel the authors wanted it as a main figure because of how much work it must have been. In that case, it still makes more sense to include it in Figure 1.

      Though their lactation experiment validates that they are targeting PVN OXT neurons, their optogenetic stimulation protocol may not be specifically inducing OXT release from these cells. PVN OXT neurons co-release glutamate but can also release glutamate independently of OXT following lower frequency tonic stimulation. OXT release from PVN neurons requires pulsatile stimulation at a higher frequency (Leithead et al., 2021; Piñol et al., 2014; Lincoln & Wakerley, 1975). In this paper, the authors use a low stimulation frequency (10Hz) and continuous pulse train (20s) to optogenetically manipulate the target PVN population which may bias the cells towards glutamate release over OXT. Therefore, though they find evidence that PVN OXT neurons are involved in driving the transition between states in their other experiments, their optogenetic stimulation may not necessarily involve OXT release/signalling. It may be valuable to separate this out to identify the signalling molecule underlying this behavioural/ thermogenic transition. This could be done by using an opto protocol that recapitulates physiological OXT release. The authors do however mention that isolating the specific contribution of OXT signalling compared to other co-transmitted molecules was not the aim of this study, so this is not an essential question for this manuscript.

      A loss-of-function experiment to test for necessity would be a nice addition to further confirm their claims, but the authors mention that there were technical limitations to their attempts at inhibiting PVN OXT neurons. I appreciate the authors declaring that the DREADDs attempt suffered from unfortunate confounds. But for optogenetic attempts, I don't think they need a closed-loop system to get some useful results. They still can shine the light at "random" moments (that will correspond to random body temperatures) and then separate the data per body temperature.

      Lastly, the mention of Raam et al. 2026 is insufficient. The authors just mention it regarding the potential differences with males, to be explored in future experiments. Even if not using males in the current study doesn't affect the stated conclusions, the fact that they chose females because "their thermo-behavioural states were readily discernible" is a considerable bias. Testing males in this very study might be out of scope, but more discussion is warranted.

      References

      Leithead, A. B., Tasker, J. G., & Harony-Nicolas, H. (2021). The interplay between glutamatergic circuits and oxytocin neurons in the hypothalamus and its relevance to neurodevelopmental disorders. Journal of neuroendocrinology, 33(12), e13061. https://doi.org/10.1111/jne.13061

      Lincoln, D. W., & Wakerley, J. B. (1975). Factors governing the periodic activation of supraoptic and paraventricular neurosecretory cells during suckling in the rat. The Journal of physiology, 250(2), 443-461. https://doi.org/10.1113/jphysiol.1975.sp011064

      Piñol, R. A., Jameson, H., Popratiloff, A., Lee, N. H., & Mendelowitz, D. (2014). Visualization of oxytocin release that mediates paired pulse facilitation in hypothalamic pathways to brainstem autonomic neurons. PloS one, 9(11), e112138. https://doi.org/10.1371/journal.pone.0112138

    2. Reviewer #2 (Public review):

      This is a very interesting study from Vandendoren and colleagues examining the role of PVN oxytocin neurons during thermoregulatory behaviors, in particular during thermoregulatory huddling. The findings are important and have implications for the thermoregulation field as well as the social/naturalistic behavior field. The findings are compelling and use a combination of state-of-the-art tools (photometry, optogenetics, automated behavior tracking, thermal imaging, and core body temperature measurement), often in combination with each other, to produce a rigorous and high-dimensional dataset.

      Comments on revised version.

      I appreciate the effort the authors have put into addressing all of my questions, and I have no remaining concerns.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates how the activity of hypothalamic paraventricular oxytocin (PVNOT) neurons relates to physiological states in female mice, with a particular focus on behavioral states and thermogenic sympathetic activity. To address this question, the authors combined automated video-based behavioral classification with calcium imaging of PVNOT neuron activity. Sympathetic thermogenesis was inferred from surface temperature changes measured by infrared thermography, and the authors have made their custom analysis scripts available. The authors report that strong, pulsatile activation of PVNOT neurons was "occasionally" observed immediately before transitions from resting to active states. This observation suggests that PVNOT neuronal activity may facilitate the transition from rest to activity. This phenomenon was observed in both pair-housed and individually housed animals. Taken together, these findings raise the possibility that the oxytocinergic system contributes to naturalistic behavior transitions even in the absence of social interactions. However, concerns regarding the selectivity of GCaMP expression in oxytocin-expressing neurons call into question the validity of the recorded PVNOT neuronal activity.

      Strengths:

      The oxytocinergic neural system is believed to subserve a wide range of physiological functions. Elucidating these roles requires monitoring PVNOT neuronal activity under diverse behavioral contexts, as well as manipulating this activity to establish causal relationships. In this study, the authors present a technically sound experimental framework that integrates behavioral tracking in both individually and group-housed mice with the monitoring and manipulation of PVNOT neuron activity. This setup represents a valuable methodological resource for researchers investigating the physiological functions of oxytocin.

      Weaknesses:

      (1) Immunohistochemical validation of selective GCaMP expression in oxytocin-expressing neurons showed that only 24-51% of GCaMP-positive neurons expressed oxytocin. As an alternative approach, the authors demonstrate that GCaMP-expressing PVN neurons in virgin females exhibit calcium peaks during rest-wake transitions with kinetics similar to those observed in PVNOT neurons during early lactation. However, this comparison is based solely on population-level peak profiles and does not provide direct evidence for cell-type specificity of GCaMP expression in oxytocin neurons. This limitation substantially undermines the validity of the optical calcium imaging data. In situ hybridization targeting oxytocin mRNA, rather than immunohistochemistry, may provide a more reliable assessment of expression specificity.

      (2) Although the authors' interpretation is generally consistent with the data presented, their main conclusions rely heavily on observational findings. Moreover, optogenetic stimulation of PVNOT neurons failed to robustly recapitulate behavioral state transitions (Figs. 6D and S5B). Further interventional experiments will be necessary to more rigorously test the authors' interpretation and to establish mechanistic insight into the causal relationship between PVNOT activity and rest-to-active transitions. In particular, loss-of-function approaches targeting the PVNOT system, such as OXTR antagonism, inhibitory DREADDs, or cell-type-specific ablation, will be essential to determine whether perturbation of this system alters behavioral state transitions. These points should be addressed in future studies.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the reviewers' comments adequately and revised the manuscript accordingly.]

      Summary:

      In the submitted manuscript, Steinbach et al describe the formation of a detergent-resistant "cloud" around the Legionella-containing vacuole (LCV) that functions as a protective barrier. The authors show that formation of the "cloud" barrier is contingent upon the phosphoribosyl-ubiquitination activity of the SidE/SdeABC effector family, and is temporally regulated, with the assembly and subsequent disassembly of the "cloud" coinciding with replication and vacuolar expansion. The authors postulate a model of "cloud" barrier formation that relies upon a wave of initial ubiquitination by the SidC effector family, after which the SidE/SdeABC family expands the ubiquitination and forms cross-links that render the ubiquitin cloud resistant to harsh detergents. Additionally, Steinbach et al. also demonstrate that Rab5 is recruited to the LCV and remains associated for a considerable period.

      Strengths:

      This manuscript is very well written, with clear justification provided for experiments that make it very easy to follow along with the experimental logic. The figures have clearly been designed with much thought and are easy to interpret. Steinbach et al have also done a commendable job of addressing the previous reviewers' comments, even though some may suggest that some of these comments could be viewed as slightly unreasonable. This work would be of interest to both the Legionella and ubiquitin fields. Legionella researchers would potentially be interested to explore the proposed barrier model as the function for the ubiquitin "cloud," whereas ubiquitin researchers may be interested in exploring the mechanisms underlying SidE's crosslinking ability.

    2. Reviewer #2 (Public review):

      Summary:

      The manuscript "Canonical and phosphoribosyl ubiquitination coordinate to stabilize a proteinaceous structure surrounding the Legionella-containing vacuole" by Steinbach et al. is well written and presents strong evidence that satisfactorily supports the main hypothesis and research objectives. The authors have clearly demonstrated the presence of cloud-like, detergent-resistant GTPase Rab5 surrounding the LCV, and formation of the structure is dependent on the SidE family of effectors. The study provides insights into the relevant (associated with described phenotype) ubiquitination pathways. The findings advance our understanding of Legionella pneumophila vacuole remodeling during intracellular infection and open directions for future research to establish broader implications of this structure on Legionella pathogenesis.

      Strengths:

      The manuscript convincingly demonstrates the presence of a cloud-like, detergent-resistant GTPase Rab5 surrounding the LCV through elegant microscopy. The experimental evidence about the dependence of the observed phenotype on the SidE family of effectors is compelling and presented with strong scientific rigor. The introduction is well-written, and the discussion is thorough and satisfactory. The article is thought-provoking and shows preliminary evidence for ubiquitin-mediated protection and spatial organization of the LCV.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript by Mukherjee and colleagues extended earlier studies on the coordination of the SidC and SidE effector families on the generation of a unique ubiquitin layer on the surface of the vacuoles containing the bacterial pathogen Legionella pneumophila (LCV).

      Strengths:

      The main strength of the manuscript is the identification of the small GTPase Rab5 as a major "carrier" of these differently modified ubiquitin and ubiquitin chains, which was nicely quantified.

      Weaknesses:

      The results are mostly descriptive, based on mechanistic studies from earlier works.

    1. Reviewer #2 (Public review):

      Summary:

      The authors conducted a time-course of whole-body transcriptional analysis of a pest aphid, Rhopalosiphum padi, and identified four major clusters of genes that show diurnal rhythmicity in transcription. In addition, they analysed aphid feeding behaviour and showed that aphids salivate longer from the end of the day toward the beginning of the night, while their phloem feeding time does not change throughout the day. The genes up-regulated at nighttime were enriched with genes involved in metabolic activities, corroborating the results showing a higher number of honeydew excretions at night. The authors identified a list of candidate salivary genes that show diurnal rhythmicity in transcription and silenced the salivary gene C002 and the candidate salivary gene E8696. Silencing of these genes reduced aphid fecundity and survival rate on the host plant but not on the artificial diet.

      Strengths:

      The time-course transcription study and its analysis will be of interest to researchers studying diurnal rhythms in insect biology. Also, the analysis of aphid feeding behaviour at different times of day is interesting. This study provides valuable resources for those who study insect biology.

      Weaknesses:

      Without knowledge of the functions of the salivary effectors, especially their targets, it is hard to conclude that the rhythmic expression is important for aphid performance. In addition, it is not clear whether an increase in gene expression is directly correlated with an increase in protein secretion into the saliva and the plant.

    1. Reviewer #1 (Public review):

      Huang et al. examined ACC responses during a novel discrimination-avoidance task. The authors concluded that ACC neurons primarily encode post-action variables over extended periods, reflecting the animal's preceding actions rather than the outcomes or values of those actions. The authors have made considerable revisions to address the concerns raised. However, it appears that some important issues remain unresolved.

      The extent to which ACC neurons encode post-action content remains a major concern. This may be at least partially attributable to the analysis methods. If I understand it correctly, the authors compared pre- vs post-event neural activity and looked for significant changes. By default, this looks for post-event changes, rather than pre-event changes. As a result, it would lead to the conclusion 'Our study also reveals that ACC neurons play a limited role in encoding pre-action variables associated with decision-making or planning, as evidenced by their minimal responses to auditory cues and the modest activity changes prior to shuttle initiation'.

      To determine whether ACC encodes pre-action variables or planning, different time windows should be used in the analysis.

    2. Reviewer #2 (Public review):

      Summary:

      Huang et al recorded anterior cingulate cortex activity in mice while they performed a shuttle escape task. The task utilized two auditory cues, each of which informed the mice to stay or escape depending on which side they were on, and incorrect responses were punished by shock administration. Analyses focused on ACC neurons that fired when mice crossed the shuttle box in either direction (A-->B or B-->A), coined "action state", or when mice crossed in one direction but not the other, coined "action content". The authors characterized these populations, and ACC firing changes mostly occurred around the time of shuttle crossing. This work will likely be of broad interest to those who are interested in neocortical neurophysiology broadly, anterior cingulate cortex specifically, and their contributions to learning about actions. The task is well-designed and provides a nice background for neurophysiological recordings. The authors leveraged these strengths in characterizing the neural populations that fire to shuttle crossings in both directions vs one direction.

      Strengths:

      The factorial design nicely controls for sensory coding and value coding, since the same stimulus can signal different actions and values.

      The figures are well presented, labeled, and easy to read.

      Additional analyses, such as the 2.5/7.5s windows and place-field analysis, are nice to see and indicate that the authors were careful in their neural analyses.

      The n-trial + 1 analysis where ACC activity was higher on trials that preceded correct responses is a nice addition, since it shows that ACC activity predicts future behavior, well before it happens.

      The authors identified ACC neurons that fire to shuttle crossings in one direction or to crossings in both directions. This is very clear in the spike rasters and population scaled color images. While other factors such as place fields, sensory input, and their integration can account for this activity, the authors discuss this and provide additional supplemental analyses.

    3. Reviewer #3 (Public review):

      Summary:

      The authors record from the ACC during a task in which animals must switch contexts to avoid shock as instructed by a cue. As expected, they find neurons that encode context, with some encoding of actions prior to the context, and encoding of neurons post-action. The primary novelty is dynamic encoding of action-outcome in a discrimination-avoidance domain, while this is traditionally done using operant methods.

      Comments on revised version:

      I appreciate the authors' responses to my comments and those of the other reviewers. My comments have been addressed, and at this point, I think readers can judge the work appropriately in context.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates whether the distribution of receptors and transporters of neurotransmitters accounts for the topography of cortical activity of confidence and surprise in probability learning. The authors first examined the invariance of functional correlates of confidence and surprise across multiple fMRI studies and then investigated whether 20 PET-derived receptor and transporter density maps account for this cortical invariant activity of confidence and surprise in probabilistic learning. Beyond these specific findings, the main novelty of this study lies in its attempt to bridge neuromodulatory systems and cognitive processes using neuroimaging data. This integrative approach is particularly valuable, as it showcases a framework to combine neurochemical architecture and cognitive computations.

      Strengths:

      This study attempts to link neuromodulatory systems with cognitive processes involved in probabilistic learning. Although the role of neuromodulatory systems in learning has been highlighted in several influential previous studies, it has not yet been widely investigated or systematically related to functional neuroimaging data. The authors used an efficient approach to address this question by combining group-averaged neurotransmitter maps with functional results from multiple fMRI studies using probabilistic learning tasks with similar structures. This approach provides informative insights into the relationship between the distribution of neuromodulatory systems and cognitive processes from neuroimaging data.

      Weaknesses:

      One limitation of the study stems from the unavoidable constraints of relying on pre-existing datasets rather than data specifically collected to address the present research question. Because the four fMRI studies differed in their measurements and task structures, the authors defined confidence and surprise on the basis of ideal observer behavior. Thus, "confidence" and "surprise" are not related to individual decisions or subjective value, and the PET data are also group-level. The study is therefore limited in how far it can link these measures to individual learning performance and brain activity. Also, "surprise" in this study does not seem to capture the nature of "surprise" in the learning process, which is a violation of expectation, as it was calculated with improbability. Moreover, the correlation across Studies 1-4 for surprise was not consistent and not strong enough to argue for spatial invariance. Thus, these results may not yet be fully conclusive.

    2. Reviewer #2 (Public review):

      Summary:

      Learning in dynamic, stochastic environments is difficult, and neuromodulatory systems may shape where learning signals appear in the brain. Using fMRI from four probabilistic learning studies and a Bayesian ideal observer model, the authors examined latent variables driving learning, such as confidence and surprise. They found that brain activity related to confidence, and to a lesser degree surprise, is highly spatially invariant across tasks and modalities, suggesting a stable cortical organization. This invariant pattern aligns with PET-derived maps of receptors and transporters, implicating catecholamine and opioid systems, and supporting a neuromodulatory account of adaptive learning with receptor-level hypotheses.

      Strengths:

      (1) Elegant combination of computational modelling, functional magnetic resonance imaging (fMRI) and positron emission tomography (PET).

      (2) The authors describe results of four separate experiments, with very similar results, in effect providing internal replications.

      (3) Cross-validated results compared against a meaningful null model.

      Weaknesses:

      (1) Unclear rationale for using one-sided statistics (e.g., in Figure 3). One-sided tests appear to be invalid, given that the Introduction lacks a preregistered directional hypothesis at an operationalised level. This may have consequences for the following statement in the Discussion: "The associations between receptor architecture and functional topography were substantially weaker for the language network, which is not thought to rely strongly on neuromodulatory systems."

      (2) Limited computational modelling. Since learning rates probably differ across subjects, I wonder if they have considered fitting the "volatility" instead of using the generative one. Would that give more meaningful fMRI maps, and better explained variance when correlating these to the PET-based predictors? I was also wondering how their surprise measure relates to "change-point probability" (e.g., Murphy et al., Nat Neurosci, 2021). Finally, I think it would be helpful to show average time courses of surprise and confidence time-locked to state changes.

      (3) Lack of GLM validation. It would help to show that the model fits the data well. This is important given the many underlying assumptions (shape of the HRF, linear effects of variables, etc). For example, one could show average insula activity time-locked to state changes, as well as the model-predicted activity, and separately for three strata defined by how surprising the state change was (according to the ideal observer model). Related, the authors use a substantial number of predictors in their GLM, and the language in the Methods is a little casual. It would help to show part of a design matrix, and clearly describe the following: were (occasional) questions and responses modelled by separate stick functions? Which predictors (stimulus, questions, response) varied parametrically with which variables?
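
      The concern in point (1) about one-sided statistics can be made concrete with a minimal sketch. It assumes a test statistic and a null distribution obtained by permutation; the function name `permutation_p_values` is hypothetical, and the two-sided version below additionally assumes a null centred on zero. It is not the authors' procedure.

```python
import numpy as np

def permutation_p_values(observed, null):
    """Return (one_sided_greater, two_sided) p-values from a null distribution."""
    null = np.asarray(null, dtype=float)
    n = null.size
    # The +1 terms count the observed statistic as one member of the null.
    p_greater = (1 + np.sum(null >= observed)) / (n + 1)
    # Two-sided: compare magnitudes, assuming the null is centred on zero.
    p_two_sided = (1 + np.sum(np.abs(null) >= abs(observed))) / (n + 1)
    return p_greater, p_two_sided
```

      For a roughly symmetric null, the two-sided p-value is about twice the one-sided one, so an effect that passes a one-sided threshold may not pass the two-sided threshold.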

    3. Reviewer #3 (Public review):

      Summary:

      In this unusual paper, Hodapp and Meyniel relate the spatial topography of activity maps for confidence and surprise (from four learning tasks) to the spatial topography of receptor density maps from atlas data. They find that the brain maps for confidence and surprise are largely consistent across four studies using different stimuli/ task demands. They then use a general linear model to predict the spatial pattern of confidence/surprise-related activity from the spatial distribution of receptors (receptor types) for several neuromodulators. Further analyses test which neuromodulators are most important for predicting the functional maps.

      Strengths:

      The study gives an interesting new perspective on the brain networks for surprise and confidence, indicating that one reason for the involvement of different networks with these computational parameters is the neurochemical sensitivity of tissue within those networks.

      Weaknesses:

      I felt the paper was light on context.

      To what extent are the distributions of receptor types correlated with each other?

      What does the spatial topography of receptor density look like for the identified receptors (NET, MOR, 5HT1b)? Could these be displayed alongside the functional networks? I realise these are atlas data, but for me to interpret the result, I'd need to see the map, and I don't want to download the atlas.

      To what extent are the correlations with receptor maps network-wide, vs being driven by one big patch of activity in a single region with high receptor density? To me, this would be important: does this study demonstrate that distant regions are united in a functional purpose by a shared receptor profile (which would, in my opinion, be more interesting than the alternative, that there is a single region within each network driving the effect)?

      Finally, I wasn't convinced by the spin test in this particular application. To my mind, permutation tests are valid when the permuted points are interchangeable under the null. The spin test, as used, preserves the distribution and spatial pattern of activity, but assumes that it could equally plausibly be relocated to any angle on the 3D surface (under the null). However, the brain has a lot of structure that is non-uniform across its surface (connectivity patterns and histological boundaries being important ones). The observed data probably follow this structure, but the 'spun' or permuted datasets probably overlay randomly on the connectivity structure (for example), so that one blob of activity has uniform connectivity in the real data, but overlaps the projections of multiple white matter tracts in the permuted data. But then the permuted data would likely be more heterogeneous in terms of both function and histology than the original data. Since connectivity, histology (layer structure) and receptor density are likely correlated, I think it must be impossible to find vertices that differ in one modality whilst being interchangeable in all others, therefore it may not be possible to use permutation logic to make a claim about (say) receptor density independently of connectivity and histology.

      I should add I'm not sure how one would carry out a permutation test that respects the underlying brain anatomy here, or whether this is even possible; that is a difficult question.

      I would add that I think the observation that the functional networks have different receptor profiles is interesting, even as a qualitative observation, but I am not convinced the statistical approach can be justified.
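
      To make the assumption behind the spin test explicit, a simplified sketch is given below. It is an illustration under stated assumptions, not the authors' implementation: it treats a single hemisphere, takes vertex coordinates as unit vectors on a sphere, ignores the medial wall, and uses the hypothetical function names `random_rotation` and `spin_null`. Every rotation is drawn with equal probability, which is exactly the interchangeability assumption questioned above.

```python
import numpy as np

def random_rotation(rng):
    """Draw a uniformly random 3-D rotation matrix (QR of a Gaussian matrix)."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs so the draw is uniform
    if np.linalg.det(q) < 0:         # ensure a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def spin_null(coords, x, y, n_spins=1000, seed=0):
    """Null correlations between maps x and y under random spins of x.

    coords : (n_vertices, 3) unit vectors on the sphere
    x, y   : (n_vertices,) maps, e.g. a functional map and a receptor map
    """
    rng = np.random.default_rng(seed)
    null = np.empty(n_spins)
    for k in range(n_spins):
        rotated = coords @ random_rotation(rng).T
        # Each original vertex takes the value of its nearest rotated vertex.
        nearest = np.argmax(coords @ rotated.T, axis=1)
        null[k] = np.corrcoef(x[nearest], y)[0, 1]
    return null
```

      The spun map keeps its spatial autocorrelation, but nothing in the procedure ties it to connectivity or histological boundaries, so the null is uniform over orientations by construction.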

    1. Reviewer #1 (Public review):

      Summary:

      This study uses stacked encoding models to characterize differences in sensory (visual and auditory) processing between autistic and non-autistic children and adolescents. The authors found no significant enhancement of low-level feature encoding in either visual or auditory cortex, but reduced high-level visual representations and a relative shift toward low-level over high-level visual feature encoding in the posterior superior temporal sulcus (pSTS). The shift in pSTS correlated with social symptom severity (SRS scores). These findings support weak central coherence (WCC) theory over enhanced perceptual functioning (EPF) theory, suggesting an altered visual feature encoding in pSTS in autism.

      Strengths:

      This study uses sophisticated methodology and an open data set with a relatively large sample size. fMRI data are acquired during a naturalistic paradigm (i.e., movie watching), which promotes attention and engagement among participants, and provides greater ecological validity. The use of encoding models to explore population-level differences in neural representations of stimulus-computable features is novel. Overall, results provide somewhat modest yet still informative evidence for adjudicating between possible theories of altered sensory processing in autism.

      Weaknesses:

      Some important methodological details are missing and/or require justification. Some potential confounding factors or unconsidered differences between individuals and/or diagnostic groups should be explored and possibly addressed. Specific major and minor points are raised below.

      Major comments:

      (1) Unclear description of noise ceiling calculation (line 205-206, 632-634) and potential heterogeneity: it is not clear what data were "split" for the split-half correlation used to calculate noise ceilings. To our knowledge, each participant watched each movie once, so there is no within-subject repetition available. Were these correlations across participants (i.e., ISC)? If so, does this across-subject metric provide a fair representation of the true noise ceiling, given that a) encoding models themselves are trained within subjects and b) autistic individuals are known to exhibit more idiosyncrasy in responses to naturalistic stimuli (e.g., Hasson et al., 2008)? Moreover, do noise ceilings differ between individual participants, diagnostic groups, and/or with age? If so, how might these differences affect the interpretation of results (e.g., R2 differences)?

      (2) Possibly underperforming visual model: given that the visual model in general performed worse than the audio model, the visual vs audio perceptual preference analyses (line 281-290) might be affected by the underlying mismatch between model performance. Though the visual and auditory regions showed similar noise ceilings (Figure 2 S1B), the stacked model performed better in auditory regions than in visual or multimodal regions (Figure 2 S1A). Supporting the same idea, the visual model in general showed lower fitting R2 than the audio model (Figure 2 S2A, Figure 2 S3A vs B). Instead of using mean motion (line 608-614), applying PCA on the raw features might help reduce noise inherent in the raw motion energy features (Malik et al., 2026), therefore improving model performance.

      (3) The clipping procedure for unique variance (lines 634-637) requires justification: the unique variance is defined by subtracting high-level R² from stacked R² with explicit clipping when high-level R² is negative or exceeds stacked R². However, in the original stacked regression framework (Lin et al., 2024), unique variance is defined by simple subtraction without such post-hoc adjustment, as the negative R2 is still meaningful, indicating the model performs worse than predicting using the mean value. This requires justification. How frequently does clipping occur, and in which brain regions? Is it an indicator of overfitting or poor model performance? How substantially do results change if clipping is removed? E.g., the hemisphere dominance comparison (line 271-280, Figure 6). Critically, does this procedure affect the key finding regarding SRS/sensory symptom severity correlations in pSTS?

      (4) The interpretation of the correlation between SRS and neural patterns is misleading (line 237-242, line 364-366): based on Figure 3, SRS and SSS showed a more significant and robust relationship with the unique variance of high-level visual features, meaning that the decrement of high-level feature encoding in STSvp and STSdp, rather than the relative low-level preference, is likely driving the relationship with autism severity and sensory symptoms.

      (5) Details are missing about how data from the two movie runs were combined. Were the time series concatenated without regard to which movie they originally came from, or was the distinction between movies taken into account for purposes of splitting data into train/test cross-validation folds? The results would be stronger if the authors could show that results replicate across the two movies when they are each analyzed independently, though we recognize that there is perhaps not enough data, especially in the shorter [~4min] movie, to do this. The authors discussed this in lines 412-417, but it would be helpful to provide a justification in the Methods section as well.

      (6) Potential feature weight differences across individuals and/or diagnostic categories: since the encoding models were trained for each subject, is there significant variability in feature weights across individuals and/or diagnostic categories (e.g., did the model predictions heavily rely on face for the non-ASD group but not for the ASD group)? If so, how does this change the interpretation of the R2 comparisons? The authors showed the results of stacked feature weight differences between diagnostic categories and their relationship with autism severity and sensory symptoms, but it might be informative to show the raw feature weightings before diving into stacked-weight differences.
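
      The questions in comment (3) about how often clipping occurs, and how much it changes the results, could be checked with a sketch along the following lines. The clipping rule coded here is one reading of the description in that comment (high-level R² bounded to the range from 0 to the stacked R²) and is an assumption; the function name `unique_variance` is hypothetical.

```python
import numpy as np

def unique_variance(r2_stacked, r2_high, clip=True):
    """Unique variance as stacked R2 minus high-level R2.

    Returns (unique, clipped_mask). With clip=False this is the plain
    subtraction and the mask is all False.
    """
    r2_stacked = np.asarray(r2_stacked, dtype=float)
    r2_high = np.asarray(r2_high, dtype=float)
    if not clip:
        return r2_stacked - r2_high, np.zeros(r2_stacked.shape, dtype=bool)
    # Assumed rule: clip when high-level R2 is negative or exceeds stacked R2.
    clipped = (r2_high < 0) | (r2_high > r2_stacked)
    r2_high_c = np.clip(r2_high, 0, np.maximum(r2_stacked, 0))
    return r2_stacked - r2_high_c, clipped
```

      Reporting `clipped.mean()` per parcel, and re-running the group comparisons and symptom correlations with `clip=False`, would show where clipping is frequent and whether the pSTS findings depend on it.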

    2. Reviewer #2 (Public review):

      Summary:

      This study by Mentch et al. uses naturalistic-movie fMRI and grayordinate-level stacked encoding models to test preregistered hypotheses about low/high-level and audio/visual feature encoding in autism and adolescence from openly available Healthy Brain Network data. Null results reported that autism was not linked to increased low-level encoding in primary sensory cortices. Exploratory analyses showed participants with autism showed reduced high-level visual encoding in social regions (pSTS, face areas), with the high-low feature shift tracking social responsiveness scale (SRS) scores. Age and laterality effects were also found.

      Strengths:

      (1) This study and hypotheses were preregistered.

      (2) The study utilised proper variance partitioning, split-half noise ceilings, FD-threshold sensitivity analyses, and an explicit modelling framework that recovers known sensory hierarchies in the aggregated sample. The developmental sampling adds to the interest.

      (3) The manuscript is written clearly, laying out the background and theories to be tested with encoding models. The analyses and reporting of results are clear.

      Weaknesses:

      (1) If I understand correctly, by only averaging the grayordinates that already passed a significance threshold, the resulting parcel value is guaranteed to look stronger than if all grayordinates had been included. This has been raised in neuroimaging (Kriegeskorte et al., 2009; Vul et al., 2009). Can the authors justify these choices?

      (2) I assume that the phrase "temporally permuting the order of observations" on Page 22 means random shuffling of time points. The details of this exact permutation are not specified. Both the fMRI BOLD signal and movie features have strong temporal autocorrelation, and random shuffling will destroy this structure. This is important as grayordinate-level survivors will propagate to parcel pools. Circular shifting or phase randomization preserving the autocorrelation spectrum is appropriate.

      (3) In the movie feature selection, the low-level visual model contains only two scalars: mean perceptual brightness and a single averaged value across 2,139 motion-energy filters. With only two low-level visual features, the low-level visual model could underestimate low-level visual encoding. The null result for H1.1 perhaps points to this. Principal components of the motion-energy outputs, as was done for the cochleagram, could be used.

      (4) The pilot sample composition is not described. Features were selected based on their performance on an independent set of 54 pilot subjects. Please provide age, sex, and diagnostic composition of the pilot sample. The main point is whether the selected features were optimised for a population that differs from the subjects studied.

      (5) The authors acknowledge the lack of eye-tracking in their study. I think this should be elaborated, especially why this modality is important for answering questions about sensory and perceptual encoding. Face encoding may not be degraded; faces may simply not be attended to.

      (6) I think a more nuanced distinction about the representational nature of encoding-model R² should be mentioned, especially when the interpretation of findings is related to perceptual functioning (EPF theory). R² measures how well a feature set predicts brain activity, not perceptual function or cognitive integration.

      (7) The literature also includes evidence for no Colavita effect, not just reverse Colavita in autism, and the framing should reflect this more even-handedly.

      (8) The 0.2 mm per-volume threshold is quite strict. The 40%/60%/80% sensitivity analyses partially address this, but a brief justification for the choice of 0.2 mm would strengthen the Methods.

      (9) Figure 1 seems confusing and would benefit from more information or text in the figure.

      (10) Figure 2 supplement has caption A labelled twice; please correct.

      (11) Acronyms. Please spell out MSI on first mention (page 2) and ISC/ISFC on first mention (page 4).
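
      The circular-shifting alternative mentioned in point (2) can be sketched as follows. This is a minimal illustration, not the authors' permutation scheme: the function name `circular_shift_null`, the `min_shift` parameter, and the use of a simple correlation as the test statistic are all assumptions. Unlike random shuffling of time points, a circular shift keeps the autocorrelation of the shifted series intact while breaking its alignment with the other series.

```python
import numpy as np

def circular_shift_null(pred, bold, n_perm=1000, min_shift=10, seed=0):
    """Null distribution of corr(pred, bold) under circular shifts of pred.

    pred, bold : 1-D arrays of equal length (model prediction and BOLD signal)
    min_shift  : smallest shift (in samples) allowed in either direction;
                 the series must be longer than 2 * min_shift
    """
    pred = np.asarray(pred, dtype=float)
    bold = np.asarray(bold, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(pred)
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Draw a shift in [min_shift, n - min_shift] to avoid near-zero lags.
        shift = rng.integers(min_shift, n - min_shift + 1)
        null[k] = np.corrcoef(np.roll(pred, shift), bold)[0, 1]
    return null
```

      The observed correlation can then be compared against this null, whose width reflects the temporal smoothness of both series.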

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the neural mechanisms underlying sensory-perceptual differences in autism through a naturalistic movie-viewing fMRI paradigm. By employing encoding models, the authors demonstrate that autistic children and adolescents exhibit a specific alteration in visual feature weighting, characterized by a shift toward low-level visual feature encoding in higher-order association regions, particularly the posterior Superior Temporal Sulcus (pSTS). This shift is linked to social symptom severity, providing empirical support for Weak Central Coherence accounts.

      Strengths:

      The study's primary strengths lie in its methodological rigor and innovative approach. The use of a pre-registered analysis plan ensures transparency and enhances the credibility of the findings, while the encoding models allow for a fine-grained dissociation of low-level versus high-level feature representations across the cortex. Overall, the writing is clear, the logic is sound, and the results offer a significant contribution to the field by refining our understanding of how sensory processing is differentially organized in autism.

      Weaknesses:

      While the study presents compelling findings regarding visual feature encoding in autism, several methodological and interpretive limitations warrant consideration. First, the Discussion focuses primarily on WCC and EPF theories, failing to explicitly address how the results intersect with other prominent frameworks mentioned in the Introduction, such as Bayesian predictive coding or E/I imbalance hypotheses. Second, the demographic characteristics and specific sample sizes of the ASD-ADHD and ASD+ADHD subgroups are not reported, limiting the interpretability of the stratified analyses; furthermore, the counterintuitive finding that the ASD+ADHD group resembles controls is not sufficiently discussed. Third, given the significant group difference in IQ and the known relationship between cognitive ability and neural processing, the potential confounding influence of IQ on the neuroimaging results requires more explicit acknowledgment, particularly since IQ was not included as a covariate in the primary models.

    1. Reviewer #1 (Public review):

      Summary:

      Evidence for visual representation of animacy.

      Strengths:

      This is a very cool paper that casts light on a persistent problem in the psychology and philosophy of visual representation: is there high-level perception? Every vision scientist agrees that low-level features such as shape, color, texture, motion and spatial frequency are represented in visual perception, but there is a great deal of controversy about the representation of high-level properties such as causation, faces, agency and animacy. Animacy is especially problematic because there are large differences in line curvature between stimuli that represent animate and inanimate items.

      This article uses a novel approach: visual "anagrams" that are exactly the same image, except one is rotated 90 degrees relative to the other. The authors found persistent differences in visual processing between animate and inanimate stimuli. (Of course, the stimuli aren't animate; they represent animate items.) For example, there were processing differences for changes between animate and inanimate items (rabbit to boot) that were not present for rabbit to dog. They also showed such differences in two kinds of visual search tasks.

      Of course, there are feature differences that exploit orientation. A classic example is the difference between a square and a diamond that is produced from the square by rotating it 45 degrees.

      They addressed an aspect of this challenge having to do with some features using silhouettes. There was no search advantage for silhouetted stimuli.

      Weaknesses:

      I thought this was an excellent submission. I have two suggestions for revision:

      (1) I thought that experiment 7 should have been described in more detail, with the upshot explained better. What exactly do the authors take it to show?

      (2) There should be a candid discussion of what the loose ends are and how they might be addressed. It would be good to have some examples like the square/diamond case with some indication of what would address such challenges.

    2. Reviewer #2 (Public review):

      Summary:

      The authors present a creative approach using visual anagrams matched on low-level image statistics to isolate animacy from low-level visual features and report consistent effects of animacy on visual working memory and attention. While this is a thoughtful design and is well executed across seven pre-registered experiments, it remains unclear whether the reported effect is truly driven by animacy, as opposed to broader differences in ensemble statistics or semantic structure across the "mixed animacy" versus "uniform animacy" conditions. As such, the interpretation of a "pure" animacy effect may be overstated.

      Strengths:

      (1) An important methodological advance in controlling low-level confounds that have historically complicated the study of animacy.

      (2) The converging effects across multiple experiments, together with the pre-registered design, strengthen the reliability of the reported findings.

      Weaknesses:

      (1) Specificity of the animacy effect vs. category-level ensemble structure

      The central claim is that animacy itself drives the observed effects. However, the key manipulation ("mixed animacy" versus "uniform animacy") also introduces differences in category-level ensemble structure. For example, in Experiments 1-2, cross-category change detection (e.g., dog to chair) may be easier not because of animacy per se, but because of a change in overall ensemble statistics (Brady & Alvarez, 2011, 2015). In addition, since each display contains five objects (two in one category and three in the other category), cross-category changes may also alter category balance in a way that further facilitates detection. In contrast, within-category changes preserve both ensemble structure and category composition, making them more difficult to detect.

      Brady, T. F., & Alvarez, G. A. (2011). Hierarchical encoding in visual working memory: Ensemble statistics bias memory for individual items. Psychological Science.

      Brady, T. F., & Alvarez, G. A. (2015). Contextual effects in visual working memory reveal hierarchically structured memory representations. Journal of Vision.

      (2) Limited stimulus set and potential learning effects

      The relatively small stimulus set (six anagram pairs) and repeated exposure raise the possibility of learning or familiarity effects. Does performance change over time? e.g., are there meaningful differences between early and late trials (e.g., first 10% vs. last 10%)? If such differences are present, they could suggest the development of task-specific strategies or increased efficiency with repeated exposure, rather than stable effects driven by the experimental manipulation itself.

      (3) Role of semantics

      Although the anagram paradigm effectively controls low-level visual features, it still relies on high-level semantics (e.g., "dog" vs. "boot"). These stimuli differ not only in animacy but also along other semantic dimensions such as natural versus manmade categories. From a semantic standpoint, it remains unclear whether the observed effects can be uniquely attributed to animacy or whether they reflect broader conceptual distinctions.

    3. Reviewer #3 (Public review):

      Summary:

      This study makes clever use of generative AI to create stimuli that are pixel-for-pixel identical but which have radically different meanings depending on their orientation, to investigate the perception of animacy while retaining control over low-level image features (so-called 'anagram' stimuli).

      The authors present seven elegantly designed experiments in a commendably compact format.

      Experiments 1 and 2 involved a working memory paradigm in which participants had to spot which of five objects in an array changed after a pause. Importantly, the changed object was an anagram stimulus that in one orientation matched the animacy/inanimacy of the changed object, and in the other orientation was the opposite (e.g., a rabbit is replaced by either a dog or a boot, where the dog and boot stimuli are actually identical, just rotated by 90 degrees). They found a difference in accuracy depending on whether the animacy of the objects matched.

      Experiments 3 and 4 used a visual search task in which the participants had to localize the target, and the distractors were anagrams that either matched the target in terms of animacy or did not. There was a significant cost in terms of response time when the animacy of the target was the same as that of the distractors. Experiments 5 and 6 also used a similar visual search design, except that the task was to determine if the target was present or absent from the display, and the distractors again either matched or differed from the target in terms of animacy. Again, the authors found slower responses when the distractor arrays matched the animacy of the target than when they differed.

      An obvious potential concern about the studies is addressed by Experiment 7. It is unclear if the observed effects are related to the specific orientations of the target and distractor stimuli selected in each condition. For example, it could be that all the animate versions of the anagrams involved tall and skinny shapes, while all the inanimate versions involved wide and short objects, due to the 90-degree rotational difference between the two versions of the stimuli. To control for this, the authors repeated the visual search experiment but with convex-hull silhouettes of each of the stimuli. In other words, all targets and distractors from each trial were replaced by a black splotch with approximately the same overall outline (envelope) as the corresponding stimulus. Importantly, in contrast to the anagram stimuli, the silhouettes had no meaningful semantic interpretation, and their animacy did not change depending on their orientation.

      Strengths:

      The main strength is the elegant use of stimuli that control almost perfectly for low-level image features.

      Weaknesses:

      My only real concern about the study is whether the findings truly provide evidence for a high-level visual representation of animacy independent of the low-level stimulus characteristics, or whether, instead, the effects are essentially semantic priming, which is independent of visual processing per se. For example, if all the stimuli in the experiments were replaced with the verbal names of the depicted objects instead of pictures, would we expect different results? Words can also access semantic representations of the animacy of objects, and also don't suffer from low-level visual confounds. It would be helpful to add a discussion of this possibility to the article.

    4. Reviewer #4 (Public review):

      In this article, the authors investigate whether perceived animacy influences visual processing independently of lower-level visual features by using "visual anagrams." Across seven experiments, they test whether animacy, isolated from many lower-level visual properties, structures visual working memory and guides visual attention. The central claim is that the visual system may represent animacy itself, rather than animacy emerging solely from associations among low-level visual properties.

      I find this investigation compelling. The experiments described provide strong control over several lower-level visual features, including curvature, texture, and related image properties. However, the visual anagrams are not pixelwise-identical across orientations. Because the images are rotated, the retinal configuration of pixels and the spatial organization of some low- to mid-level shape features also change. As a result, the configural arrangement of mid-level visual features may still contribute to perceived animacy.

      I encourage the authors to discuss how independent perceived animacy is in this context from the contribution of mid-level visual features, such as configural shape cues that are diagnostic of animacy. This distinction would help sharpen the interpretation of the results and more precisely define the level of visual representation isolated by the visual-anagram approach.

      Additionally, previous studies have argued that low- and mid-level curvilinear features may contribute to animate/inanimate categorization, and may in some cases be sufficient to support such distinctions (e.g., PMID: 33798259; PMID: 28654965). I encourage the authors to clarify how these previous findings on curvilinearity and rectilinearity fit with the overarching claim of the current study, namely that the visual system may represent animacy itself rather than animacy emerging solely from associations among lower-level visual properties.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, Deepak V. Raya and colleagues combined behavioral measures with EEG recordings to investigate how distractors presented during the working memory delay influence memory representations. Using oriented gratings as stimuli and a continuous estimation task, the authors systematically manipulated factors that may modulate distractor interference, including the behavioral relevance of the WM item (cued vs. uncued) and the spatial relationship between the distractor and the WM item. By analyzing the relative orientation between the WM item and the distractor, the authors showed that distractors presented at the same location as the WM item induced an attractive bias (i.e., reported orientations biased toward that of the distractor), whereas distractors presented at the opposite location produced a weaker effect, with any systematic bias tending to be repulsive. Through a combination of behavioral analyses and EEG-based decoding, the authors further examined and revealed factors that modulate the magnitude of distractor interference, including cueing status, the strength of memory maintenance, distractor timing, and neural indices of distractor encoding and gating. Lastly, the authors propose a computational account of these effects by implementing a two-layer ring attractor model that captures several key behavioral patterns observed in the data.

      Strengths:

      The influence of distractors on working memory has been extensively studied both behaviorally and with neuroimaging. The present study advances this literature by providing a more comprehensive account that jointly manipulates and quantifies many key factors, including cueing (behavioral relevance), the spatial relationship between WM items and distractors, and distractor timing. This integrative approach enables a more systematic characterization of how different sources of interference interact. A particular strength of the study is the use of EEG combined with multivariate decoding to track the dynamics of memory and distractor representations. Compared to prior fMRI work, this approach provides a time-resolved view of how encoding, maintenance, and distractor processing unfold over time. This is especially valuable for dissociating how memory maintenance, stimulus encoding, and gating each contribute to behavioral interference, which is more difficult to achieve with fMRI.

      Behaviorally, while most previous studies have reported attractive biases by distractors, the current study identified a repulsive effect when distractors were in the opposite hemifield from the WM item. Overall, the study provides a rich investigation of distractor interference in working memory and will be of interest to researchers studying the neural and computational mechanisms that protect memory representations from distraction.

      Weaknesses:

      (1) In the paragraph starting around line 125, the authors reported a 2-way ANOVA (cue/uncued × same/opposite side) restricted to trials in which a distractor was present. However, the subsequent post-hoc analyses compared distractor-present trials (same or opposite side) with no-distractor trials, which were not included in the ANOVA. While both analyses were informative, presenting them together in this way was somewhat confusing, as the post-hoc tests extended beyond the factors and conditions analyzed by the ANOVA. I suggest presenting these analyses separately and clarifying their distinct purposes. Additionally, Figure 1C appeared to reflect only the pairwise comparisons; including a figure that directly visualizes the two-way ANOVA results would improve clarity.

      (2) In lines 138-150, the authors fitted von Mises functions to the distributions of memory error and reported that the effect of distractor location (same vs. opposite) was stronger in the uncued condition than in the cued condition. However, this result appears difficult to reconcile with the earlier 2-way ANOVA, which showed no interaction between cueing and distractor location. It is unclear whether this discrepancy arose from differences in the dependent measures (CSD vs. κ), statistical procedures, or other factors. Clarifying how these two sets of results should be interpreted together would improve the clarity of the findings.

      (3) For the analyses in Figures 1B and 1D, parametric functions were fitted to the distributions of memory error using aggregated data. Models of memory error distributions have been central to ongoing debates in the working memory literature (e.g., Schurgin, Wixted, & Brady, 2020; van den Berg, Awh, & Ma, 2014). Fitting functions/curves to aggregated data can be problematic, as it distorts the underlying distributions at the individual level. I suggest performing the fits on the individual data and analyzing the fitted parameters across participants using appropriate group-level statistical tests.

      (4) At the end of the first Results section (lines 234-235), the authors concluded that cued memoranda were "better shielded from interference" than uncued memoranda. However, I did not see a clear statistical test directly supporting this. This statement appeared to rely mainly on Figure 1D, which showed a stronger location effect (same vs. opposite) when the memory item was uncued. However, this analysis does not directly test whether distractors impair uncued items more than cued items overall. Supporting this broader claim would require a direct comparison of distractor effects (e.g., distractor vs. no-distractor) between cued and uncued conditions, or an interaction test involving cueing and distractor presence (e.g., either by pooling different distractor locations, or focusing on the same-location condition if opposite-location distractors show no significant effect).

      (5) While the attractive and repulsive biases are an interesting finding, they were demonstrated only at the behavioral level. It would be informative to examine whether the biases are reflected in the decoding results. For example, after deriving trial-wise orientation tuning functions, one could estimate decoded orientations (e.g., via vector averaging or the peak of the tuning curve) and assess bias at the neural level. Although EEG SNR may limit recovery of the full memory-error function (e.g., Figure 1F-G), grouping trials into fewer bins (even just two) may still allow detection of the overall direction of the bias in the decoding results. This type of decoding bias has been reported in other contexts (GY Bae - NeuroImage, 2021).

      (6) The P2/P3a analysis requires more explanation. Typically, these components are extracted from trial-averaged ERPs. The Methods section also mentions that data were "averaged across channels and trials to obtain the ERP waveform." However, to split the trials, these components have to be identified at the single-trial level. More details are needed in the Methods.

      (7) Components such as P3a are often linked to attentional capture and orienting, which would predict increased, rather than decreased, distractor interference. The interpretation of this signal as reflecting gating appears to be inferred from the observed relationship between larger P3a amplitudes and weaker interference. The N2pc component is a well-established index of spatial attention allocation and may be particularly relevant (and useful) here, given the lateralized distractor. Have the authors tested whether distractor-evoked N2pc can be used to split trials and examine its relationship with the bias?

      (8) Line 676 in the Discussion states "possibly by error-correcting top-down control mechanisms." It is unclear which results provide support for this interpretation, except that there are stronger feedback connections at the cued location in the attractor ring model.
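
      The neural-level bias check suggested in point (5) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes each trial yields a reconstructed tuning function sampled at a set of orientation channels, and the function names (`decode_orientation`, `signed_error`, `mean_bias_toward_distractor`) are hypothetical. Because orientation is periodic over 180 degrees, angles are doubled before vector averaging.

```python
import cmath
import math


def decode_orientation(channel_centers_deg, responses):
    """Vector-average a tuning function; returns degrees in [0, 180)."""
    # Double the angles so that 0 and 180 degrees map to the same direction.
    vec = sum(
        r * cmath.exp(1j * math.radians(2.0 * c))
        for c, r in zip(channel_centers_deg, responses)
    )
    if abs(vec) < 1e-12:
        raise ValueError("flat tuning function: orientation is undefined")
    return (math.degrees(cmath.phase(vec)) / 2.0) % 180.0


def signed_error(decoded_deg, target_deg):
    """Decoded minus target orientation, wrapped to [-90, 90)."""
    return (decoded_deg - target_deg + 90.0) % 180.0 - 90.0


def mean_bias_toward_distractor(errors_deg, distractor_offsets_deg):
    """Mean error re-signed so positive values mean attraction to the distractor.

    Trials with a zero distractor offset carry no direction and are skipped.
    """
    signed = [
        e * math.copysign(1.0, d)
        for e, d in zip(errors_deg, distractor_offsets_deg)
        if d != 0
    ]
    return sum(signed) / len(signed)
```

      Collapsing trials by the sign of the distractor offset amounts to the two-bin grouping mentioned in point (5); a positive mean would indicate attraction and a negative mean repulsion in the decoded signal.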

    2. Reviewer #2 (Public review):

      Summary:

      Understanding the factors and mechanisms underlying the deleterious effects of distraction, and protection from distraction, in working memory is an important question that has a long and rich history in psychology and neuroscience, and continues to be highly relevant. In this study, the authors recorded the EEG while subjects viewed the initial presentation of two oriented-grating stimuli, aligned on either side of fixation along the horizontal meridian (memory array), followed by a 70%-valid cue, then one of three distractor conditions (overlapping cued item (40%), opposite cued item (40%), no distraction (20%)), followed by recall ("delayed estimation"). The behavioral and EEG results from this procedure are complemented with computational modeling with a two-tier bump-attractor model.

      Weaknesses:

      Interpretation of the results is complicated by several factors. One is the non-consideration of a considerable amount of extant research that is highly relevant to the question of interest (these include seminal studies from Gi-Yeul Bae and from Tatiana Pasternak). Relatedly, the manuscript emphasizes biasing effects of distractors to the exclusion of a conceptually distinct effect: degradation of representational precision. (For example, the actual focus of the study of Wimmer et al. (2014) that the manuscript cites with reference to bias is the degradation of precision; one only has to read the title of this paper to know this.) Also relatedly, the authors are aware of the possibility of misbinding (a.k.a. "swap") errors, in which subjects mistakenly recall a high-fidelity representation of a foil (in this case, the distractor) rather than the target, but they (1) fail to cite any of the extensive literature on this topic and (2) seem to erroneously attribute what their analyses would seem to identify as misbinding errors as "antagonistic bias" exerted by the distractor on the target item.

      A second concern relates to the interpretation of patterns in the empirical results. In particular, Figure 1G is interpreted as displaying a pattern of repulsive bias exerted by the distractor on trials when the distractor appeared at the location opposite to the cued item. However, it is not clear that this is a repulsive bias. Rather, what the plot shows is that report error is attracted to "near" distractors with a positively signed offset but repelled by "near" distractors with a negatively signed offset. Stated another way, when one applies a model-free assessment of the influence of the distractor on the memorandum, there is no systematic bias: the AOC of positively signed offset values from 0 to +45 deg is roughly the same as the AOC of negatively signed offset values from 0 to -45 deg. The same also seems to be true, albeit with a smaller magnitude, for trials featuring "stronger mnemonic neural representation" that are illustrated in Figure 2. And so it's unclear that the effect of the "Dist. Opp" distractor is indeed a repulsive bias, rather than a loss of precision.

      The third primary concern is that the results from simulations from the two-tier bump-attractor modeling are difficult to interpret due to several poorly motivated and seemingly "hand-coded" assumptions. These include the (seemingly arbitrary) strengthening of HC→VC feedback connections by 20% for cued vs. uncued items; and the choice to "transiently block[ed] feedforward connections from the VC to the HC during the maintenance epoch" as a consequence of cuing. There is frankly no evidence that the latter phenomenon actually happens in primate brains performing comparable tasks, including in papers (such as from Xu and from Rademaker) that are cited in this manuscript. The current consensus is that priority-related rotations of representational geometry are the scheme employed by mammalian nervous systems to control the otherwise deleterious effects of distraction.

    1. Joint Public Review:

      Cardiolipin is a key lipid constituent of mitochondrial membranes. Perturbation of its abundance is thus poised to affect broad aspects of mitochondrial function. Given the important role of mitochondria, it is not surprising that cardiolipin deficiency would have pervasive effects on cell physiology.

      The original version of this paper advanced the idea that cardiolipin deficiency, and the attendant mitochondrial dysfunction, plays a causative role in the progression of fatty liver (a common feature in the human population) to a more pathogenic inflammatory state known as steatohepatitis. Given the prevalence of this form of liver disease in the human population this claim for discovery was deemed sufficiently interesting to merit peer review at eLife.

      Peer review reaffirmed the importance of the claim but also revealed important limitations in the experimental support provided. Specifically, the study lacked experimental interventions that uncouple the correlation between progression in a mouse model and changes in cardiolipin abundance, which would be needed to test the causal relationship. The review process also recognised the utility of other aspects of the paper, namely the evidence implicating cardiolipin deficiency in altered properties of the mitochondrial membrane, its contribution to an electron leak and the potential for these features to contribute to pathology.

      The revised version of the manuscript now focuses on the importance of cardiolipin sufficiency to mitochondrial integrity and contains various improvements to the data supporting this aspect. At the same time the revised paper retreats from the most interesting claim of a causal role for cardiolipin deficiency in disease progression. We are left with a more convincing but less significant paper.

    1. Reviewer #3 (Public review):

      The authors find that DNA methylation-based clocks are generally less accurate at predicting age in cohorts with large proportions of non-European (especially African) ancestry, compared to cohorts with high European ancestry proportions (which more closely reflects the genetic composition of individuals included in training sets). They provide evidence for this ancestry bias via ancestry-stratified analyses, and in analyses of continuous ancestry proportion effects on clock error. They then test two hypothesized underlying causes of ancestry bias: that ancestry-differentiated SNPs disrupt CpG sites preventing methylation, and that ancestry-differentiated SNPs influence DNA methylation levels. They find clear evidence especially for the second cause, in the form of meQTL that influence clock CpG sites and vary in frequency across ancestry groups. Finally, the authors provide key discussions of potential paths forward to alleviate bias and improve portability for future clock algorithms.

      The topic is timely due to the increasing popularity of DNA methylation-based clocks and the acknowledgment that many algorithms (e.g., polygenic risk scores) lack portability when applied to cohorts that substantially differ in ancestry or other characteristics from the training set. This has been discussed to some degree for DNA methylation-based clocks, but could of course use more discussion and empirical attention, which the authors nicely provide using an impressive and diverse collection of data. The inclusion of data from multiple cohorts, the analysis of ancestry as a continuous variable, and the attempts to address the underlying causes of ancestry-based differences in accuracy provide comprehensive evidence that genetic background influences clock portability.

    1. Reviewer #1 (Public review):

      Ono et al. compared the activity of prime editor nickase PE2 and prime editor nuclease PEn in introducing SNPs and short exogenous DNA sequences into the zebrafish genome to model human disease variants. They find the nickase PE2 prime editor had a higher rate of precise integration for introducing single nucleotide substitutions, whereas the nuclease PEn prime editor showed improved precision of integration of short DNA sequences. In somatic tissue the percentage of SNP variant precision edits improved when using PE2 RNP injection instead of mRNA injection, but increased precision editing correlated with elevated indel formation. While PEn overall had higher rates of precision edits, the indel rate was also elevated. Similar rates were observed when introducing a 3 bp stop codon into the ror gene using a standard pegRNA with a 13-nucleotide homology arm, or a springRNA driving integration by NHEJ. Inclusion of an abasic sequence in the springRNA prevented imprecise edits caused by scaffold incorporation, but did not improve the overall percentage of precise edits in somatic tissue. Both PE2 and PEn showed higher frequency of 3 bp precision integration, compared to CRISPR HDR mediated knock-in using a single strand donor DNA template with short homology. Recovery of a germline ror-TGA integration allele using PEn with RNP was robust, resulting in 5 out of 10 founders transmitting a precise allele. The authors demonstrate PEn was effective at integration of a 30 bp nuclear localization signal into the 5' end of GFP in an existing muscle-specific reporter line. PEn-mediated integration of long sequences was further demonstrated by integration into the wls gene of a 46 bp attP sequence for phiC31 integrase recombination. Additional analyses are needed to determine if the approach can be used to isolate stable germline alleles of variants that are potentially dominant negative or gain of function in nature.

      The conclusions of the paper are well supported, demonstrating PE2 increases precision, while PEn increases efficiency, for integrating short DNA sequences. Introducing longer sequences up to 46 bp with PEn highlights the potential broad utility of this approach for insertion of functional motifs for protein modification and gene expression.

      (1) In Figure 3 the data indicate a significant increase in precise edits of the 3 bp TGA using PE2 RNP (11.5%) vs. PE2 mRNA (1.3%). At the adgrf3b locus, PE2 RNP, PE2 mRNA, PEn RNP and PEn mRNA were all tested for introducing the 3 bp TGA and a longer 12 bp insertion. PEn RNP showed the highest rate of precision for integration of the longer 12 bp sequence. A comparison of somatic precision editing at additional loci, and analysis of germline transmission rates using PE2 vs. PEn, would support the conclusion that PEn is preferred for precise integration of longer templates, and recovery of germline integration alleles.

      (2) Figure 4 shows the results of introducing a TGA stop codon that is predicted to result in nonsense mediated decay. Testing the ability to also isolate different substitution mutations in the germline would be useful information for identifying the most effective approach for generating human disease variant models.

    2. Reviewer #2 (Public review):

      The manuscript by Ono et al compares two prime editing strategies in zebrafish, one based on a nickase and the other on a nuclease, and evaluates their performance for introducing substitutions, short insertions, and transmission to the next generation. The study aims to clarify the relative strengths of these approaches and to extend their use for inserting short DNA sequences in vivo.

      The study provides a useful and well-executed comparison of two editing strategies in a vertebrate model. In particular, the finding that the nuclease-based approach shows higher efficiency for short insertions is of practical interest for functional studies. The authors also present convincing evidence supporting their conclusions, including sequencing and phenotypic validation at selected loci. These results support the reliability of the approach in this system.

      The overall conceptual advance remains somewhat limited, as the general strategy of delivering prime editing components in zebrafish has been described previously. The present study extends this work by comparing two editing modes and exploring insertion efficiency, which represents a useful but incremental advance.

      Regarding the comparison between the two systems, the authors have made efforts to address concerns about generalizability by adding data from additional loci and by refining the scope of their conclusions. These additions strengthen the manuscript. However, the comparison is still based on a relatively small number of loci, and the conclusions may therefore remain somewhat context-dependent.

      Overall, the authors largely achieve their stated aims of comparing two editing strategies and demonstrating their applicability in zebrafish. The data generally support the conclusions, particularly within the tested loci. The work provides practical value to the community, especially for researchers seeking efficient strategies for short sequence insertion in this model system, although its broader impact is somewhat limited by its incremental nature.

    3. Reviewer #3 (Public review):

      The manuscript by Ono et al describes application of prime editors to introduce precise genetic changes in the zebrafish model system. Probably the most important observation is that compared to the "standard" PE2, prime editor with full nuclease activity appears to be more efficient at introducing insertions into the genome. Although many laboratories around the world have successfully used oligonucleotide-mediated HDR to insert short exogenous sequences such as epitope tags or loxP sites into the zebrafish genome, the method suffers from high frequency of indels at the edit site. Thus, additional tools are badly needed, making this manuscript very important.

      Comments on revised version.

      Thank you for thoroughly addressing my minor concerns.

    1. Reviewer #1 (Public review):

      Summary:

      This paper carefully compares intramural vs. extramural National Institutes of Health funded research during 2009-2019, according to a variety of bibliometric indices. They find that extramural awards more cost-effectively fund outputs commonly used for academic review such as number of publications and citations per dollar, while intramural awards are more cost-effective at generating work that influences future clinical work, more closely in line with agency health goals.

      Strengths:

      Great care was taken in selecting and cleaning the data, and in making sure that intramural vs. extramural projects were compared appropriately. The data has statistical validation. The trends are clear and convincing.

    2. Reviewer #2 (Public review):

      This article reports a cost-effectiveness comparison of intramural and extramural projects that NIH funded between 2009 and 2019. Using data obtained from NIH RePORTER, the authors linked total project costs to publication output, using robust validated metrics including Relative Citation Ratio (RCR), Approximate Potential to Translate (APT), and clinical citations. They find that after adjusting for confounders in regression and propensity-score analyses, extramural projects were generally more cost-effective, though intramural projects were more cost-effective for generating clinical citations. They also describe differences in the topics of intramural- and extramural-funded publications, with intramural projects more likely to generate papers on viral infections and immunity or cancer metastases and survival, but less likely to generate papers on pregnancy and maternal health, brain connectivity and tasks, and adolescent experiences and depression. The authors aptly describe the different natures of the intramural and extramural funding models, including that extramural researchers spend much time writing grant applications and that the work described in extramural publications often receives funding from sources other than NIH grants.

      Strengths:

      The authors leveraged publicly available data (including RePORTER and the iCite repository) and used robust validated metrics (RCR, APT, clinical citations). They carefully considered a large number of confounders, including those related to the PI, and performed several well-described regression analyses.

    3. Reviewer #3 (Public review):

      This article presents a comparative study of two funding mechanisms adopted by the National Institutes of Health (NIH). The authors adopted a quantitative approach and introduced five metrics to compare the output of intramural and extramural grants. The findings reveal the impacts of intramural and extramural grants on the scientific community, providing funders with insights to inform future decisions about funding mechanisms.

      Strengths:

      The authors clearly presented their methods for processing the NIH project data and classifying projects into either intramural or extramural categories. The limitations of the study are also well-addressed.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      It is well known that neurons in the medial prefrontal cortex (mPFC) are involved in higher cognitive functions such as executive planning, motivational processing and internal state mediated decision-making. These internal states often correlate with the emotional states of the brain. While several studies point to the role of mPFC in regulating behavior based on such emotional states, the diversity of information processing in its sub-populations remains a less explored territory. In this study, the authors try to address this gap by identifying and characterizing some of these sub-populations in mice using a combination of projection-specific imaging, function-based tagging of neurons, multiple behavioral assays and ex-vivo patch clamp recordings.

      Strengths:

      The authors targeted mPFC projections to the nucleus accumbens (NAc) and basolateral amygdala (BLA). Using the open field task (OFT), the authors identified four relevant behavioral states as well as neurons active while the animal was in the center region ("center-ON neurons"). By characterizing single unit activity and using dimensionality reduction, the authors show differentiated coding of behavioral events at both the projection and functional levels. They further substantiate this effect by showing higher sensitivity of mPFC-BLA center-ON neurons during time spent in the open arms of the elevated plus maze (EPM). The authors then pivoted to the three-chamber social interaction (SI) assay to show that different subsets of neurons encode preference for a social stimulus over a non-social one. This reveals an interesting diversity in the function of these sub-populations on multiple levels. Lastly, the authors used the tube test as a manipulation of the anxiety state of mice and compared behavioral differences before/after in the OFT and social interaction tasks. This experiment revealed that "losers" of the tube test spend less time in the center of the open field while "winners" show a stronger preference for the familiar mouse over the object. Using patch-clamp experiments, the authors also found that "winners" exhibit stronger synaptic transmission in the mPFC-NAc projection while "losers" exhibit stronger synaptic transmission in the mPFC-BLA projection. Given the popularity of the tube test assay in rank determination, this provides useful insights into possible effects on anxiety levels and synaptic plasticity. Overall, the many experiments performed by the authors reveal interesting differences in mPFC neurons relative to their involvement in high or low anxiety behaviors, social preference and social rank.

      Weaknesses:

      The authors have addressed all comments.

    2. Reviewer #2 (Public review):

      Summary:

      The goal of this proposal was to understand how two separate projection neurons from the medial prefrontal cortex, those innervating the basolateral amygdala (BLA) and nucleus accumbens (NAc), contribute to the encoding of emotional behaviors. The authors record the activity of these different neuron classes across three different behavioral environments. They propose that, although both populations are involved in emotional behavior, the two populations have diverging activity patterns in certain contexts. A subset of projections to the NAc appear particularly important for social behavior. They then attempt to link these changes to the emotional state of the animal and changes in synaptic connectivity.

      Strengths:

      The behavioral data build on previous studies of these projection neurons, supporting distinct roles in behavior, and extend previous work by examining the heterogeneity within different projection neurons across contexts. This is important for understanding the "neural code" within the PFC that contributes to such behaviours and how it is relayed to other brain structures.

      Weaknesses:

      The diversity of neurons mediating these projections and their targeting within the BLA and NAc is not explored. These are not homogeneous structures and so one possibility is that some of the diversity within their findings may relate to targeting of different sub-structures within BLA or NAc or the diversity of projection neuron subtypes that mediate these pathways. This is an important future direction for this work but does not detract from the main finding as reported.

    1. Reviewer #1 (Public review):

      Summary:

      The authors attempted to identify if a new deep learning model could be applied to both resting and task state fMRI data to predict cognition and dopaminergic signaling. They found that resting state and movie watching conditions best predict episodic memory, but only movie watching predicts both episodic and working memory. A negative 'brain gap' (where the model trained on brain connectivity predicts worse performance than what is actually observed) was associated with less physical activity, poorer cardiovascular function, and lower D1R availability.

      Strengths:

      The paper should be of broad interest to the journal's readership, with implications for cognitive neuroscience, psychiatry, and psychology fields. The paper is very well-written and clear. The authors use two independent datasets to validate their findings, including two of the largest databases of dopamine receptor availability to link brain functional connectivity/activity with neurochemical signaling.

      Weaknesses:

      The deep learning findings represent a relatively small extension/enhancement of knowledge in a very crowded field.

      It's unclear from these results how much utility the brain gaps provide above and beyond observed performance. It would be helpful to median-split the dataset on observed performance and plot the results alongside the current Fig 3 results to see how the cardiovascular and physical activity measures differ based on actual performance. Could the authors perform additional analyses describing how much additional variance is explained in these measures by including brain gaps?
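
      As a rough illustration of the additional-variance analysis requested above, the following minimal sketch compares nested regression models with and without the brain gap. It runs on synthetic data only; the variable names (`observed`, `gap`, `outcome`) and effect sizes are hypothetical and are not taken from the manuscript.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def incremental_r2(observed, gap, outcome):
    """Return (base R^2, full R^2, gain) when `gap` is added to `observed`."""
    base = r_squared(observed.reshape(-1, 1), outcome)
    full = r_squared(np.column_stack([observed, gap]), outcome)
    return base, full, full - base


# Synthetic stand-ins (assumptions): observed performance, brain gap,
# and a health-related outcome such as a physical activity score.
rng = np.random.default_rng(0)
n = 200
observed = rng.normal(size=n)
gap = rng.normal(size=n)
outcome = 0.5 * observed + 0.3 * gap + rng.normal(size=n)

base, full, delta = incremental_r2(observed, gap, outcome)
print(f"R^2 observed only: {base:.3f}")
print(f"R^2 observed + gap: {full:.3f}")
print(f"additional variance explained by gap: {delta:.3f}")
```

      Because the models are nested, the in-sample gain can never be negative, so a cross-validated or adjusted estimate would be more informative for real data.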

      Some of the imaging findings require deeper analysis. For figure 1f - Which default mode regions have high salience? DMN is a huge network with subregions having differing functions.

      Along the same lines, were the striatal D1R findings regionally specific at all? It would be informative to test whether the three nuclei (Accumbens, Caudate, Putamen) and/or voxelwise models would show something above and beyond what is achieved from averaging D1R across the striatum. What about cortical D1R, which are highly abundant, strongly associated with cognitive (especially WM) performance, and have much unique variance beyond striatal D1R? https://www.science.org/doi/full/10.1126/sciadv.1501672. The PET findings are one of the unique strengths of this paper and are underexplored. It's also unclear if the measure of brain entropy should simply be averaged across all regions.

      It is not clear from the text that the authors met the preconditions for mediation analysis (that is, demonstrating significant correlations between D1R and entropy, in addition to the correlation with brain gap). Could they please report this as well?

      Was age controlled for in the mediation analysis? I would not consider this result valid unless that is the case.

      The discussion is long, but the authors would do better to replace some less helpful sections (e.g., the paragraph on methodological tweaks to parcellations and model alignment) with a couple of other important points, including:

      (1) Discuss the 'sweet-spot' of movie watching for behavior prediction in the context of studies showing that task states 'quench' neural variability: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007983. This may not be mutually exclusive of the discussion on dopamine and signal-to-noise ratio, but it would be helpful for the authors to discuss their potential overlap vs. unique contributions to the observed findings.

      (2) The argument that dopamine signaling increases signal-to-noise ratio is based on some preclinical data as well as correlational data using fMRI with pharmacological challenges. It is less clear how PET-derived estimates of D1R and D2R availability equate to 'dopamine signaling' as it is thought of in this context. Presumably, based on these data, higher D1R or D2R availability would be related to greater levels of tonic dopaminergic signaling. However, in the case of the COBRA dataset with D2R estimates, those are based on raclopride -- which competes with endogenous dopamine for the D2 receptor. Therefore, someone with higher levels of endogenous dopamine signaling should theoretically have lower raclopride binding and lower D2R estimates. I'm not arguing that the authors' logic is flawed or that D1R and D2R are not good measures of dopamine signaling, but I'd ask the authors to dig into the literature and describe more direct potential links for how greater receptor availability might be associated with greater dopamine signaling (and hence lower entropy). Adding this to the discussion would be very valuable for PET research.

      Comments on revised version:

      I thank the authors for their extensive efforts to revise the manuscript. I have no further concerns.

    2. Reviewer #2 (Public review):

      The authors have made several corrections to the original manuscript. For example, they revised the bootstrapping analysis to avoid arbitrarily inflating the degrees of freedom. However, most substantive concerns remain inadequately addressed.

      (1) The primary issue is still the lack of baseline models against which to benchmark the predictive performance of the proposed DenseNet model. This concern was raised independently by two reviewers. Without such benchmarks, it is difficult to interpret the reported results in the context of prior work on MRI-based cognition prediction.

      Notably, the authors state: "While we compared our model with the connectome predictive modeling (CPM) approach and observed better performance with our deep learning framework, we did not conduct a comprehensive benchmark across all available machine learning methods, nor was this the aim of the present study."

      However, I could NOT find any discussion or results related to the CPM model in the manuscript. It is therefore unclear whether the DenseNet model was actually statistically compared with CPM, and, if so, how the comparison was conducted.

      Note that the statement, "While Vieira et al. show that the majority (76%) of prior studies used linear modeling approaches, including CPM and penalized regressions, these models are often vulnerable to overfitting, especially when applied to high-dimensional fMRI data," is not entirely accurate. Linear models typically have far fewer parameters than deep-learning models and are therefore often less prone to overfitting. In fact, it is well established that deep-learning models are particularly susceptible to overfitting and usually require substantially larger sample sizes to achieve stable and reliable performance. Although deep-learning models may outperform shallower models once sufficient data are available and training is well controlled, this does not justify the authors' claim as stated. I therefore disagree with the argument put forward by the authors.

      The authors further justify the absence of benchmarking by stating: "In this context, deep learning was employed as a flexible framework capable of modelling high-dimensional functional connectivity patterns across cognitive states, rather than as a claim of inherent methodological superiority. Thus, our goal was not to propose a universally superior prediction model, but rather to test how brain state influences predictive utility for WM and EM using a deep learning approach." However, most shallow models can likewise be applied across different brain states and cognitive targets. This rationale does not establish deep learning as a uniquely appropriate or necessary choice. If deep learning is indeed a better approach in this context, the authors should demonstrate this empirically through appropriate benchmarking against established baseline models.

      (2) Additional analysis shows that "BCG is not significantly associated with cognition itself". This is the most perplexing result. This is like saying Brain Age Gap is not related to chronological age. It is counterintuitive since the Brain Age Gap is calculated as predicted brain age minus chronological age, and most research has shown a strong relationship between the Brain Age Gap and age.

      If the brain cognition gap is not related to cognition, is it possible that the results found are mainly due to the predictive model not fitting well with another dataset? Regardless, the lack of association between BCG and cognition deserves a discussion.

      (3) I still do not fully understand the rationale of the mediation analysis. The analysis and findings are still not related to aims 1 and 2, since DA and entropy are not part of the prediction models. But I appreciate the explanation that this part is related to the authors' previous work, and that the authors attempted to link to them somehow.
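
      To make the expectation in point (2) concrete, the minimal simulation below shows why a gap defined as predicted minus observed score tends to correlate with the observed score whenever prediction is imperfect. Everything here is an assumption for illustration: the data are synthetic, and `slope` and `noise_sd` are hypothetical parameters describing how strongly predictions track observed scores, not values from the manuscript.

```python
import numpy as np


def simulate_gap_correlation(slope=0.3, noise_sd=0.3, n=5000, seed=0):
    """Correlation between (predicted - observed) and observed scores.

    Predictions are modelled as a shrunken, noisy copy of the observed
    score: predicted = slope * observed + noise. A slope below 1 mimics
    regression toward the mean in an imperfect predictive model.
    """
    rng = np.random.default_rng(seed)
    observed = rng.normal(size=n)
    predicted = slope * observed + rng.normal(scale=noise_sd, size=n)
    gap = predicted - observed
    return float(np.corrcoef(gap, observed)[0, 1])


# Imperfect prediction: the gap is negatively correlated with the observed score.
print("shrunken predictions:", round(simulate_gap_correlation(slope=0.3), 3))

# Unbiased prediction (slope of 1): the gap reduces to noise and the correlation vanishes.
print("unbiased predictions:", round(simulate_gap_correlation(slope=1.0), 3))
```

      Under these assumptions, an absent gap-cognition association would be the less expected outcome, which is why it merits discussion.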

    1. Reviewer #1 (Public review):

      Summary:

      This study used whole-genome data to investigate Beefalo ancestry for the first time, filling a gap in the field. The authors used preserved semen samples to generate genomic data on 47 registered Beefalo and 3 bison hybrids, further questioning the ABA's stated goal of ⅜ bison ancestry. In addition, the authors show that ancestry profiles of Beefalo and bison hybrid genomes are consistent with repeated backcrossing to either parental species, demonstrating the value of genomic information in examining gene flow between species in the genus Bison. Overall, these data thus demonstrate the utility of genomic information in validating specific breeding claims for a more complete understanding of gene flow and genetic variation among bovine species. This is an interesting study, but some major weaknesses remain.

      Strengths:

      Numerous genetic analysis methods such as PCA, ADMIXTURE, F4 ratios, and local ancestry inference techniques revealed that no single Beefalo set meets the ancestry requirements set by the American Beefalo Association (ABA) and some beefalo had detectable indicine cattle ancestry.

      Comments on revised version:

      The authors have made further revisions in the revised manuscript, and these revisions have undoubtedly helped improve the article. No further comments.

    2. Reviewer #2 (Public review):

      Summary:

      Shapiro et al. set out to verify the American Beefalo Association's claim that Beefalo cattle possess 37.5% bison ancestry. They employ a comprehensive range of well-established population genomics methods to estimate ancestry in these hybrid populations, including PCA, ADMIXTURE, D and F statistics, and local ancestry inference. Their findings conclusively demonstrate that most Beefalo lack the claimed bison ancestry, with only 8 out of 47 samples showing any detectable bison ancestry, ranging from 2-18%.

      Strengths:

      The primary strength of this analysis lies in the comprehensive dataset available to the authors, which includes important foundational Beefalo individuals and various reference populations. The rigorous and multi-faceted methodological approach employs several well-established techniques in population genomics for detecting and measuring admixture. Each method used has a firm basis in the field, providing consistent and robust results. The authors' approach of using PCA to initially assess the data within a global context, followed by more specific analyses using ADMIXTURE and D-statistics, provides a clear and logical progression of evidence. The presentation of these results in figures is particularly effective, clearly illustrating the key findings of the study. Additionally, the examination of both autosomal and sex chromosome ancestry offers a more complete understanding of Beefalo genetic composition and the mechanics of bison-cattle hybridisation.

      Weaknesses:

      One limitation of this analysis is the relatively low coverage (~2x) of many Beefalo samples. However, the authors have taken steps to mitigate biases that may arise from this, and their downsampling experiment demonstrates that this level of coverage is appropriate for summarising species-level ancestry across Bos. Another potential weakness is the limited sampling of contemporary Beefalo populations, as the study focuses primarily on historical samples. The authors have justified this choice on the grounds that contemporary Beefalo breeding involves no further bison input, so founder-era individuals are the most informative samples for addressing the study's central question.

      Appraisal:

      The authors have clearly achieved their primary aim using a rigorous and comprehensive methodology. Their extensive dataset and multi-faceted analytical approach provide strong support for their conclusions. The study not only addresses its main research question but also reveals unexpected insights into Beefalo genetics, particularly the presence of zebu ancestry, predominantly from Brahman cattle.

      Discussion:

      This study is valuable for several reasons beyond its primary findings. First, it definitively addresses and refutes the claim of 37.5% bison ancestry in Beefalo, providing crucial information for those studying these interspecies hybrids and the viability of their offspring. Second, it reveals the unexpected presence of zebu ancestry, predominantly from Brahman cattle, in many Beefalo, raising intriguing questions about the breed's development and the potential role of zebu cattle in achieving desired traits. This finding suggests that the distinctive appearance of Beefalo may be due in part to zebu admixture rather than bison ancestry. Third, the study highlights the significant barriers to admixture between bison and cattle, both in controlled breeding programs and potentially in wild populations. This has important implications for conservation genetics and our understanding of gene flow between these species. Lastly, the study demonstrates the power of genomic analysis in verifying breed claims and understanding the complex history of domestic animal breeds. These findings open new avenues for research in bovine genomics, breed development, and the dynamics of interspecies hybridisation.

      Comments on revised version:

      Thanks for the responses, which address my comments in full. I have no further concerns.

    3. Reviewer #3 (Public review):

      Summary:

      The American beefalo cattle breed was developed as a mixture of 5/8 domestic cattle and 3/8 (or 37.5%) bison ancestry. The authors sequenced 50 genomes from bison and hybrids (historical and present-day). They found that most animals did not carry any detectable bison ancestry, with only a few between 2-18%, while other beefalo had taurine/zebu cattle ancestry, which may explain morphological traits. Breeding likely involved backcrossing to a parental species each time rather than mating among admixed animals.

      The authors utilize whole genome sequence data to explore the ancestry of beefalo with respect to expected and possible contributions from cattle lineages. Using molecular and analytical methods central to questions exploring genomic ancestry and identity, the authors very nicely show evidence that calls into question the ability of ancestry to be deduced from breed club documentation without considering reproductive challenges that are known in hybridization between cattle lineages.

      Comments on revised version:

      The authors have addressed all my comments to help improve presentation of specific details, results, and readability. Thank you!

    1. Reviewer #1 (Public review):

      Summary:

      The authors set out to better understand how Drosophila responses to CO2 can be aversive or attractive depending on context (especially presence of food odors, temperature, humidity). While some aspects of this circuit had been previously identified, the authors uncovered additional, critical aspects of the circuit to more fully explain these phenomena. One important discovery was the identification of the LN23 interneuron, which receives input from the V glomerulus. LN23 relays sensory input via an extraglomerular CO2 pathway, and manipulation of LN23 activity revealed a dominant role in CO2-induced avoidance behavior.

      Through a careful series of experiments, the authors demonstrate important aspects of these parallel (and sometimes converging) circuits - differential sensitivity to CO2 concentration changes, synaptic plasticity, circuit connectivity, developmental origins, and the effect of chemo and optogenetic manipulations on behavior. Together, they piece together a complex and interconnected circuit diagram for CO2-dependent behaviors that can be modulated by external factors. This finding will be impactful not only for the fly olfactory/gustatory field but also for many others in the sensory neuroscience community who are very interested in understanding state-dependent modulation of sensory circuits.

      Strengths:

      The experiments were well described and controlled. The addition of the developmental trajectory of the LN23 neurons was interesting. The inclusion of multiple levels of analysis from synaptic contacts and activity-dependent labeling of synapses, circuit analysis guided by connectomes, and detailed behavior analysis for each part of the circuit were all strengths.

      Weaknesses:

      The circuit is very complex and interconnected. This is important for its function, but it makes reading through the manuscript a challenge. The diagrams are helpful, but still somewhat confusing, and some of the experimental findings do not completely support the model outlined in the final figure.

      The main difficulty is visualizing the "default/predominant aversive" LN23 circuit - in the final diagram, there is no "stop" sign on that side, although it's depicted as an inhibition of a "go".

      Also, importantly, the findings shown in Figure 5 demonstrate pretty convincingly that LN23 inhibition reduces CO2 avoidance "almost entirely". Also supporting a central role for LN23 is the opposite effect of silencing LN23, with chronic CO2 inducing attraction. If this is the case, then where is the contribution of the other canonical aversive pathway? How does the silencing of LN23 override the PNvbi/uni pathways to aversion? Incorporating this into the figure more prominently would improve the understanding of this contribution to the circuit.

      A minor weakness is that CO2 levels were not reduced below ambient air. For the first part of the paper addressing the activation of these circuits, there seemed to be a ceiling effect for the LN23 neurons at ambient CO2 levels. It would be interesting to see if there would be some change to the activity labeling experiments if CO2 were reduced or eliminated from the air.

    2. Reviewer #2 (Public review):

      Summary

      The authors investigate how parallel olfactory pathways contribute to CO₂ valence processing in Drosophila. By combining multiple approaches, the study identifies LN23 as a previously unrecognized component of the CO₂ circuit and proposes a model in which distinct downstream pathways contribute to aversive and attractive behavioral responses. More broadly, the work aims to connect circuit organization with context-dependent sensory processing and behavioral valence.

      Strengths

      A major strength of the study is the integration of multiple complementary approaches spanning anatomy, circuit analysis, and behavior. This combination provides a rich and valuable framework for understanding how CO₂ information may be processed across different levels of the olfactory system. The identification of LN23 as an important component of the CO₂ pathway is particularly interesting and will likely be useful for future studies investigating olfactory processing, behavioral state modulation, and valence coding. The connectomic and anatomical analyses also provide a valuable resource for the community.

      Another strength of the manuscript is its conceptual ambition. The work moves beyond a simple labeled-line view of olfactory processing and proposes that flexible behavioral responses may emerge from interactions between parallel downstream pathways and multimodal integration centers. The behavioral manipulations further support an important role for LN23 in CO₂-related behaviors.

      Weaknesses

      Several aspects of the conceptual interpretation would benefit from additional clarification or more cautious framing relative to the current experimental evidence. In particular, the distinction between atmospheric versus experimentally elevated CO₂ conditions, as well as the interpretation of chronic exposure in terms of habituation, remains somewhat unclear throughout the manuscript.

      Some conclusions regarding valence coding and multimodal integration also appear more inferential than directly demonstrated experimentally, especially when moving from anatomical connectivity to functional interpretation.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Javorski and colleagues investigate how CO2 valence is processed in the Drosophila olfactory system. Although CO2 is classically associated with an aversive labeled‑line pathway, its behavioral significance can be modulated by environmental context, such as the presence of food‑related cues. The circuit‑level mechanisms underlying this flexibility remain incompletely understood. The authors address this gap by examining how CO2 sensory information diverges at early stages of olfactory processing and how distinct neural pathways contribute to opposing behavioral outcomes. By identifying the local interneuron LN23 as a relay for CO2‑induced aversion, the study suggests that CO2 valence processing may begin to diverge at the level of the antennal lobe, prior to synaptic integration in higher‑order brain regions such as the lateral horn.

      Strengths:

      A major strength of this study is its comprehensive, multi-level experimental design that effectively links neuronal identity, synaptic organization, and behavior. The authors combine calcium‑based anatomical mapping, activity‑dependent reporters, optogenetic and thermogenetic manipulations, and connectomic analyses with behavioral readouts under genetically defined neuronal activation or silencing conditions. Specifically, the identification of LN23 as a component of the CO2 avoidance pathway is supported by anatomical, genetic, and behavioral evidence. Both silencing and activation experiments indicate that LN23 plays an important role in mediating CO2‑induced aversive responses. In contrast, manipulation of the projection neurons (PNv bi and PNv uni) produces more modest behavioral effects, suggesting a degree of specificity for LN23‑associated circuitry within the avoidance pathway. Moreover, the use of a previously reported connectome to identify downstream third‑order neurons strengthens the proposed circuit model and provides anatomical support for early divergence of CO2 valence processing.

      Weaknesses:

      While the study provides a strong mechanistic framework for CO2 aversion, some aspects of context‑dependent valence modulation are less directly addressed and may benefit from further experimental exploration.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript offers a careful and technically impressive dissection of how subpopulations within the subthalamic nucleus (STN) support reward-biased perceptual decision-making. The authors recorded STN neurons in monkeys performing an asymmetric-reward visual motion discrimination task, then combined single-unit analyses, regression modeling, and drift-diffusion model (DDM) fitting to identify functionally distinct neuronal clusters. Each subpopulation shows unique relationships to computational decision variables - evidence accumulation rate, decision bound, and non-decision time - as well as to post-decision evaluative signals including choice accuracy and reward expectation. The revised manuscript substantially strengthens the original submission by improving both the objectivity of neuron selection and the robustness of the clustering solution.

      Strengths:

      The asymmetric-reward paradigm cleanly separates perceptual and motivational contributions to STN activity, allowing the authors to characterize how neurons blend these distinct sources of information. The dataset is extensive and well-controlled, and the behavioral and neural analyses are tightly integrated. Relating cluster-specific activity to DDM parameters provides an interpretable computational link between population signals and behavior. The clustering solution is now validated across two algorithms, two monkeys, and subsets of trials - establishing that the three-cluster structure is robust. The new Figure 9 offers a conceptually useful, if necessarily speculative, synthesis connecting the identified subpopulations to distinct basal-ganglia pathways (hyperdirect versus indirect). The new Figure 8 documenting the anatomical intermingling of subpopulations is also important, as it directly informs the interpretation of prior and future STN stimulation studies.

      Weaknesses:

      The inferred relationships between neural clusters and DDM parameters remain correlational - the authors now appropriately flag this throughout, and the causal inference gap is acknowledged in the Discussion with concrete proposals for future targeted perturbation strategies. While a generative multi-cluster model would further strengthen mechanistic interpretation, the conceptual framework in Figure 9 provides a reasonable intermediate step given the scope of the study and the absence of simultaneous population recordings, which preclude direct inter-cluster covariation analyses. These remaining limitations are inherent to the experimental design rather than analytical oversights.

    2. Reviewer #2 (Public review):

      This study uses monkey single-unit recordings to examine the role of the STN in combining noisy sensory information with reward bias during decision-making between saccade directions. Using multiple linear regressions and clustering approaches, the authors overall show that a highly heterogeneous activity in the STN reflects almost all aspects of the task, including choice direction, stimulus coherence, reward context and expectation, choice evaluation, and their interactions. The authors report in particular how three classes of neurons map to different decision processes evaluated via the fitting of a drift-diffusion model. Overall, the study provides evidence for functionally diverse and anatomically intermingled populations of STN neurons, supporting multiple roles in perceptual and reward-based decision-making.

      This study follows up on work conducted in previous years by the same team and complements it. Extracellular recordings in monkeys trained to perform a complex decision-making task remain a remarkable achievement, particularly in brain structures that are difficult to target, such as the sub-thalamic nucleus. The authors conducted numerous analyses of STN activities, using sophisticated statistical approaches and functional computational modeling.

      One criticism that I would still make of the revised version of the paper concerns the description of the behavior of the two monkeys, which remains minimal. The authors acknowledge differences in the monkeys' choice and RT performance that reflect "individual differences in sensitivity to motion stimulus and a common heuristic-based satisficing strategy", but this sentence is not clear to me. Moreover, the potential consequences of these differences for neuronal activity are only considered in the cluster analysis done for each of the two animals separately, which turns out to show no notable difference.

      Compared to the first version of the paper, the cluster analysis in this revised version yields three distinct populations instead of the previous four. While the authors suggest that these subpopulations play important roles in encoding different aspects of decision-making, the identification of three rather than four subpopulations seems to me an important update that warrants discussion.

      Finally, I think it would have been interesting to identify the level of collinearity in the model proposed by the authors (equation 7). Indeed, one can expect significant collinearity between some of the proposed explanatory factors of neuronal activity, such as choice and coherence level, for example. Similarly, for the analysis relating neuron activity to decision evaluation signals (p 16), firing rates calculated using sliding averages with 1-ms steps are compared, but the method does not specify controls for multiple comparisons or for non-independent data.
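
      One common way to quantify the collinearity raised above is the variance inflation factor (VIF), obtained by regressing each explanatory variable on the others. The sketch below is illustrative only: the regressors (`choice`, `coherence`, and their interaction) and the coherence levels are hypothetical stand-ins on synthetic trials, not the actual terms of equation 7.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X: 1 / (1 - R^2) from regressing it on the rest.

    A perfectly collinear column gives R^2 = 1 and therefore an infinite VIF.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic trials (assumption): signed choice, motion coherence, and their product.
rng = np.random.default_rng(0)
n_trials = 2000
choice = rng.choice([-1.0, 1.0], size=n_trials)
coherence = rng.choice([0.032, 0.064, 0.128, 0.256, 0.512], size=n_trials)
design = np.column_stack([choice, coherence, choice * coherence])

for name, vif in zip(["choice", "coherence", "choice x coherence"],
                     variance_inflation_factors(design)):
    print(f"{name}: VIF = {vif:.2f}")
```

      In this toy design the interaction term shares variance with choice because coherence is always positive, so both show inflated VIFs while coherence alone does not; centering coherence before forming the product would reduce that overlap.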

    1. Reviewer #1 (Public review):

      The manuscript examines whether insects can use bat odor as a cue of predation risk. The authors focus on the insectivorous bat Scotophilus kuhlii and the cricket Loxoblemmus equestris. They first use fecal DNA metabarcoding to show that crickets are part of the bat's diet, and field surveys to show that L. equestris is abundant at local foraging sites. In laboratory Y-tube assays, the authors show that crickets strongly avoid air carrying bat body odor. Gas chromatography coupled with electroantennographic detection showed that cricket antennae respond to components of bat odor. Chemical analyses identified several volatile compounds, with 2,2-dimethylheptane and (−)-limonene associated with antennal responses. Further analyses suggested that snout secretions are likely to contribute to the bat's body odor. The authors then tested individual compounds. Among the commercially available candidates, (−)-limonene elicited a strong antennal response and was sufficient to cause avoidance in the olfactometer. In field plots, spraying (−)-limonene reduced cricket calling activity relative to pre-exposure levels, whereas calling increased in control plots treated with hexane. Overall, the study argues that crickets can detect a vertebrate predator through olfactory cues and that a single bat-associated volatile can trigger antipredator behavior.

      This is an interesting and enjoyable study that addresses an understudied aspect of predator-prey interactions. The manuscript is clearly written, the experiments are presented in a logical sequence, and the figures are crisp and easy to follow. I really appreciated the combination of behavioral assays, electrophysiology, chemical analysis, and field observations.

      My main issue concerns the identity and biological origin of the proposed bat odor cue, (−)-limonene. Limonene seems like an unusual compound to be emitted endogenously by a mammal, particularly by an insectivorous bat. It would be helpful if the authors could clarify whether mammals are known to synthesize this compound de novo, and, if not, what the likely source of this plant-associated terpene would be in S. kuhlii. Possible sources could include environmental exposure, diet, roosting material, handling, or temporary housing conditions.

      I do not doubt that crickets avoid synthetic (−)-limonene. Indeed, this result is quite plausible given that limonene is widely used in insect repellent or repellent-associated fragrance products. However, this also makes contamination an important issue to address explicitly. How did the authors exclude the possibility that limonene entered the samples from human-associated sources, such as insect repellents, soaps, cleaning products, field equipment, cloth bags, cages, gloves, or other materials used while handling wild-caught bats? It would strengthen the manuscript to report limonene levels for individual bat odor collections, all relevant blanks, and any handling or housing controls.

      More broadly, given the common occurrence of limonene in plants and human-associated products, I am not yet convinced that it would function as a reliable "keystone kairomone" as suggested around line 253. How would crickets distinguish bat-associated limonene from limonene emitted by a mint leaf, citrus peel, pine material, or other non-threatening environmental sources? The authors may wish to soften this interpretation or provide additional evidence that crickets respond to limonene in a bat-specific context, perhaps through concentration, temporal patterning, co-occurring volatiles, or enantiomeric composition.

    2. Reviewer #2 (Public review):

      Summary:

      Many insects possess extremely sensitive olfactory systems that can detect chemical signals from distances of several kilometers. For decades, the arms race between bats and insects has served as a prime example of acoustic co-evolution. The auditory adaptations of insects to echolocation have been well documented. Crickets have a multisensory predator recognition system with keen olfactory, tactile, and auditory senses. However, whether crickets can use the scent of bats to avoid them remains unknown. The authors hypothesized that cricket prey (Loxoblemmus equestris) might eavesdrop on VOCs of a predatory bat (Scotophilus kuhlii) as an early warning. L. equestris is one of the prey species of S. kuhlii, and the authors demonstrated that the body odor of the insectivorous bat S. kuhlii triggers robust avoidance and electrophysiological responses in the cricket L. equestris, and that a single compound, (-)-limonene, is sufficient to elicit this avoidance in the laboratory and suppress calling in the field. Overall, this paper presents a complete chain of evidence and deserves high praise.

      Comments:

      (1) Olfactory eavesdropping can transcend the evolutionary divide between vertebrate predators and invertebrate prey, enabling invertebrates to trigger defensive avoidance behaviors in response to predator-derived volatile odors. This phenomenon is empirically well-documented and requires no excessive emphasis.

      (2) Without quantitative analysis of the relative content of the key compound, limonene, it is unclear how the authors determined the concentration of the limonene standard used for EAD and the concentration used in the field experiments. How was the concentration for field spraying chosen, and does it reflect the levels encountered in the wild?

      (3) Figures 1C and D should compare the GC-EAD responses of L. equestris to bat body odor and to the odor of bat nasal secretions, rather than to the air control group. Figure 1D has the same problem.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      Noell et al have presented a careful study of the dissociation kinetics of Kinesin (1,2,3) classes of motors moving in vitro on a microtubule. These motors move against the opposing force from a ~1 micron DNA strand (DNA tensiometer) that is tethered to the microtubule and also bound to the motor via specific linkages (Fig 1A). The authors compare the time for which motors remain attached to the microtubule when they are tethered to the DNA, versus when they are not. If the former is longer, the interpretation is that the force on the motor from the stretched DNA (presumed to be working solely along the length of the microtubule) causes the motor's detachment rate from the microtubule to be reduced. Thus, the specific motor exhibits "catch-bond"-like behaviour.

      Strengths:

      The motivation is good - to understand how kinesin competes against dynein through the possible activation of a catch bond. Experiments are well done and there is an effort to model the results theoretically.

      Weaknesses from original round of review:

      The motivation of these studies is to understand how kinesin (1/2/3) motors would behave when they are pitted in a tug of war against dynein motors as they transport cargo in a bidirectional manner on microtubules. Earlier work on dynein and kinesin motors using optical tweezers has suggested that dynein shows catch-bond behaviour, whereas such signatures were not seen for kinesin. Based on their data with the DNA tensiometer, the authors would like to claim that (i) kinesin-1 and kinesin-2 also show catch-bonding and (ii) the earlier results using optical traps suffer from vertical forces, which complicates the catch-bond interpretation.

    2. Reviewer #2 (Public review):

      Summary:

      To investigate the detachment and reattachment kinetics of kinesin-1, 2 and 3 motors against loads oriented parallel to the microtubule, the authors used a DNA tensiometer approach comprising a DNA entropic spring attached to the microtubule on one end and a motor on the other. They found that for kinesin-1 and kinesin-2 the dissociation rates at stall were smaller than the detachment rates during unloaded runs. With regard to the complex reattachment kinetics found in the experiments, the authors argue that these findings were consistent with a weakly-bound 'slip' state preceding motor dissociation from the microtubule. The behavior of kinesin-3 was different and (by the definition of the authors) only showed prolonged "detachment" rates when disregarding some of the slip events. The authors performed stochastic simulations which recapitulate the load-dependent detachment and reattachment kinetics for all three motors. They argue that the presented results provide insight into how kinesin-1, -2 and -3 families transport cargo in complex cellular geometries and compete against dynein during bidirectional transport.

      Strengths:

      The present study is timely, as significant concerns have been raised previously about studying motor kinetics in optical (single-bead) traps where significant vertical forces are present. Moreover, the obtained data are of high quality and the experimental procedures are clearly described.

    3. Reviewer #3 (Public review):

      Summary:

      Several recent findings indicate that forces perpendicular to the microtubule accelerate kinesin unbinding, where perpendicular and axial forces were analyzed using the geometry in a single-bead optical trapping assay (Khataee and Howard, 2019), comparison between single-bead and dumbbell assay measurements (Pyrpassopoulos et al., 2020), and comparison of single-bead optical trap measurements with and without a DNA tether (Hensley and Yildiz, 2025).

      Here, the authors devise an assay to exert forces along the microtubule axis by tethering kinesin to the microtubule via a dsDNA tether. They compared the behavior of kinesin-1, -2, and -3 when pulling against the DNA tether. In line with previous optical trapping measurements, kinesin unbinding is less sensitive to forces when the forces are aligned with the microtubule axis. Surprisingly, the authors find that both kinesin-1 and -2 detach from the microtubule more slowly when stalled against the DNA tether than in unloaded conditions, indicating that these motors act as catch bonds in response to axial loads. Axial loads accelerate kinesin-3 detachment. However, kinesin-3 reattaches quickly to maintain forces. For all three kinesins, the authors observe weakly-attached states where the motor briefly slips along the microtubule before continuing a processive run.

      Strengths:

      These observations suggest that the conventional view that kinesins act as slip bonds under load, as concluded from single-bead optical trapping measurements where perpendicular loads are present due to the force being exerted on the centroid of a large (relative to the kinesin) bead, needs to be reconsidered. Understanding the effect of force on the association kinetics of kinesin has important implications for intracellular transport, where the force-dependent detachment governs how kinesins interact with other kinesins and opposing dynein motors (Muller et al., 2008; Kunwar et al., 2011; Ohashi et al., 2018; Gicking et al., 2022) on vesicular cargoes.

  2. Jun 2026
    1. Reviewer #1 (Public review):

      This manuscript by Zhang et al addresses how Pi scarcity/depletion drives PMB resistance in Enterobacteriaceae and proposes a pathway that is mechanistically distinct from the better-known PhoBR-linked phospholipid-remodeling responses in other Gram-negatives. The authors also suggest an intervention strategy based on Mg repletion or Fe chelation. The results are substantial and include genetic analyses, mass spectrometry, reporter assays, phospho-signaling readouts, metal quantification, and comparative analyses across enterobacterial species.

      The paper reads well, with its emphasis on Mg loss followed by Fe mobilization during Pi depletion, which induces PmrAB TCS activation for lipid A modification through transcriptional activation of the ugd and arn genes. However, PmrAB is a well-known TCS responsible for PMB resistance through lipid A modification, as established in extensive studies by the Groisman lab. PmrA is a well-known transcriptional regulator that activates transcription of the ugd gene in Salmonella and Yersinia in response to Mg depletion and Fe mobilization. Therefore, the current paper should focus more on the upstream signaling to connect the dots between Pi depletion and Mg loss. This is important because ugd gene expression is not affected by PmrAB in Pi depletion. It should also be considered that Mg loss is temporally associated with Fe mobilization, but the manuscript does not quantitatively show that Mg dissociation/redistribution is sufficient to trigger Fe mobilization in the absence of Pi depletion, considering that Mg is a macronutrient, whereas Fe is a micronutrient.

      Second, the relationship between arn and ugd regulation needs a clearer mechanistic resolution to explain how the synthesis of L-Ara4N is orchestrated during Pi depletion, because the manuscript shows that arn activation is PmrAB-dependent, whereas ugd is only partially PhoBR-dependent and not dependent on PmrAB. Yet the current model and narrative treat the system as a unified "ugd-arn" output. This should be carefully addressed, given that Pi depletion and Mg depletion might trigger different signaling modules.

      Third, the manuscript argues that this is a "conserved" circuit in Enterobacteriaceae. The evidence for conservation is presently strongest in E. coli MG1655 and includes supportive observations in E. coli O157, one UTI strain by lipid A MS, several UTI isolates by killing assay, and S. Typhimurium for key phenotypes. No direct mechanistic validation is shown in other clinically important genera of Enterobacteriaceae, such as Klebsiella, Enterobacter, Citrobacter, Yersinia, or Serratia.

      Fourth, the reversal and translational claims are somewhat stronger than the current evidence supports. The title and Abstract state that identifying and targeting the circuit reverses Pi depletion-driven PMB resistance, and the manuscript suggests a pharmacological intervention framework based on Mg supplementation or Fe chelation. The actual intervention evidence is limited to in vitro killing assays under acute Pi-depleted minimal-medium conditions in E. coli and S. Typhimurium, without in vivo testing: the experiments are performed under an acute 3-hour starvation in MOPS medium, not in host-mimicking or infection-relevant environments. The reversal needs to be shown not only at the level of survival curves but also by quantitative MIC/MBC measurements.

      More importantly, the authors demonstrated that the signaling module activated upon Pi limitation in Enterobacteria differs from that in other Gram-negative bacteria such as Pseudomonads. However, they did not discuss why this difference would matter for the lifestyle of Enterobacteria. The authors should consider the glycolytic pathways (i.e., the EMP pathway for enterobacteria vs the ED pathway for pseudomonads), given that the ED pathway requires less Pi, whereas the EMP pathway requires more Pi. It should be noted that Pi supply is highly limited in the natural environment of free-living bacteria, rather than in the host environment of commensals.

    2. Reviewer #2 (Public review):

      Summary:

      Using E. coli K-12 as a model system, the authors investigated how phosphate (Pi) depletion induces polymyxin resistance in Enterobacteriaceae, which notably lack the canonical phospholipid remodeling pathways commonly associated with phosphate starvation responses. They demonstrated that low-phosphate conditions promote L-Ara4N modification of lipid A, thereby enhancing polymyxin resistance. Proteomic analyses revealed significant upregulation of the arn operon and ugd under phosphate-limited conditions, and promoter activity assays further confirmed that both promoters are strongly induced during Pi depletion. Through gene deletion experiments, the authors showed that arn expression is regulated by the PmrAB two-component system, whereas ugd is controlled by PhoBR under low-phosphate conditions. Using ICP-MS analysis, they further found that phosphate limitation increases cell-associated Fe levels, and that reducing Fe availability abolishes PmrAB-dependent activation of the arn operon. Finally, the study demonstrated that Mg supplementation and Fe chelation can suppress polymyxin resistance, highlighting the critical role of metal homeostasis in phosphate depletion-induced antimicrobial resistance.

      Strengths:

      Overall, I found this study to be well conducted, with convincing results that strongly support the proposed model. Through comprehensive genetic analyses and detailed characterization of metal ion homeostasis and membrane lipid modifications, the authors uncovered a novel regulatory connection among Mg²⁺, Fe³⁺, and the PmrAB pathway, a key driver of polymyxin resistance. These findings are highly interesting and have important implications for understanding the evolution of the Fe-sensing PmrAB system, as well as the broader role of nutrient availability in shaping antibiotic resistance.

      Weaknesses:

      I did not identify any particular weaknesses.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript examines how phosphate limitation primes E. coli and Salmonella for defense against polymyxin antibiotics. Other environmental signals, such as altered levels of extracellular Mg or Fe, were previously shown to induce polymyxin resistance in Enterobacteriaceae, and phosphate limitation was known to augment polymyxin resistance in other organisms such as A. baumannii and P. aeruginosa; however, whether phosphate limitation boosted polymyxin resistance in Enterobacteriaceae was not known. This study shows that this indeed occurs, and the mechanism is distinct from that in A. baumannii and P. aeruginosa. The model proposed is: (1) low phosphate causes bacteria to jettison Mg to balance cellular P/Mg ratio, (2) extracellular Fe3+ associates with the cell envelope to replace Mg as LPS-bridging cation, and (3) envelope Fe3+ activates PmrAB, which mediates a transcriptional response leading to L-Ara4N modification of lipid A and protection from polymyxin B. Flooding with Mg or chelating the surface Fe3+ blocks the protective response to low phosphate in E. coli and Salmonella but not in P. aeruginosa despite Fe still mobilizing in the latter. The differential response between Enterobacteriaceae and P. aeruginosa is connected to the presence/absence of Fe-sensing motifs in the PmrB periplasmic domain.

      Strengths:

      The strengths of the study are the wide array of approaches used and the thorough characterization of a novel stress-response mechanism involving metal mobilization. Combined with the analysis of multiple bacterial families, the results clarify how different strategies have evolved to defend against polymyxins during phosphate starvation.

      Weaknesses:

      Controls are needed in some of the genetic experiments, namely complementation, to verify linkage of defective survival phenotypes to the genes mutated and to rule out protein stability defects for the PmrB variants tested. In addition, the generalizability of the metal mobilization feature of the model would be strengthened by examining media with differing metal composition. Claims about antibiotic resistance would be strengthened by data examining bacterial growth in the presence of an antibiotic.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      Knowing that small pupil-size variations accompany brightness variations (even when these are illusory), the authors asked whether pupil constrictions would accompany the synesthetic perception of a brighter color (compared with a darker one), induced by the presentation of a black-white character. This grapheme-colour synesthesia is experienced by only a few individuals, sixteen of whom were enrolled in this study. The results reliably showed that a relative pupil constriction "betrays" the perception of a brighter color in these participants, while no such effect was observed in control participants who were asked to report a color in association with each grapheme, even though they did not perceive any.

      Strengths:

      The main strength of the study lies in its combination of psychophysics (brightness ratings) and pupillometry, which allowed for showing clear-cut results.

      Impact:

      This work is likely to improve our understanding of synesthesia, providing a new tool to quantify the subjective sensations; an interesting potential extension would be using pupillometry for tracking changes over time of the synesthetic experiences, opening up the possibility to evaluate the importance of learning for this peculiar experience.

    2. Reviewer #2 (Public review):

      Synesthesia is a neurological condition where stimulation of one sensory channel leads to involuntary, automatic, and consistent experience of another, unrelated percept. For example, Sir Francis Galton (1880, Nature) famously described the robust tendency of some individuals (synesthetes) to associate numerals with a distinct color. Ever since, synesthesia has attracted broad interest in the cognitive neurosciences in light of its implications for the study of domains such as perception, consciousness, and brain connectivity, among others.

      Strauch, Leenaars, and Rouw measured pupil size in a group of 16 grapheme-color synesthetes and two matched control groups. The participants were presented with gray digits - that is, visual stimuli having identical physical properties in terms of brightness. Each participant subsequently rated the corresponding evoked color and brightness: unlike controls, synesthetes did so in a very consistent and reliable fashion. Accordingly, this was also shown in their pupils: despite the same objective luminance, digits associated with brighter percepts caused their pupils to constrict and digits associated with darker percepts caused their pupils to dilate more than controls. These results highlight how crossmodal correspondences are deeply rooted in synesthetes, and put forward pupillometry as a particularly appealing biomarker for some phenomenological experiences (at least those grounded in "brightness").

      Further strengths of the technique are its temporal resolution and its responsiveness to several constructs. Across several tasks, the authors show for example that responses to synesthetic light are somewhat slower than responses to real light (i.e., they are likely mediated), but at the same time faster than responses to mental imagery. The role of mental imagery can also be reasonably dismissed when considering the second feature of pupil size: its responsiveness to mental effort and cognitive load. The pupils tend to dilate with demanding, challenging tasks, and this was the case when control participants were asked to report the color of a digit for which they did not consistently experience a synesthetic association. The same task was, instead, seemingly effortless for synesthetes, again speaking in favor of the automaticity of number-color correspondences in their case.

      Overall, the findings by Strauch, Leenaars, and Rouw are highly significant for the field and likely to be impactful. The strength of their evidence, when accounting for the relatively small sample size and the inherent variability of both phenomenology (color perception and subjective reporting) and physiology (pupil size), is adequate and sufficiently convincing.

    1. Reviewer #1 (Public review):

      Summary:

      This retrospective study provides new data regarding the prevalence of pain in women with PCOS and its relationship with health outcomes. Using data from electronic health records (EHR), the authors found a significantly higher prevalence of pain among women with PCOS compared to those without the condition: 19.21% of women with PCOS versus 15.8% in non-PCOS women. The highest prevalence of pain was observed among Black or African American (32.11%) and White (30.75%) populations. In addition, women with PCOS and pain have at least a 2-fold increased prevalence of obesity (34.68%) at baseline compared to women with PCOS in general (16.11%). Also, women with PCOS had the highest risk for infertility and T2D, but women with PCOS and pain had higher risks for ovarian cysts and liver disease. Based on these results, the authors suggest a critical need to address pain in the diagnosis and management of PCOS due to its significant impact on patient health outcomes.

      Strengths:

      The problem of pain assessment in PCOS patients is well described, and the authors provide a clear rationale for selecting the retrospective design to investigate this problem.

      The large number of patient records analyzed (76,859,666 women) and their uniformity increase the power of the study. Using propensity score matching makes it possible to reduce the heterogeneity of the compared cohorts and the influence of comorbid conditions.

      Analysis in different ethnic cohorts provides current and necessary data regarding the prevalence of pain and its relationship with different health conditions, which will help clinicians diagnose and manage PCOS in women of different ethnicities.

      Assessing the risk of different health conditions, including both PCOS-associated pathology and other common groups of diseases, in women with PCOS with or without pain makes it possible to differentiate the risk of comorbid conditions depending on the presence of one symptom (pelvic or abdominal pain, dysmenorrhea).

      Weaknesses:

      A significant weakness of the study is the absence of a Latin American cohort. The White cohort probably includes Latin Americans or others, but the results of the study cannot be extrapolated to particular White ethnicities.

      Comments on revised version:

      At present, I have no questions or recommendations for the authors, as they have exhaustively addressed the previous comments and incorporated the necessary corrections.

    2. Reviewer #2 (Public review):

      Summary:

      The study offers a thorough analysis of the prevalence of pain in women with polycystic ovary syndrome (PCOS) and its associations with health outcomes across various racial groups. Furthermore, the research investigates the prevalence of PCOS and pain among different racial demographics, as well as the increased risk of developing various conditions in comparison to individuals who have PCOS alone.

      Strengths:

      The study emphasizes pain as a significant comorbidity of PCOS, an area that is critically underexplored in existing literature. The findings regarding the increased prevalence of some of the diseases in the PCOS + pain group provide valuable direction for future research and clinical care. I believe physicians should incorporate pain score assessments into their clinical practice to improve patients' quality of life and raise awareness about pain management. If future research focuses on the mechanisms of pain, it would provide a better understanding of pain and allow for a focus on the underlying causes rather than just symptomatic management. The study also highlights the association between PCOS+pain and various comorbidities, such as obesity, hypertension, and type 2 diabetes, as well as conditions like infertility and ovarian cysts, offering a holistic view of the burden of PCOS.

      Weaknesses:

      Due to the nature of the retrospective design, some data may not be readily available in the EHR system. Diagnoses of PCOS and pain are based on ICD codes, which may lead to misclassification and may not capture symptom severity or patient-reported experiences.

    1. Reviewer #1 (Public review):

      Summary:

      The authors aim to demonstrate that PGLYRP1 plays a dual role in host responses to B. pertussis infection. PGLYRP1 signaling is known to activate bactericidal responses due to recognition of peptidoglycan. Through NOD1 activation and TREM-1 engagement, it appears PGLYRP1 also has immunomodulatory activities. The authors present mouse knockout studies and gene expression data to illustrate the role of PGLYRP1 in relation to B. pertussis peptidoglycan. Mice lacking PGLYRP1 had slightly lower pathology scores. When TCT peptidoglycan was removed from the bacteria, expression of IL23A, IL6, IL1B and other genes encoding pro-inflammatory cytokines surprisingly increased. The relationship between TCT and PGLYRP1 suggests the pathogen uses this strategy to decrease immune activation. The authors went on to show the relationship between PGLYRP1 and TREM-1 as mediated with PGN using various versions of peptidoglycan. The study presents multiple angles of data to back up its findings and demonstrates an interesting strategy used by B. pertussis to down-regulate innate responses to its presence during infection.

      Strengths:

      Use of knockout mice of the key factor being considered paired with isogenic B. pertussis strains to reveal the mechanism of immune modulation to benefit the bacteria. The authors used in vivo gene expression paired with in vivo assays to establish each aspect of the mechanism.

      Weaknesses:

      The main focus was on innate responses, but some analysis of antigen-specific antibody responses could improve the impact of the findings.

      Comments on revised version.

      I have no further input to add.

    2. Reviewer #2 (Public review):

      Since its original discovery, the mechanistic basis for TCT-mediated pathogenesis of Bordetella pertussis has been a moving target and difficult to uncouple from confounding variables. The current study provides some exciting data that suggest PGLYRP-1 modulates host responses upon 'activation' by TCT. While there are some strengths associated with the unbiased approaches and collective data to support the claims associated with TCT and PGLYRP-1's function in this system, caution should be used when interpreting and extrapolating some of the information provided. While many of the initial concerns were addressed, one concern remains: using whole, intact PG sacculi from other species for comparative studies with a fragment of released PG (i.e., TCT).

      Comments on revised version.

      I have no further comments.

    3. Reviewer #3 (Public review):

      Summary:

      This study evaluates the contributions of the mammalian PG-binding protein PGLYRP1 to Bordetella infection. The authors find potential roles for PGLYRP1 in both bacterial killing (canonical) and regulation of inflammation (non-canonical). While these findings are interesting, as is the idea that PG fragment release has differential impacts on infection depending on fragment structure, the study is ultimately limited by the lack of connection between the in vivo and in vitro experiments, and determining the precise mechanism by which PGLYRP1 regulates host responses and bacterial fitness during infection requires further study.

      Strengths:

      (1) The combination of scRNAseq with in vitro and in vivo assays provides complementary views of PGLYRP1 function during infection.

      (2) The use of TCT-deficient B. pertussis provides a useful control and perturbation in the in vitro assays.

      Weaknesses/Areas for future study:

      (1) The study does not ultimately resolve the initial early versus late phenotype divergence. While the in vitro assays suggest explanations for their in vivo observations, further mechanistic links are lacking and necessary for the authors' conclusions throughout. To state one example, what is the early and late infection phenotype of TCT- Bp in mice lacking PGLYRP1? RNAseq data is reported from these mice but there are no burden or pathology studies. Furthermore, what are the neutrophil phenotypes (NOD-1/TREM-1 activation) in vivo? And are they dependent on PGLYRP1 and/or TCT? This will be an important topic of future study, as noted by the authors in their response.

      (2) It is unclear whether or how the NOD1 and TREM-1 pathways interact.

      (3) Many of the study's conclusions rely on the use of HEK293 reporter lines in the absence of bacterial infection, which may not be physiologically representative.

      Comments on revised version.

      The authors have responded adequately to my comments.

    1. Reviewer #1 (Public review):

      Summary:

      In their manuscript, Zhou and colleagues present a detailed look at how the JSP functions differently in the various cells of a breast tumor. The authors have effectively shown that the JSP acts as a double-edged sword, as it helps T cells fight cancer but also allows tumor cells to grow and avoid ferroptosis. These findings are important because they identify a useful biomarker to predict how TNBC patients might respond to PD-1 inhibitors.

      Strengths:

      This work is important because it provides a clear explanation for the conflicting roles of the JSP in the tumor environment. The evidence is solid, as it combines data from thousands of patients with single-cell analysis and lab experiments to confirm the role of STAT4 in cancer progression and immunity.

      Comments on revised version:

      The authors made a significant effort to improve the manuscript. My comments were sufficiently addressed.

    2. Reviewer #2 (Public review):

      Summary:

      The JAK-STAT pathway (JSP) exhibits cell-type-specific functional heterogeneity in breast cancer. This study investigates the JSP in breast cancer and its response to anti-PD‑1 immunotherapy. JSP displays distinct cell‑type heterogeneity: it promotes malignant phenotypes and immunosuppression in tumor cells, while enhancing cytotoxicity and reducing exhaustion in T cells. Elevated JSP expression correlates with improved immunotherapy responses, especially in triple‑negative breast cancer. These findings highlight the paradoxical roles of JSP, indicating that broad inhibition may compromise anti‑tumor immunity.

      Strengths:

      The major strengths of this study include the comprehensive characterization of JSP heterogeneity across epithelial, tumor, and T cells in breast cancer. The identification of JSP and STAT4 as predictive biomarkers for immunotherapy response, particularly in triple‑negative breast cancer, provides clinically relevant insights for patient stratification.

      Weaknesses:

      The corresponding content has been revised.

    3. Reviewer #3 (Public review):

      Summary:

      This multi-omics study by Zhou et al elucidates the context-dependent roles of the Janus kinase-signal transducer and activator of transcription (JAK-STAT) pathway (JSP) across different cellular compartments in the breast cancer tumor microenvironment. While bulk JSP activity is associated with a favorable prognosis, single-cell analysis reveals a paradoxical landscape: high JSP in T cells drives anti-tumor cytotoxicity and reduces exhaustion, whereas high activity in tumor epithelial cells promotes malignancy and immunosuppression via the MIF-CD74 signaling axis. The JSP score (immune-related) serves as a robust predictive biomarker for response to anti-PD-1 immunotherapy, particularly in triple-negative breast cancer (TNBC). Furthermore, the study identifies the STAT4/SLC47A1 axis as a critical mechanism through which tumor cells resist ferroptosis, facilitating disease progression. These findings suggest that broad JAK-STAT inhibition may be counterproductive in cancer therapeutics; instead, therapeutic success depends on precise modulation and carefully timed interventions to preserve its T-cell-associated functions. This study may inspire future studies to explore specific factors that selectively modulate JAK-STAT activity in immune cells to achieve favorable therapeutic outcomes.

      Strengths:

      Significant therapeutic implications

      Weaknesses:

      Limited molecular mechanisms

      Comments on revised version:

      The authors have addressed my comments.

    1. Reviewer #2 (Public review):

      Summary:

      This study aims to establish a rational framework for designing bacterial probiotics against respiratory infections. The central hypothesis is that in vitro antagonism, particularly through metabolic niche overlap with a pathogen, predicts in vivo efficacy.

      Strengths:

      (1) Systematic pipeline: The study integrates bacterial isolation, in vitro characterization, model development, and in vivo validation into a cohesive workflow.

      (2) Quantitative model: The introduction of the Niche Index (NI) and Niche Index Fraction (NIF) provides a novel, quantitative tool for predicting probiotic efficacy based on ecological principles.

      (3) Mechanistic insight: The work dissects different modes of action, clearly demonstrating that inhibition can be driven by specialized metabolite production (CP8) or carbon resource competition (e.g., CP7), with lactate utilization identified as a key factor.

      Weaknesses:

      (1) Limited model generalizability: The predictive power of the NI model is not universal. It fails to account for the in vivo inefficacy of CP8 (a metabolite-dependent inhibitor) and cannot explain the short-term protection conferred by some non-inhibitory CPs in vivo, suggesting unmodeled mechanisms like immune priming are at play.

      (2) Preliminary nature of key findings: The emphasis on lactate consumption as a critical predictor, while interesting, is not sufficiently explored to establish its general importance beyond the specific strains and conditions tested.

      Appraisal:

      The authors successfully achieve their aim of establishing a rational probiotic-design pipeline. The data robustly support the conclusion that metabolic niche overlap predicts efficacy for many strains, while also clearly delineating the model's limitations, as acknowledged by the authors.

      Impact:

      This work provides a valuable methodological framework for hypothesis-driven probiotic discovery. The quantitative Niche Index offers immediate utility to the field and, with further refinement, has the potential to become a fundamental tool for developing respiratory therapeutics.

      Comments on revised version:

      I thank the authors for their meticulous revisions.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a potentially important integrative model linking spontaneous retinal waves, apoptosis, microglial activity, and vascular development during postnatal retinal maturation. Its significance lies in proposing a mechanistic framework that could reshape understanding of how neural activity and tissue remodeling are coordinated in the developing central nervous system. The evidence is strengthened by the use of multiple complementary techniques, including Ca++ imaging, high-throughput electrophysiology, transcriptomics, histology, and pharmacology.

      Strengths:

      (1) Multimodal Validation: The authors correlate large-scale functional imaging (calcium imaging and MEA) with high-resolution structural and molecular data (scRNA-seq and IHC), providing strong topographical evidence for the "centrifugal expansion" pattern.

      (2) The primary significance lies in identifying apoptotic Retinal Ganglion Cells (RGCs) as the physiological "pacemakers" for stage II retinal waves. By linking programmed cell death directly to neural activity and subsequent angiogenesis, the authors propose a self-regulating developmental loop.

      Weaknesses:

      (1) While the PANX1 pharmacological data provide compelling functional support, extending these conclusions to the broader CNS may be premature. Additional direct mechanistic validation would further strengthen the claim of causality.

      (2) While the manuscript beautifully illustrates the co-occurrence of events during retinal development, strengthening the distinction between correlation and direct causation would enhance the impact of the findings.

    2. Reviewer #2 (Public review):

      Summary:

      Savage et al. investigate the synchronization of retinal Ca2+ waves with developmental cell death, microglia activation, and vascular outgrowth. These developmental processes occur through a mechanism where apoptotic cells release ATP through Panx-1 channels to stimulate both Ca2+ retinal waves and microglia activation. Using scRNAseq, the authors classify autofluorescence cell clusters (ACCs) at the leading edge of vasculature outgrowth as Hmox-1+ microglia. From here, they show microglia engulfment of apoptotic RGCs, and the potential release of ATP may contribute to Ca2+ wave generation. The authors demonstrate these mechanisms through the use of two pharmacological agents to either block the ATP release from Panx-1 or block receptor binding to ATP. Furthermore, while previous studies have described the site of initiation of retinal Ca2+ waves as random, this study shows that the initiation of Ca2+ waves is biased to the leading edge of vascular growth in the developing retina. To do this, the authors use a combination of wide-field Ca2+ imaging and multi-electrode arrays to pinpoint the sites of Ca2+ wave initiation in the developing retina.

      Strengths:

      The authors use several techniques to interrogate these mechanisms, including single-cell RNAseq, wide-field Ca2+ imaging, and multi-electrode arrays. With these experiments, this manuscript proposes several novel ideas, such as ATP as the Ca2+ wave-initiating cue, and the localization of the Ca2+ wave initiation to the leading edge of vascular growth.

      Weaknesses:

      The main weakness of the manuscript is the overreliance on only two pharmacological agents to test the central hypotheses. These conclusions would be strengthened if, in addition to their pharmacological manipulations, they used genetic knockout models to perturb programmed cell death or ATP release (i.e., BAX-KO, Panx-1 KO).

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Lei and co-workers aim to uncover the genetic underpinnings of thermal adaptation across three strains of the diamondback moth (Plutella xylostella) through experimental evolution over three years under three different thermal regimes. They identify systematic differences in trait responses (e.g., survival, fecundity), metabolic profiles, gene expression, and in the amino acid sequence of the PxSODC gene, among others. These results suggest that the diamondback moth has a strong potential for rapid physiological adaptation to different thermal regimes. Overall, this is a comprehensive and generally well-executed study that addresses an important question in the face of ongoing climate change.

      Strengths:

      The authors employ multiple approaches to identify signatures of thermal adaptation across the three strains, such as trait performance comparisons, metabolomics, transcriptomics, and amino acid sequence comparisons. All these different angles form a convincing picture of the underlying factors that underpin thermal adaptation in this experimental system. The manuscript is also generally well written and easy to understand.

    2. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors set out to better understand the genetic mechanisms underlying thermal adaptation in insects. They experimentally evolved diamondback moth (Plutella xylostella) populations - a pest species with a wide distribution - under both hot (12h:12h 32°C/27°C) and cold (15°C/10°C) thermal conditions, and conducted phenotypic assays and metabolic and transcriptomic profiling to analyze how populations changed to deal with this thermal stress compared to the nonevolved ancestral population (constant 26°C). Phenotypic assays showed that evolved hot populations had increased survival at high temperatures (42-43°C) while evolved cold populations had lower freezing points compared to the ancestral population. When measured at the constant 26°C conditions, metabolic and transcriptomic profiles of 3rd instar larvae from the evolved population were distinctive from the ancestral population, with a set of overlapping metabolic and transcriptomic pathways that were significantly differentially expressed in both hot and cold evolved populations compared to the ancestral. The authors narrowed down this set of candidate genes further by focusing on genes with high expression levels overall, whose expression profile was correlated with differentially expressed metabolites, and that contained mutants in both hot and cold strains. From this set, they chose the PxSODC gene for further functional validation, as it has previously been shown to be involved in the response of insects to abiotic stress with its antioxidative role in cellular defense. At the constant 26°C, this gene showed lower expression across development in evolved strains compared to the ancestral population, while it showed similar expression patterns under thermal stress. Knockdown of PxSODC resulted in decreased survival rates at high temperatures and higher freezing points compared to the ancestral population. Based on this validation, the authors hypothesize that the non-synonymous mutation in the PxSODC gene that they found in the cold and hot evolved populations might alter the conformation of the PxSODC protein, increasing enzyme capacity. Their experimental evolution experiment furthermore indicates the capacity of the pest species, the diamondback moth, to adapt to a wide range of temperatures, providing insights into its capacity for global dispersal.

      Strengths:

      (1) The authors did a tremendous amount of work to characterize the mechanisms underlying thermal adaptation in the diamondback moth, artificially selecting populations for three years in the lab and characterizing how they evolved as a result at different biological levels: from phenotypes in different life stages, to larval metabolites and gene transcription, to functionally validating how one of the resulting gene candidates influences the capacity to deal with thermal stress.

      (2) The paper identifies and provides further evidence for candidate genetic mechanisms that might be particularly important for thermal adaptation in insects, including lipid metabolism, oxidoreductase activity, and DNA methylation. It is furthermore interesting that the authors found similar mechanisms to be involved in both the adaptation to cold and hot environments. Their functional validation of some of the genes involved in these mechanisms is very useful to understand how these genes might be causally involved in insect thermal adaptation.

      (3) The paper also has applied value: the diamondback moth is a pest species with a wide distribution, so understanding its adaptive capacity to different thermal environments is important for predicting the prevalence and potential further range expansion of this species under future climate change.

    1. Reviewer #1 (Public review):

      Summary:

      This research sheds light on the nuanced role of ABHD6 in regulating AMPARs, highlighting its interaction with TARP γ-2 as a critical factor in modulating receptor gating kinetics. It is crucial to understand that although ABHD6 alone does not alter AMPAR kinetics, its presence alongside TARP γ-2 accelerates AMPAR deactivation and desensitization, thereby affecting synaptic transmission dynamics.

      Strengths:

      Important findings in the research include:

      - ABHD6 does not affect the gating kinetics of GluA1 and GluA2(Q) homomeric receptors independently.
      - In the presence of TARP γ-2, ABHD6 accelerates deactivation and desensitization of these receptors, regardless of their splicing or editing isoforms.
      - The effect is consistent for both homomeric GluA1 and GluA2(Q) receptors and heteromeric GluA1i/GluA2(R)i-G receptors.
      - The recovery from desensitization of GluA1 with the flip splicing isoform is slowed by ABHD6 in the presence of TARP γ-2.

    2. Reviewer #2 (Public review):

      Summary:

      Cong et al. investigated the regulatory effects of ABHD6 on AMPARs. The authors performed adequate electrophysiology recordings to show the exact pattern of this regulation and covered major critical points.

      Strengths:

      The authors have performed high-quality ephys recordings and examined all potential regulatory aspects of ABHD6 on AMPARs. This is important to understand the AMPAR functions.

      Weaknesses:

      (1) The authors discussed CNIH-2 extensively in lines 92-110 of the introduction; however, they did not perform related experiments. I suggest they move this part to the discussion, where they also discuss the roles of CNIH.

      (2) The authors need to report the "n" for all the experiments they have presented in this manuscript. How many cells were recorded in each condition? How many batches? This information has to be in all of the figure legends, but it is missing from all except Fig. 4.

      (3) One question is what the physiological meanings of this regulatory effect are. The authors may consider adding some discussions.

      (4) About statistics. The authors need to add more details and make sure their statistics are sound. For example, they also need to check the equality of variances. In their Table EVs, where the P values are reported, the authors need to report which statistical tests they used (one-way ANOVA, K-W test, or others) and the exact post-hoc test type for each comparison. For one-way ANOVA, report the F values together with the P values in all figure legends.

      (5) Fig. 3J, the authors need to correct the label of the Y axis. It is shifted.
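
      The quantities requested in point (4) can be illustrated with a minimal sketch. It assumes made-up measurements for three hypothetical conditions (not data from the manuscript) and uses only the Python standard library, so it returns the F statistic and degrees of freedom but not P values, which need an F distribution from a statistics package. The function names are illustrative. The variance check shown is the Brown-Forsythe variant of Levene's test, which is simply a one-way ANOVA on absolute deviations from each group's median.

```python
from statistics import mean, median


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Assumes at least two groups and non-zero within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def brown_forsythe_w(groups):
    """Equality-of-variance statistic: ANOVA F on |x - group median|."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_anova_f(deviations)


# Hypothetical values for three recording conditions (illustration only).
conditions = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_value, df1, df2 = one_way_anova_f(conditions)
w_value, w_df1, w_df2 = brown_forsythe_w(conditions)
print(f"ANOVA: F({df1}, {df2}) = {f_value:.2f}")
print(f"Brown-Forsythe: W({w_df1}, {w_df2}) = {w_value:.2f}")
```

      Reporting the statistic with its degrees of freedom, as in the printed strings, is the form a figure legend would typically take. A large W would argue for a rank-based alternative such as the K-W test.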

      Comments on revised version:

      In the revised manuscript, the authors have addressed all my concerns. The manuscript has been substantially strengthened by additional data and discussion.

    1. Reviewer #1 (Public review):

      Summary:

      Brunsdon et al. present a zebrafish model of mosaic PIK3CA activation to investigate mechanisms underlying PIK3CA-related overgrowth spectrum (PROS), with a particular focus on non-cell-autonomous mechanisms of tissue overgrowth. The study is timely and addresses an important gap in the understanding of how mosaic activation of PI3K signaling leads to tissue-specific developmental abnormalities.

      Using a Tol2-based mosaic expression system combined with single-cell transcriptomics, the authors provide evidence suggesting that mutant PIK3CA-expressing cells influence surrounding wild-type tissues through indirect signaling mechanisms, contributing to vascular malformations and tissue overgrowth.

      Overall, the work presents an interesting and potentially impactful model for studying mosaic PIK3CA-driven overgrowth and non-cell-autonomous signaling mechanisms. However, several aspects require clarification, additional controls, and improved presentation to strengthen the mechanistic conclusions and overall impact of the study.

      Strengths:

      This study addresses an important and timely question by investigating the mechanisms underlying mosaic PIK3CA activation in the context of PROS, a condition for which developmental mechanisms remain poorly understood. The use of a mosaic zebrafish model is particularly appropriate, as it closely reflects the mosaic nature of PIK3CA mutations observed in patients and allows the investigation of non-cell-autonomous effects.

      Another major strength of the study is the integration of single-cell transcriptomics, which provides valuable insight into potential signaling pathways involved in indirect tissue overgrowth and offers a rich dataset for hypothesis generation. The authors also propose an interesting conceptual framework in which PI3K-activated cells influence surrounding tissues through paracrine signaling, which could have broader implications beyond PROS and contribute to understanding mosaic developmental disorders more generally.

      Finally, the work has potential translational relevance, as identifying mechanisms driving mosaic PI3K activation and non-cell-autonomous signaling could inform future therapeutic strategies for PROS and related conditions.

      Weaknesses:

      Despite these strengths, several aspects of the study require clarification and additional experimentation.

      Major comments:

      (1) The Tol2-based system results in mosaic overexpression of mutant PIK3CA in the presence of endogenous wild-type PIK3CA, making it difficult to determine how co-expression of WT and mutant proteins influences the observed phenotypes. While mosaic expression is relevant to PROS, a complementary approach in which endogenous PIK3CA is knocked out prior to introducing mutant variants would allow clearer interpretation of mutant-specific effects.

      (2) The authors do not clearly describe the validation of editing or integration efficiency. It would be important for the authors to clarify whether sequencing was performed to confirm integration, to quantify the proportion of mosaic expression, and to measure transgene expression levels. These controls would strengthen confidence in the model and interpretation of the results.

      (3) The manuscript would benefit from rescue experiments to strengthen causal conclusions. It remains unclear whether the phenotypes induced by PIK3CA PROS variants can be rescued, either through expression of wild-type PIK3CA, pharmacological inhibition of PI3K signaling, or assessment of developmental reversibility. Such experiments would strengthen the link between PI3K activation and the observed phenotypes.

      (4) The authors propose candidate signaling molecules mediating non-cell-autonomous effects downstream of PI3K hyperactivation; however, these conclusions remain speculative, as no functional validation is provided. Testing selected candidate mediators identified in the RNA-seq dataset would significantly strengthen the mechanistic conclusions.

    2. Reviewer #2 (Public review):

      In this manuscript, Brunsdon et al. aim to study PIK3CA-related overgrowth spectrum (PROS) by establishing a mosaic zebrafish model with overexpression of pik3ca carrying hotspot mutations, coupled with an mScarlet+ reporter. Using fluorescence microscopy, the authors demonstrated that overexpression of pik3ca with a number of hotspot mutations led to mesodermal and particularly vascular malformations in the zebrafish model. Interestingly, they found a paucity of mScarlet+ mutant cells in the vascular lesions, consistent with the finding of low PIK3CA mutation burden in PROS tissue. Such data suggest a non-cell-autonomous effect of PIK3CA mutation. Following this logic, the authors performed single-cell RNA-Sequencing on zebrafish overexpressing WT pik3ca and mutant pik3ca at 19 hpf, and demonstrated widespread transcriptomic perturbations across multiple lineages, including lineage frequencies, key cell pathways, and cell-cell interactions. Importantly, they demonstrate that mScarlet+ cells carrying mutant pik3ca cluster separately from other cell types, do not demonstrate clear lineage identity, and have a general downregulation in signaling components.

      Overall, the conclusions in the manuscript are well-supported by the presented data. The imaging studies are particularly convincing. The transcriptomic analysis generated a list of potential pathways to further investigate and potentially target with future therapeutic interventions. Importantly, this study provides a valuable in vivo model of PROS that: 1) recapitulates key features of PROS (e.g., multiple mesodermal defects, paucity of mutation burden in lesions suggesting non-cell-autonomous interactions); 2) is scalable; and 3) offers direct visualization of lesion development, compatible with time-course live imaging. This model will be valuable to further understand PROS and potentially study other diseases where the PIK3CA pathway is altered (e.g., certain cancers).

      The following are not necessarily weaknesses of the data, but rather suggestions where the manuscript could be further strengthened:

      (1) The model recapitulates the variability of mesodermal lesions in PROS. It would be valuable to utilize this model to further study factors that are associated with the development of more severe lesions (e.g., by comparing samples with more severe lesions to those unaffected despite carrying the mutations, Figure 1F).

      (2) ScRNA-seq analysis could be enriched with a comparison between cells overexpressing mutant pik3ca vs. those overexpressing WT pik3ca.

      (3) In the scRNA-Seq analysis, it is curious that the C0 cluster, enriched with mScarlet+ cells, is found to have downregulated signaling interactions (Fig. 5C), yet exerts a widespread non-cell-autonomous effect. Meanwhile, there is also a noticeable loss of certain lineages (e.g., notochord, Figure 4E) and related cell-cell interactions (e.g., notochord-related interaction, Figure 5A). A deeper exploration of the basis of the non-cell-autonomous effect would be valuable.

      (4) The scRNA-Seq analysis was performed at one time point (19 hpf). Additional analysis (not necessarily by scRNA-Seq) at other time points to study whether findings at 19 hpf are persistent throughout development or undergo dynamic changes (e.g., cell fate/state of mSc+ mutant cells) would be helpful.

      (5) The scRNA-Seq analysis provides a valuable list of perturbed interactions that could be targeted by future therapeutic approaches. Validation of the scRNA-Seq findings with protein-level analysis, and studying the effect of targeting some of the pathways on the disease phenotype, would offer valuable data for the community.

    3. Reviewer #3 (Public review):

      Summary:

      The study "PIK3CA-related overgrowth spectrum (PROS) zebrafish models reveal pan-lineage developmental dysregulation" presents important findings that extend significantly beyond a single subfield, bridging developmental biology, vascular medicine, and cancer-related PI3K signalling. By developing mosaic zebrafish models of PROS and combining live imaging with single-cell transcriptomics, the authors provide compelling evidence for a non-cell-autonomous mechanism of tissue overgrowth, a conceptual shift with meaningful therapeutic implications.

      Strengths:

      The evidence is overall convincing, with methodology appropriate and well-validated relative to the current state of the art; the integration of multiple approaches (in vivo modelling, scRNA-seq, ligand-receptor inference) strengthens the central claims. However, some aspects of the proposed non-cell-autonomous signalling mechanisms remain partly correlative, and direct functional validation of the rewired ligand-receptor interactions would further consolidate the conclusions.

      Weaknesses:

      The transgenic overexpression approach chosen by the authors represents a well-established and effective strategy for generating mosaic models in zebrafish. However, this approach introduces notable limitations: the lack of control over transgene dosage and unknown integration sites may generate non-physiological effects, potentially confounding the interpretation of key findings.

      The authors are certainly aware that alternative approaches (though technically more demanding) could be considered in future studies to further strengthen the model. For instance, a CRISPR/Cas9-mediated knock-in of the pik3ca-PROS allele at the endogenous locus (retaining upstream native regulatory elements with only a minimal promoter in the construct, co-expressed with a fluorescent reporter via P2A) could allow even more physiological, lineage-restricted expression while enabling direct visualisation of mutant cells. Mesodermal specificity could potentially be further refined by driving mosaic Cas9 expression under a pan-mesodermal tbx promoter, restricting editing to the relevant lineage while simultaneously marking mutant cells fluorescently, thus even more closely mimicking the post-zygotic mutational events characteristic of PROS. As a complementary strategy, blastula transplantation experiments using pik3ca-PROS donor cells (ideally co-expressing a distinct fluorescent marker such as mCherry) into fli1:GFP transgenic hosts could provide a powerful and technically consolidated approach to directly visualise and quantify non-cell-autonomous effects on host vasculature, with precise control over mutant cell burden. This combinatorial framework, separating donor mutant cells from host tissue in a two-colour imaging setup, could be particularly compelling for validating the ligand-receptor rewiring predicted by single-cell transcriptomics in future investigations.

      These reflections are offered in the spirit of prospective methodological development and do not diminish the value of the current work, which opens a valuable new avenue for therapeutic investigation, suggesting that targeting indirect overgrowth-propagating signals, alongside PI3K inhibition, deserves serious consideration.

    1. Reviewer #1 (Public review):

      Summary:

      The authors performed seqFISH in 26 gastruloids and performed a variety of computational analyses on these novel spatial data sets. Whilst the data are valuable and the computational concepts useful (exposure index, L-metric, ... ), the article falls short on novelty and is written in very clunky language, often with contradictory conclusions.

      Major issues:

      (1) The authors did well in explaining and detailing the provenance of the data and the individual experiments performed. However, their 26 gastruloids still constitute a very limited sample of their total organoids: one experiment pooled 4 plates at an 80-94% success rate, and 6 different aggregation experiments were done, giving a total of 1843 gastruloids, of which 26 (~1.4%) were sampled. A simple IF stain of 2-3 markers in a bigger sample could have given a more accurate picture of specific domains of interest and their proximity. Regardless, more information should be given about the existing samples: variation across experimental batches, and differences between the 300-cell and 100-cell gastruloids that were used.

      (2) The language of the manuscript should be revised. Overall, the manuscript is very long and descriptive, and its written "impressions and beliefs" are often not adequately justified and can indeed be contradictory. For example, in Section 1 the title states "cell types' locations ...are consistent", yet a few sentences later we find "there was substantial variation" and "within range of what would be considered a 'morphologically normal' gastruloid". Expressions such as "quite consistent", "compelling patterning", and "we don't believe" are best avoided and replaced with data, or else supported with quantitative values such as percentages when a given cutoff is used. Another example: "location of each cell type relative to gastruloid morphology was quite consistent the posterior region ... mainly consisted in NMPs." Given T expression in the posterior, this result, phrased as such, appears quite inflated. In fact, looking at the cell types in Figures S1 and 2a/b/c, this reviewer would state that they are anything but consistent, and indeed it takes sophisticated analyses to find a pattern (of sorts) beyond the expected coarse domains.

      (3) Figure 6 is one of the most valuable parts of the work, as the authors use the battery of analyses developed to investigate the variable and not-so-robust endothelial clusters in gastruloids. However, this investigation is still very preliminary, and it should be further linked with known biology. It is still unclear what the unique organization of this cell type is (circularity isn't convincing) and whether any signalling cues of adjacent cells could explain it. Is there any evidence that more mature endodermal cell types are generated (like the suggested "liver") to give rise to endothelial cells? It would certainly be interesting to perform IF for this cell type together with mesodermal and endodermal markers to validate seqFISH predictions on a bigger sample.

      (4) Figures 1c and 6b need statistical significance assessments.

      (5) The article should include an analysis of Hox colinearity expression in these gastruloids as a validation of the system.
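
      The sampling concern in point (1) can be made concrete with a short calculation. The sketch below uses the two counts stated above (1843 gastruloids in total, 26 profiled). The count of 13 positive gastruloids is a purely hypothetical example, chosen only to show how wide the uncertainty on any per-gastruloid proportion is at n = 26. The interval is the standard Wilson score interval, and the function names are illustrative.

```python
import math

TOTAL_GASTRULOIDS = 1843  # total across the 6 aggregation experiments
SAMPLED = 26              # gastruloids profiled by seqFISH


def sampling_fraction(sampled: int, total: int) -> float:
    """Fraction of the generated gastruloids that were actually profiled."""
    return sampled / total


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


fraction = sampling_fraction(SAMPLED, TOTAL_GASTRULOIDS)
print(f"Sampled fraction: {fraction:.2%}")

# Hypothetical count: a feature scored as present in 13 of the 26 gastruloids.
low, high = wilson_interval(13, SAMPLED)
print(f"95% interval for 13/26: {low:.2f} to {high:.2f}")
```

      The sampled fraction comes out at about 1.4%, and the interval for the hypothetical 13/26 case spans roughly 0.32 to 0.68, which illustrates why validation on a larger sample would be informative.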

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents an ambitious and technically challenging spatial-transcriptomic atlas of 26 gastruloids using seqFISH. The authors introduce quantitative metrics (mixing score, exposure index, L-metric / scL-metric, spatial L-metric, triplets) to characterize spatial organization at multiple scales. The dataset is valuable, and several analyses are original, particularly the rank-based L-metric family for mutual exclusivity.

      Strengths:

      The authors generate one of the most detailed spatial transcriptomic datasets of gastruloids to date. They propose creative computational metrics (L-metric/scL-metric) to quantify mutual exclusivity of gene expression without predefined thresholds, and they explore organizational principles from single-cell topology to cluster-level structure. Many observations align well with known gastruloid biology, such as posterior robustness and anterior variability. The writing is generally clear, and the figures are rich.

      Weaknesses:

      Several central claims rely on metrics whose computation and justification are insufficiently explained, making it difficult to assess how robust or interpretable the results are. Many choices in the analysis appear arbitrary or are insufficiently motivated (normalization schemes, choice of parameters such as the number of neighbors, the distance cutoffs, hierarchical clustering setup, and so on). The interpretations of spatial consistency, gene-program inference, and endothelial heterogeneity are plausible but might be stronger than the evidence currently supports.

      The manuscript would benefit from stronger benchmarking, quantification of uncertainty, and explicit controls for known artifacts in spatial transcriptomics (e.g., spillover, 2D slicing, cell type assignment entropy). The biological insights are promising, but since several depend on methodological assumptions that have not yet been demonstrated to be stable, they would benefit from clearer methodological explanation.

      The work is rich and could become a reference dataset. Clarifying and validating the quantitative methods would therefore considerably strengthen the impact and reliability of the conclusions.

    3. Reviewer #3 (Public review):

      Summary:

      Triandafillou and colleagues report a single-cell resolved spatial atlas of gene expression of 26 gastruloids. While previous work had analyzed either single-cell gene expression or spatially coarse-grained patterns of gene expression (van den Brink et al, 2020), the authors here use multiplexed sequential RNA FISH (seqFISH) to create the first gastruloid atlas, which is simultaneously spatially and cellularly resolved. This atlas adds to a growing list of resources cataloging gastruloid development (see also Suppinger et al 2023).

      To analyze this dataset, the authors also describe a novel analytical framework. Their analysis centers around the 'L-metric', which measures the degree to which pairs of genes are either coexpressed or mutually exclusive. While this metric is similar to calculating correlations in gene expressions, it has important differences (including that it can, in principle, be asymmetric; although the authors symmetrize much of their analysis). In addition to the gene-centric L-metric analysis, the authors also analyze cells in their dataset according to the cell type entropy (an information-theoretical measure of confidence in cell type assignment) and the 'exposure index' (a measure of the similarity of nearest cellular neighbors).
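
      To make the cell type entropy concrete, here is a minimal sketch that assumes it is the Shannon entropy of each cell's cell-type assignment probabilities. The manuscript's exact definition is not reproduced in this review, so the function name, the use of bits, and the example probabilities are all assumptions for illustration only.

```python
import math


def assignment_entropy(probabilities):
    """Shannon entropy (in bits) of one cell's cell-type assignment probabilities.

    Low entropy means a confident assignment; the maximum, log2(number of
    types), means the cell is equally compatible with every type.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    normalized = [p / total for p in probabilities]
    # Zero-probability types contribute nothing to the entropy.
    return sum(-p * math.log2(p) for p in normalized if p > 0)


# Hypothetical assignment probabilities over four cell types.
confident_cell = [0.97, 0.01, 0.01, 0.01]
ambiguous_cell = [0.25, 0.25, 0.25, 0.25]

print(assignment_entropy(confident_cell))
print(assignment_entropy(ambiguous_cell))
```

      Under this assumption, a cell split evenly across four types reaches the maximum of 2 bits, while a confidently assigned cell scores close to 0.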

      Using this framework, the authors focus their analysis on two major features of development. The first is the differentiation of the bipotent neuromesodermal progenitor (NMP) cells in the posterior of the gastruloid into either presomitic mesoderm (PSM) or spinal cord (SC) lineages. They use L-metric analysis to compare overlap in marker genes used to separate NMP, PSM, and SC fates. They highlight that L-metric analysis can recover spatial patterns of gene expression (without explicit spatial information) and discern subtle features of marker genes beyond simple binning of cell types (e.g., that Epha5 expression in anterior NMPs may predict future SC differentiation).

      The second is the formation of endothelial (spatial) clusters within the gastruloid. The authors highlight two subtypes of endothelial clusters: (1) smaller clusters within the somitic anterior region, and (2) larger clusters associated with endoderm. While the authors discern some subtle differences in gene expression between these two clusters, their different spatial patterns suggest a potential physiological difference that would not be captured in traditional droplet microfluidic-based scRNAseq pipelines.

      Overall, this manuscript is a sophisticated and technically sound study that will provide a valuable beachhead for future studies of developmental patterning in gastruloids and organoids.

      Strengths:

      The major strengths of this study are the overall technical sophistication of the data set and analysis, as well as its potential generalizability to other developmental systems (both in vitro and in vivo). The data are extensively analyzed and reasonably interpreted, and this atlas makes good use of the variability in gastruloid development to extract the statistical structure of developmental processes. The L-metric offers a parameter-free tool to analyze transcriptomic datasets that could overcome the pitfalls of other approaches.

      Weaknesses:

      The major limitations of this study are the depth and novelty of the developmental processes studied. The authors provide very convincing proof-of-concept that their data set can recover known features of gastruloid development, including NMP differentiation and endothelial development. However, further analysis and/or investigation would be required to discover new principles of gastruloid development and patterning.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript entitled "Essential function reflected in the phylodynamics of a multigene family - the pir genes of malaria parasites" by Jackson and colleagues investigates the global phylogeny of pir genes across 14 Plasmodium species and one Hepatocystis species. The authors also focus on the functional characterization of the conserved ortholog pirC1 and claim that pirC1 is not the founder of the family and that it plays an essential role in blood-stage growth.

      Strengths:

      Overall, the manuscript is well written and interesting, as it combines comparative genomics and evolutionary analysis with functional experiments. The phylogenetic analysis is rigorous and represents a major strength of the manuscript.

      Weaknesses:

      The general conclusions regarding the potential function of this gene family are not fully supported by the data presented. The manuscript moves too quickly from growth phenotype and localization studies to a specific mechanistic model. The discussion argues that PIRC1 may be involved in nutrient acquisition, host sensing, or metabolic support, but the data provided do not directly support these functions, and the manuscript in its present form remains speculative. Although the manuscript includes some experimental results, it lacks direct mechanistic validation of the specific functions of the pir genes, including pirC1. In its current form, the study does not yet establish a definitive role for pirC1 in metabolic processes.

    2. Reviewer #2 (Public review):

      Summary:

      This is an extensive study using phylogenetic comparison across multiple Plasmodium species to gain new insights into their evolutionary pathways and the potential function of pir. In addition to establishing a framework to identify related orthologues across species as well as expanding paralogue families within a species, the work also focuses on understanding loss and gain of different PIRs and how this indicates a relative lack of functional constraints and essentiality for most members of the gene family.

      The authors provide evidence that at least pirC has a conserved function and plays an important role in parasite growth in multiple species.

      While this study represents a significant effort and does provide interesting new insights that would help our understanding of this complex gene family in the future, it has a number of limitations.

      Strengths:

      Extensive and thorough phylogenetic analysis that is supported by some biological validation. Provides an indication that the PIR gene family has limited biological constraints and evolved independently across different species, leading to rapid expansion and deletion of orthologous groups. Identified pirC as a functional and important member of the family that is conserved across the species.

      Weaknesses:

      The phylogenetic tree is based on a truncated sequence that focuses on the more conserved parts of the pir sequence. This could potentially lead to missing the key functional drivers of evolution. The biological validation of the role of pirC has some inconsistencies that need to be addressed.

    3. Reviewer #3 (Public review):

      This paper aims to classify, from an evolutionary perspective, the multigene family PIR found in malaria parasites infecting rodents and Old World monkeys, and to link this classification to functional diversification. The authors also hypothesize that PIR members conserved across species play important roles in parasite survival, and seek to clarify their functions.

      To achieve these aims, the authors comprehensively analyze the evolution of PIR genes using genomic and transcriptomic information from many malaria parasite species. They focus on PIRC1, a member conserved across species, and attempt to clarify its function in rodent and simian malaria parasites by examining the phenotypes of parasites in which the corresponding genetic locus has been disrupted. They also attempt to determine its localization using PIRC1 tagged with an epitope sequence. However, although the locus-disrupted parasites appear to show an approximately 50% reduction in growth rate, this effect seems to be overestimated. Another weakness is that the cause of the reduced growth rate has not been clarified. The localization analysis also remains insufficiently conclusive.

      Therefore, I consider that the first half of the paper, consisting of the bioinformatics analyses, achieves the objective of comprehensively summarizing PIR and may become a reference paper for discussing the evolution and function of the PIR gene family. On the other hand, regarding the function of PIRC1, no clear conclusion can be drawn from the results presented, and several additional experiments are necessary.

      My major comments are as follows.

      (1) The claim that the failure of eight disruption attempts indicates that pirC1 is essential is too strong.

      Lines 319-321: The authors argue that a total of eight failed attempts to disrupt the pirC1 locus using two different construct designs suggest that pirC1 is essential in P. berghei. However, the failure of these attempts could also reflect technical issues with the construct design itself, such as the length of the homologous regions used for recombination, which are approximately 650 bp. Therefore, it is an overstatement to conclude that "pirC1 is essential for P. berghei blood-stage growth." Given that parasites with disruption of the corresponding locus could be obtained in both P. chabaudi and P. knowlesi, a more appropriate statement would be that "pirC1 is important for P. berghei blood-stage growth."

      (2) The data on the mCherry-expressing P. berghei line shown in Supplementary Figure 11 are insufficient.

      (a) Panel C: Southern blot analysis

      To conclusively identify the lower band in panel C as chromosome 1, additional probes specific to genes located on chromosomes 1 and 2 would be required. In addition, a parental parasite control should also be included. The Southern blot image of the parental parasite should show only a single band at the higher position, with no band at the lower position. Probes specific to chromosomes 1 and 2 would help demonstrate that the lower band corresponds to chromosome 1, rather than chromosome 2.

      To this end, the authors could describe the result as follows:

      "In the parental parasite, only a single band corresponding to chromosome 7 was detected, indicating that the smaller chromosome was genetically modified. The size of the lower band detected with the dhfr probe was identical to that of the band detected with the control chromosome 1 probe, but distinct from that detected with the chromosome 2 probe, indicating that chromosome 1 was modified."

      That said, this chromosome-level Southern blot analysis is not sufficient to demonstrate that the target PBANKA_0100500 locus was specifically modified. The authors should provide more direct evidence showing that the PBANKA_0100500 locus, rather than another genomic locus, was modified. For example, Southern blot analysis after restriction enzyme digestion would provide more definitive evidence. Diagnostic PCR may also provide more specific evidence.

      (b) Panel D: Flow cytometry analysis

      To allow a more accurate interpretation of the percentage of mCherry-positive cells, flow cytometry data for the parental parasite line should also be presented.

      (3) There are unclear points in the PCR results shown in Supplementary Figure 12.

      Supplementary Figure 12: In panel B, a PCR product should also be amplified from dPCHAS_0101200 using the P1-P3 primer pair. Why is this band absent? The authors should provide the uncropped electrophoresis image so that the larger band can be seen. In addition, if labels 1 and 2 indicate independent clones, this should be stated in the figure legend.

      (4) The growth rates of P. chabaudi and P. knowlesi parasites with disruption of the PIRC1 gene locus should be quantitatively analyzed.

      The growth rates of P. chabaudi and P. knowlesi are described only qualitatively, but they should be evaluated quantitatively. In Figure 4A, the parasitemia of wild-type P. chabaudi increases from approximately 4.1% on day 6 to approximately 15.6% on day 8, corresponding to a 3.8-fold increase. However, because parasite growth may already be affected by immune-mediated suppression at this stage, this value should be regarded as a minimum estimate. In contrast, the mutant increases from approximately 3.2% on day 8 to approximately 6.8% on day 10, corresponding to a 2.1-fold increase. Based on these values, the growth rate of the mutant appears to be reduced to at least approximately 56% of that of the wild type. Similarly, from the growth curve of P. knowlesi in Figure 5A, the DMSO-treated group appears to increase approximately two-fold per day, whereas the rapamycin-treated group increases only approximately one-fold per day. Thus, P. knowlesi also appears to show an approximately 50% reduction in growth rate. Taken together, both P. chabaudi and P. knowlesi appear to reproducibly show an approximately 50% reduction in growth capacity. A reduction of this magnitude is difficult to describe as a "severe growth defect"; a more appropriate wording would be simply that the parasites "showed a growth defect." In addition, the terms "a severe growth defect" and "essential" appear to be overstated throughout the manuscript, and the wording should be toned down. Finally, I recommend presenting Figure 4A and Figure 5A on a logarithmic scale so that the trend in growth rates can be more intuitively appreciated from the graphs.
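      The fold-change comparison in comment (4) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it takes the mutant parasitemia values and the wild-type 3.8-fold increase as quoted above, and the per-day factors additionally assume constant exponential growth over each two-day interval, which is an assumption of this sketch and not a result from the manuscript. All function and variable names are hypothetical.

```python
import math


def fold_change(start_pct, end_pct):
    """Fold increase in parasitemia between two time points."""
    return end_pct / start_pct


def per_day_factor(fold, days):
    """Daily multiplication factor, assuming constant exponential growth."""
    return fold ** (1.0 / days)


# Values quoted in the review (approximate readings from Figure 4A).
mutant_fold = fold_change(3.2, 6.8)  # mutant, day 8 to day 10
wild_type_fold = 3.8                 # wild type, day 6 to day 8, as quoted

# Ratio of two-day fold changes (mutant relative to wild type).
relative_growth = mutant_fold / wild_type_fold

# Per-day factors and log2 slopes; a logarithmic axis displays the latter
# as the slope of the growth curve.
mutant_daily = per_day_factor(mutant_fold, 2)
wild_type_daily = per_day_factor(wild_type_fold, 2)
mutant_slope = math.log2(mutant_daily)
wild_type_slope = math.log2(wild_type_daily)

print(f"mutant fold change: {mutant_fold:.2f}")
print(f"mutant / wild type: {relative_growth:.2f}")
print(f"per-day factors: {mutant_daily:.2f} vs {wild_type_daily:.2f}")
print(f"log2 slopes per day: {mutant_slope:.2f} vs {wild_type_slope:.2f}")
```

      Plotting parasitemia on a logarithmic axis, as recommended, would make these slopes directly visible as the steepness of each curve.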

      (5) The evidence that disruption of the PIRC1 gene locus in P. knowlesi does not affect erythrocyte invasion is weak.

      The authors describe that "the developmental cycle of the parasites lacking PIRC1 is slightly longer than that of parasites that produce PIRC1 (lines 383-384)," and appear to support this interpretation with data showing that "mutant parasites are significantly smaller than wild-type parasites (line 414)" and that "the DNA content in ML10-arrested parasites lacking PIRC1 is lower than that of DMSO-treated parasites (lines 417-418)" at 24 hours after invasion. However, a slightly longer developmental cycle alone does not seem sufficient to explain a 50% growth reduction.

      I think the erythrocyte invasion capacity has not been quantitatively evaluated, and therefore, the evidence supporting the conclusion that the phenotype of P. knowlesi parasites with disruption of the PIRC1 gene locus is unrelated to erythrocyte invasion is weak. The authors should assess invasion efficiency using purified merozoites. For P. chabaudi, it should also be possible to apply an in vitro or in vivo erythrocyte invasion assay similar to that used for other rodent malaria parasites, and this should be evaluated as well.

      (6) The authors should examine whether disruption of the PIRC1 gene locus results in a phenotype characterized by a reduced number of merozoites.

      Alternatively, the reduced DNA content in ML10-arrested parasites lacking PIRC1 (lines 416-417) could suggest that the number of merozoites formed per schizont may be reduced. To clarify this point, the authors should assess whether the number of merozoites per schizont is altered in P. knowlesi (and P. chabaudi) parasites lacking PIRC1.

      (7) The authors propose the possibility that PIRC1 expressed in merozoites is released after invasion; however, the evidence that PIRC1 localizes to intracellular organelles is weak.

      Line 333: "a peripheral pattern around the parasite" is indicative of the parasite plasma membrane, PV, or PVM. The phrase ", indicative of a parasitophorous vacuole (PV) or parasitophorous vacuole membrane (PVM) location" should therefore be amended to ", indicative of parasite plasma membrane, a parasitophorous vacuole (PV) or parasitophorous vacuole membrane (PVM) location". In the Figure S14 image, red signals are detected uniformly from the merozoites formed in the schizont-stage parasite (not really in microorganelle patterns), but not from the PVM surrounding the schizont, suggesting parasite plasma membrane localization rather than PVM localization. I agree that the signal is detected from the compartments extending into the iRBC cytosol, which may be difficult to explain if the protein is located on the parasite plasma membrane, but how frequently were such images seen?

      Figure 4D: In the images of liver-stage schizonts, AMA1 does not appear to localize to the micronemes in mature merozoites, suggesting that this image shows an immature schizont. Although PIRC1 appears to be expressed in liver-stage schizonts, it is difficult to clearly determine whether it localizes to intracellular organelles or to the parasite plasma membrane.

      To clarify the above points, the authors should examine whether PIRC1 is detected in intracellular organelles or around the merozoites by analyzing its localization in purified merozoites.

    1. Reviewer #2 (Public review):

      Summary:

      Protein synthesis (translation) involves repeated recognition and incorporation of aminoacyl-tRNAs by the ribosome. This process is a trade-off between the rate and accuracy of selection (for review see Johansson et al, 2008; Wohlgemuth et al, 2011). The ribosome does not just maximise the rate or the accuracy; it balances the two. Therefore, it is possible to select mutants that translate faster than the wt (but are sloppy) or that are very accurate (more than the wt) but translate slower. Slow translation is detrimental as it limits the rate of protein synthesis (and, therefore, growth), whereas error-prone mutants accumulate mis-translated proteins, which is detrimental for the cell.

      Bi and colleagues employ genetics, MIC measurements, reporter assays and structural biology to characterise the role of GidB rRNA methylase in translational accuracy in Mycobacterium smegmatis.

      Strengths:

      The genetics and phenotypic assays are convincing and establish the biological role of the methylase. The authors use a powerful set of complementary assays that convincingly demonstrates that the loss of GidB results in mistranslation.

      Weaknesses:

      Cryo-EM analysis of vacant 70S ribosomes is not sufficient for understanding the mechanisms underlying the accuracy defects in the gidB KO. Ideally, one should assemble near-cognate and non-cognate complexes and solve their structures.

      References:

      Johansson M, Lovmar M, Ehrenberg M (2008) Rate and accuracy of bacterial protein synthesis revisited. Curr Opin Microbiol 11: 141-147

      Wohlgemuth I, Pohl C, Mittelstaet J, Konevega AL, Rodnina MV (2011) Evolutionary optimization of speed and accuracy of decoding on the ribosome. Philos Trans R Soc Lond B Biol Sci 366: 2979-2986

    1. Reviewer #1 (Public review):

      [Editors' note: This version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review satisfactorily and toned down the claims as advised.]

      In this manuscript, the authors investigate the relationship between genetic codes and their robustness to single-point mutations. They construct ten alternative genetic codes by reassigning nine codons to Leu, Ser, or Ala, and assess mutational robustness using three reporter proteins subjected to error-prone PCR. This represents an interesting experimental approach to addressing the hypothesis that the standard genetic code is optimized for mutational robustness.

    2. Reviewer #2 (Public review):

      The study addresses a long-standing question in molecular biology and genetics: why has nature selected the current genetic code (SGC, or standard genetic code)? The authors have tested 'error minimization theory', one of the prevailing hypotheses to explain this. Their approach is to create a minimum genetic code (MGC) and its variants (3^9 theoretically possible codes). Using three parameters to quantify the effect of mutations (polarity, volume, and hydropathy), they computationally test the cost of these genetic codes (3^9) by simulations. Finally, they test this cost experimentally using an in vitro translation system with 10 selected genetic code variants with a range of costs (low to high). They use three randomly mutated reporter genes for this purpose: beta-galactosidase, luciferase, and mSG. They find no correlation between the cost of the genetic code and the reporters' output. Based on these observations, they suggest that error-minimization theory may not explain the current genetic code.
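      The figure of 3^9 follows from independently assigning each of nine vacant codons to one of three amino acids (Leu, Ser, or Ala, as named in the Reviewer #1 summary). The enumeration below is a minimal sketch of that counting argument only; the codon labels are placeholders, and no cost function is implemented because the reviews do not specify one.

```python
from itertools import product

# Amino acids named in the reviews; codon labels are hypothetical placeholders.
AMINO_ACIDS = ("Leu", "Ser", "Ala")
VACANT_CODONS = tuple(f"codon_{i}" for i in range(1, 10))

# Each candidate code maps every vacant codon to one of the three amino acids.
candidate_codes = [
    dict(zip(VACANT_CODONS, assignment))
    for assignment in product(AMINO_ACIDS, repeat=len(VACANT_CODONS))
]

print(len(candidate_codes))  # 3**9 candidate codes
```

      A cost based on polarity, volume, or hydropathy would then be evaluated for each entry of `candidate_codes`.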

      The question they are asking is very exciting, and their approach is solid. The authors are very careful in their analyses and conclusions.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Miyachi and Ichihashi investigate whether the arrangement of the genetic code affects mutational robustness. Using an in vitro minimal genetic code with vacant codons, they constructed 10 non-standard genetic codes by reassigning Ala, Ser, and Leu, generating codes with replacement costs that were generally higher than those of the standard genetic code across several amino acid property measures. They then tested how random mutations affected the activity of reporter proteins translated under these altered codes. Although error minimization theory predicts that higher-cost codes should make mutations more harmful, the authors report that protein function declined to a similar extent across all codes examined, suggesting that mutational robustness remains largely unchanged within the range of genetic code alterations tested here.

      Strengths:

      This is an interesting study that investigates one of the most fundamental and intriguing questions in molecular evolution: the emergence of the genetic code, which is nearly universal across nature. The in vitro approach is a powerful aspect of the work and provides an opportunity to examine this phenomenon experimentally at a depth that has previously been inaccessible.

    1. Reviewer #1 (Public review):

      Summary:

      In their article, Guo and coworkers investigate the Ca²⁺ signaling responses induced by Enteropathogenic Escherichia coli (EPEC) in epithelial cells and how these responses regulate NF-κB activation. The authors show that EPEC induces rapid, spatially coordinated Ca²⁺ transients mediated by extracellular ATP released through the type III secretion system (T3SS). Using high-speed Ca²⁺ imaging and stochastic modeling, they propose that low ATP levels trigger "Coordinated Ca²⁺ Responses from IP₃R Clusters" (CCRICs) via fast Ca²⁺ diffusion and Ca²⁺-induced Ca²⁺ release. These responses may dampen TNF-α-induced NF-κB activation through Ca²⁺-dependent modulation of O-GlcNAcylation of p65. The interdisciplinary work suggests a new perspective on calcium-mediated immune response by combining quantitative imaging, bacterial genetics, and computational modeling.

      Strengths:

      The study provides a new concept for host responses to bacterial infections and introduces the concept of Coordinated Ca²⁺ Responses from IP₃R Clusters (CCRICs) as synchronized, whole-cell-scale Ca²⁺ transients with the fast kinetics typical of local events. This is elegantly done by an interdisciplinary approach using quantitative measurements and mechanistic modelling.

      Comments on revised version:

      The revised version of the manuscript has addressed all my raised points. I'd like to thank the authors for the work they have put into the revision to make this a very compelling publication.

    2. Reviewer #2 (Public review):

      Summary:

      The authors of this study are trying to resolve how cellular infection by enteropathogenic E. coli (EPEC) subverts cellular signaling pathways to promote infection and dampen immune responses. Specifically, alteration in calcium dynamics has been evidenced in the prior literature as a potential initiator of these adaptations, and this study provides ideas and mechanistic detail as to how cellular calcium dynamics may be subverted by pathogens.

      Strengths:

      The clear strengths of this paper relate to the new ideas inherent in the proposed hypothesis and their support from the experimental approaches used. Overall, the proposed work provides new ideas in this area, which will benefit from further investigation. Certainly, this is an interesting and challenging paradigm to pick apart mechanistically, and is important for improving treatments for intestinal infections. The authors have provided additional data to clarify and expand on concerns raised during the original review, and these additions are helpful.

      Comments on revised version:

      Thorough response to original review. No further comments.

    1. Reviewer #1 (Public review):

      The authors present a compelling case for the necessity of age-specific templates in functional hyperalignment. Given that the brain undergoes substantial developmental, structural, and functional changes across the lifespan, a 'one-size-fits-all' canonical template is often insufficient. This study effectively demonstrates that incorporating age-congruent features significantly enhances the performance and sensitivity of hyperalignment models. By validating these findings across two independent datasets (Cam-CAN and DLBS), the paper provides robust evidence that accounting for age-related functional organization is a critical prerequisite for accurate functional alignment in lifespan research.

      Comments on revised version:

      The authors have been exceptionally thorough in addressing the concerns raised by the reviewers. In particular, the inclusion of the supplemental analysis on the middle-aged cohort is a valuable addition that strengthens the manuscript. Furthermore, the rationale for employing a congruent template is well-articulated; this approach clearly provides a more robust and accurate foundation for reconstructing individualized connectomes. I appreciate the authors' detailed responses and have no further comments.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Zhang and colleagues examine the role of participant selection in creating and using functional templates to improve analyses using hyperalignment. Hyperalignment aligns participants' functional MRI data to a shared functional template, analogous to the anatomical templates used to bring anatomical MRI data into a shared space (e.g., MNI152). The question of appropriate template creation is especially pressing for population-level analyses, where a large number of demographic groups (e.g., different age ranges, clinical statuses) may be included in the same analysis. These different demographic groups may have differences in their functional organization that complicate the creation of a single study-specific functional template.

      To provide an initial investigation of the potential effect of demographic-specific templates, the authors use the publicly available Cam-CAN dataset, which contains participants from 18 to 87 years of age. They define a young adult group (< 45 years of age) and an older adult group (> 65 years of age) from this dataset with approximately the same number of participants. They investigate whether "age-congruent" templates (i.e., defined in the same age group in which they are used) improve three analyses where hyperalignment has previously been shown to boost performance: inter-subject correlation (ISC), predicting individual connectomes, and predicting individual functional responses. Using the Cam-CAN-derived older adult template, they then replicate the ISC analyses using the publicly available Dallas Lifespan Brain Study (DLBS).

      Overall, the presented results are highly suggestive that age-congruent templates consistently improve performance, though the absolute effects are small.

      Strengths:

      The use of a separate validation sample (re-using the same template calculated with Cam-CAN) highlights the potential of developing independent templates for individual demographic groups and then distributing these for wider use, analogous to the MNI templates that are widely used throughout the field of neuroimaging. This suggests that the potential impact of this framework is significant.

      Weaknesses:

      In their revision, the authors have addressed the previously raised "weaknesses" by providing guidance for researchers interested in using age-specific hyperalignment templates in practice.

      Impact:

      Overall, this work is likely to encourage future development of age-specific functional templates in the imaging community.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a tunable Bessel-beam two-photon fluorescence microscopy (tBessel-TPFM) platform that enables high-speed volumetric imaging with stable axial focus. The work is technically strong and broadly significant, as it substantially improves the flexibility and practicality of Bessel-beam-based two-photon microscopy. The demonstrations are generally strong and bridge a wide range of neuroimaging applications, namely vascular dynamics, neurovascular coupling, optogenetic perturbation, and microglial responses. These convincingly show that the approach enables biological measurements that are difficult or impractical with existing methods.

      The evidence supporting the technical and biological claims is generally strong. The optical design is carefully motivated, clearly described, and validated through a combination of simulations and experimental characterization. The biological applications are diverse and well chosen to highlight the strengths of the proposed method, and the data are of high quality, with appropriate controls and comparative measurements where relevant.

      Strengths:

      (1) The optical innovation addresses a well-recognized limitation of existing Bessel-TPFM implementations, namely axial focus drift during tuning, and does so using a relatively simple, light-efficient, and cost-effective design.

      (2) The manuscript provides convincing experimental evidence for this being a versatile platform to map flow dynamics across diverse vessel sizes and orientations in both healthy and pathological states.

      (3) Biological demonstrations are comprehensive and span multiple domains such as hemodynamics, neurovascular coupling, and neuroimmune responses.

      (4) Quantitative analyses of blood flow across vessel sizes and orientations, including kilohertz line scanning, are particularly compelling and clearly beyond the reach of standard Gaussian TPFM.

      (5) Particular advantages are that higher blood flow speeds become measurable, up to 23 mm/s (20x more than conventional frame scanning), and that simultaneous (Bessel-)imaging and (Gaussian-)perturbation are possible because of the stable axial focus.

      Weaknesses:

      (1) At present, the paper does not properly position the new Bessel-beam method against previous work, and fails to compare it to alternative fast volumetric imaging methods without Bessel beams.

      (2) The cost-effectiveness of the proposed method is not well described or supported by evidence; it would be useful to include more detail or remove this claim.

      (3) Some biological conclusions, e.g., regarding novel features of microglial dynamics (i.e., the observed two-wave responses and coordinated extension-retraction), are based on a relatively limited sample size and would benefit from clearer discussion of variability across animals and fields of view.

      (4) The use of neural network-based denoising for microglial imaging is reasonable but introduces potential concerns about trustworthiness; additional clarification of validation or failure modes would strengthen confidence in these results.

      To conclude, most of the authors' claims are well supported by the data. The central conclusion, namely that tBessel-TPFM provides tunable volumetric imaging enabling experiments not feasible with existing two-photon approaches, is justified. Some biological interpretations would benefit from a more cautious framing, but they do not undermine the main technical and methodological contributions of the study. This is a strong and technically rigorous manuscript that makes a substantial methodological advance with clear relevance to neuroscience and intravital imaging. Minor clarifications and a slightly more measured discussion of certain biological findings are recommended.

    2. Reviewer #2 (Public review):

      Summary:

      The authors describe a tunable Bessel beam two-photon microscope (tBessel-TPFM) designed to overcome a common limitation of Bessel-based volumetric imaging: axial shifts of the effective focus during Bessel beam parameter tuning. Their optical design allows independent control of axial beam length and resolution while keeping the axial center fixed. This is extensively validated through simulations and experiments.

      Strengths:

      A major strength of the work is the breadth of validation combined with the level of technical detail provided. The authors carefully characterize the optical performance of the system and clearly explain the design choices and underlying derivations, which will make it easier for others to understand and implement. The authors demonstrate the utility of the method across several in vivo applications, including neurovascular imaging, blood flow measurements, optogenetic stimulation, and microglial dynamics.

      Weaknesses:

      In the in vivo demonstrations, the authors employ different Bessel beam configurations across experiments, but the beam parameters are not dynamically tuned during live imaging. A video example showing continuous or interactive tuning of the Bessel beam within a single in vivo imaging sequence would further highlight the practical advantages of this platform and strengthen the case for its potential applications. In addition, while excitation powers are reported, the manuscript does not place these values in the broader context of known photodamage thresholds for two-photon microscopy, which would be helpful to the readers. Denoising/image restoration are applied in one of the in vivo examples, but it is unclear why this step was used specifically for this dataset and whether it was necessary to achieve adequate SNR or primarily included as an additional demonstration.

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript presents an elegant and cost-effective approach for generating a tunable Bessel beam on a conventional two-photon microscope. The authors assemble a compact optical module comprising three axicons and a series of lenses that permits rapid adjustment of both lateral resolution and axial extent without modifying the focal plane. This flexibility enables the system to be readily adapted to a variety of biological preparations. As a proof of concept, the authors employ the device to record blood flow velocities in cortical microcapillaries, arterioles, and venules, thereby directly visualizing vasodilatation and vasoconstriction dynamics and permitting quantitative analysis of neurovascular coupling across cortical layers in awake mice.

      The authors demonstrate that the tunability of the Bessel beam can be exploited to match the numerical aperture to the vessel type: a high NA configuration, albeit with a slower scan, is optimal for resolving flow in capillaries, whereas a low NA setting provides faster acquisition suitable for arterioles and venules. By implementing a one-dimensional line scan with the Bessel beam, they achieve an imaging speed that is twentyfold faster than conventional frame-by-frame scanning, which proves sufficient to capture hemodynamic transients before and after an induced ischemic stroke.

      In addition to pure observation, the authors integrate a co-propagating Gaussian line into the system, allowing simultaneous imaging and photostimulation within the same focal plane. This capability addresses a common limitation of other Bessel beam implementations, in which the observation and perturbation planes often become misaligned when the Bessel beam is altered. The manuscript also emphasizes the advantage of Bessel beam excitation for calcium imaging after a perturbation, because it captures neuronal activity in planes both above and below the nominal focal plane, signals that would be missed with a standard Gaussian focus. Finally, the authors apply the technique to investigate the neuroimmune response following targeted microglial ablation; they report that adjacent microglia extend processes toward the injury site while retracting processes in the opposite direction.

      Overall, the work offers a technically straightforward yet powerful extension to existing two-photon platforms, providing high-speed, volumetric imaging and stimulation capabilities that are well-suited to a broad range of neurovascular and neuroimmune studies. The experimental validation is quite thorough, and the presented data convincingly illustrate the benefits of the approach.

      Strengths:

      The authors present a truly clever and inexpensive optical module that can be integrated into almost any two-photon microscope, providing a tunable Bessel beam with a minimal modification of the existing system. The experimental data and accompanying quantitative analysis convincingly demonstrate that the system can reveal physiological events, such as capillary flow, calcium transients across multiple axial planes, and microglial process dynamics, that are difficult or impossible to capture with a conventional Gaussian beam. The breadth of experiments chosen for the manuscript illustrates the practical utility of the device and supports the authors' conclusions that it extends the functional repertoire of standard two-photon microscopy.

      Weaknesses:

      The manuscript would benefit from a more detailed contextualisation of the claimed speed advantage. Although the authors mention other techniques in the introduction, they do not provide any direct comparison with other state-of-the-art high-speed two-photon approaches such as light beads microscopy (Demas et al., Nat. Methods 2021), temporal multiplexing schemes (Weisenburger et al., Cell 2019), or random access microscopy (Villette et al., Cell 2019). A brief comparison of imaging speed, spatial resolution, and instrumental complexity would enable readers to assess the relative merits of the present method.

      A second limitation that warrants discussion is the inherent trade-off between volumetric coverage and image specificity. Because the Bessel beam excites fluorescence throughout an extended axial range, the detector inevitably integrates signal from a three-dimensional volume into a two-dimensional image. In densely labelled tissue, this can lead to significant signal crosstalk, reducing contrast and complicating quantitative interpretation. A brief analysis of how labeling density affects the fidelity of flow or calcium measurements, or suggestions for mitigating crosstalk (e.g., computational deconvolution, adaptive excitation shaping, or combinatorial sparse labeling), would broaden the applicability of the technique.

    1. Reviewer #2 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      The authors pair analysis of replication timing and allele-specific expression in clonal populations of primary human cells. They combine these data with previously published data on clones from transformed human cell lines. They identify a number of genomic regions that display asynchronous replication timing in at least one clone and correlate these regions with allele-specific expression of genes within them. They also observe that several interesting gene sets, including genes that are associated with human diseases, map to asynchronously replicating regions. This is a good experimental approach that builds on already published data demonstrating the connection between allelic imbalance and replication timing.

    1. Joint Public Review:

      This manuscript puts forward the provocative idea that a posttranslational feedback loop regulates daily and ultradian rhythms in neuronal excitability. The authors used in vivo long-term tip recordings of the long trichoid sensilla of male hawkmoths to analyze spontaneous spiking activity indicative of the ORNs' endogenous membrane potential oscillations. This firing pattern was disrupted by pharmacological blockade of the Orco receptor. They then use these recordings together with computational modeling to predict that Orco receptor neuron (ORN) activity is required for circadian, not ultradian, firing patterns. Orco did not show a circadian expression pattern in a qPCR experiment, and its conductance was proposed to be regulated by cyclic nucleotide levels. This evidence led the authors to conclude that a post-translational feedback loop (PTFL) clockwork, associated with the ORN plasma membrane, allows for temporal control of pheromone detection via the generation of multi-scale endogenous membrane potential oscillations. The findings will interest researchers in neurophysiology, circadian rhythms, and sensory biology. However, the manuscript has limited experimental evidence to support its central hypothesis and is undermined by several assumptions that underlie their data analysis and model builds, as well as insufficient biological data including critical controls to validate and/or fully justify the model the authors are proposing.

      Strengths:

      The authors raise several intriguing model-based hypotheses regarding the mechanisms that underlie the generation of olfactory rhythms. The electrophysiological approach and the long-term recording paradigm are elegant and technically impressive. In the revised version, the authors have added additional qPCR data supporting the lack of rhythmic Orco transcript expression and included a new figure suggesting that cAMP can modulate Orco conductance.

      Major weaknesses:

      (1) The cAMP experiment was only conducted at one time-point, which is insufficient to support the central claim that "cAMP and cGMP may have ZT-dependent effects on Orco conductivity".

      (2) The revised manuscript continues to rely heavily on prior publications or defers key mechanistic questions (or important manipulations) to future studies. In its current form, the evidence presented remains insufficient to support the central claim that a PTFL constitutes the primary underlying circadian clock mechanism. The proposed model is intriguing, but the data provided do not yet directly demonstrate the novel mechanism.

    1. Reviewer #1 (Public review):

      M. tuberculosis exhibits metabolic flexibility, enabling it to adapt to various environmental stresses, including antibiotic treatment. In this manuscript, Serafini et al. investigate the metabolic remodeling of M. tuberculosis used to survive iron-limited conditions by employing LC-MS metabolomics and 13C isotope tracing experiments. The results demonstrate that metabolic activity in the oxidative branch of the TCA cycle slows down, while the reductive branch is reverted to facilitate the biosynthesis of malate, which is subsequently secreted.

      Overall, this study is experimentally well-designed, particularly the use of 13C isotope tracing to monitor TCA cycle remodeling under iron-limited conditions. The findings are valuable as they offer potential new targets for antibiotics aimed at non-replicating M. tuberculosis occurring in the hosts.

      Comments on revised version:

      All concerns are well addressed.

      I have one minor concern: Page 3 line 16 - Fig. 1G & H: The kinetics of ATP levels between H37Rv and Erdman seem different; Erdman induces greater ATP at days 2 and 3 after DFO treatment, which was not clear in H37Rv. Fig. 1I shows NAD/NADH ratio not NADH/NAD ratio. Please change it to NADH/NAD+ to be consistent with Supplement Fig. 1 result. Include the 17-day result of NADH/NAD+ in the discussion section to explain the different viability between the two strains.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated how prolonged iron limitation (which stops growth but does not lead to cell death) alters central metabolism in M. tuberculosis. The major tool they used is metabolomics combined with stable isotope tracing. They show that the Krebs cycle is still active, despite the fact that it is dependent on some iron-dependent enzymes. They show that carbon flux through the oxidative branch of the Krebs cycle is stalled, resulting in the accumulation of metabolites, such as malate and alpha-ketoglutarate, that are partially secreted. Apparently, the carbon flux from glycolysis is partially diverted to the reductive branch of the Krebs cycle. This is not achieved by using the glyoxylate shunt but probably through the GABA shunt. This unprecedented split of the Krebs cycle and malate secretion allows a continuous flow of carbon through the core of carbon metabolism, overcoming the metabolic stalling triggered by iron starvation.

      Strengths:

      Novel insight into the central metabolism of a major pathogen and its adaptation to iron starvation. Carefully conducted experimentation. Paper ends with a clear and helpful model.

      Weaknesses:

      The authors show some surprising and important findings, but a little more effort is needed to really substantiate them. In particular, the role of the GABA shunt should be genetically tested, as was done for ICL and the glyoxylate shunt.

      Also, dataset 1 is not very convincing: it is based only on transcriptomics and shown as up or down, hardly a strong basis for major conclusions. The very least you want is actual differences, preferably at the protein level, where it really counts.

      Comments on the revised version:

      In the revised version all these points were appropriately dealt with and discussed, although some of them textually and not experimentally, but for reasons that are logical.

    1. Reviewer #1 (Public review):

      This paper aims to improve the accuracy of predictions of the impact of ITN strategies by developing a method to estimate the duration of ITN access and use over time on a subnational scale from cross-sectional survey data and the numbers of ITNs received annually. The subnational estimates are then input into a mathematical model to predict clinical cases under different ITN distribution strategies.

      Strengths:

      The approach is novel and addresses a useful and timely topic. It makes use of available routine data, and has considered all of the relevant components of ITN distributions.

      The authors have made revisions, particularly to the methods, appendices and title - leaving the paper easier to follow, and with a clear, consistent aim. The assumptions are clearly stated.

      Weaknesses:

      The weaknesses are shared with other models of a similar complexity - it is not easy for a casual reader to fully understand the model or the implications of the assumptions which were required to be made. That routine data is used is good for availability, but data quality may be an issue in some places.

    2. Reviewer #2 (Public review):

      Summary:

      The authors design a custom Bayesian model to estimate the probabilities of access, use and use given access of insecticide-treated nets in six African countries, providing sub-national estimates and inferring the average duration of ITN use and access. An individual-based model was employed to simulate malaria epidemics and estimate the effectiveness of different ITN distribution strategies. The study finds that the mean probability of use or access did not reach 80% (a universal coverage formerly targeted by WHO) for any of the regions even for biennial campaigns, demonstrates that switching from triennial to biennial distribution campaigns increases population use by 7.9%, and evaluates the impact of employing more efficient ITNs on P. falciparum prevalence.

      Strengths:

      The authors developed a data-driven model that accounts for data collection imperfections and sources of uncertainty while differentiating between ITN use and access. They developed a methodology to infer the timing of mass campaigns from publicly available data instead of assuming fixed dates. The probability of use given access allows determining the regions where ITN distribution is least effective. This work can help better inform future interventions by identifying regions where increasing mass campaign frequency or employing better ITNs is most effective. Finally, in addition to insights on ITN access and use for the six countries analyzed, the paper contributes a methodological framework that can likely be extended to other countries.

      Weaknesses:

      Since the models employed are rather complex, the methodology description may be hard to follow for some readers. In addition, the models rely on many assumptions, including exponential decay of ITN use/access and narrow prior distributions. It is worth noting that, in the revised version of the manuscript, the authors justified the choice of exponential decay and narrow prior distributions, and made a significant effort to clarify the methodology and the model equations.

      Comments on revised version:

      I appreciate the improvements made to the text. The methodology description is much clearer now. I have no further suggestions.

    1. Reviewer #1 (Public review):

      Summary:

      Launay et al. conducted a cytological screen for programmed DNA elimination (PDE) in 25 new Rhabditidae species and detected PDE in 17 of the 25 species, representing 12 out of 17 genera within the family. This work is significant because it expands PDE from a few known nematodes to a much broader set of Rhabditidae species.

      Strengths:

      By demonstrating PDE across many genera with the exception of C. elegans and some other Caenorhabditis species, the study provides an important resource for investigating PDE's evolutionary origins, mechanisms of genome reorganization and DNA repair, and its functional consequences.

      Most of the observed PDEs were supported by solid evidence through a survey-style cytological screen (PDE detected in 17/25 species and in 12/17 genera), which supports the main claim of widespread occurrence.

      Weaknesses:

      Although most PDE claims are supported by solid evidence, some of the existing data do not describe the depth of characterization, e.g., how many replicates were conducted for each species? How reproducible are the claimed PDEs between embryos in terms of timing and cell identities destined for PDE? Is it possible to validate a subset of PDE with independent evidence, especially for those with marginal PDE? This is important because some dying embryos may fail to maintain their chromosome integrity and release some of the broken DNAs, some others may suffer from noise such as intracellular parasites, for example, microsporidia, or even highly condensed mitochondrial DNAs.

    2. Reviewer #2 (Public review):

      Summary:

      Programmed DNA elimination is increasingly recognised as an important phenomenon across many species, including in animals. Exactly how widespread is still unclear, and the function of PDE is even more mysterious in most species where it has been described. PDE has been discovered in several nematode species, and in this manuscript, the authors carry out a more extensive search for PDE. They find PDE in many species, indicating that it is widespread across the phylum.

      Strengths:

      The large number of species across many different clades provides good evidence that the phenomenon has evolved many times independently. The work will therefore prompt many further studies characterising individual species, and potentially linking the evolution of the phenomenon to other features of these species' ecological characteristics.

      Weaknesses:

      The major technical weakness of this project is the assay that is used to evaluate PDE. First, this assay is clearly insensitive: as the authors acknowledge, O. tipulae, which has PDE, does not appear in their screen. Second, the assay gives no information about breakpoints and only limited, non-quantitative information about how much DNA is eliminated. Thus, their data really is only a preliminary screen, which would need to be confirmed by genomic assays.

    3. Reviewer #3 (Public review):

      Summary:

      Somatic programmed DNA elimination (PDE), also known as chromatin diminution, has primarily been studied in parasitic nematodes, such as Ascaris species, in which it was discovered almost 140 years ago. Recently, PDE has also been reported in three non-parasitic nematode species. In this manuscript, Launay et al. present the results of a large-scale cytological and evolutionary study of PDE across 29 free-living nematode species belonging to the Rhabditidae family, for which they established a phylogeny based on 18S and 28S ribosomal RNA sequences. By combining DNA staining and telomere DNA FISH labeling in developing embryos, they convincingly document the formation of lagging fragments and/or the loss of long germline telomeres in 17 species, during one particular division of somatic precursor cells.

      Strengths:

      (1) The whole study is well executed, and the results are convincing.

      (2) The authors present compelling evidence that PDE is an ancestral feature of Rhabditidae nematodes.

      (3) This study provides a valuable resource of lab-tractable species for future PDE studies.

      Weaknesses:

      (1) Some clarifications are necessary to make the figures more reader-friendly.

      (2) Important references to ciliates are missing.

    1. Reviewer #1 (Public review):

      "Learning is a fundamental source of individuality," by Manna and colleagues, interrogates different sources of variation in individual behavior. The authors place individual flies in a Y-shaped arena, which is a common design in the field, and illuminate the arms of the Y with blue versus green light. They track the color preference of individual animals and also perform operant conditioning, meaning that they teach the fly to avoid a particular color/arm by generating a foot shock when the fly enters that arm. There are a number of things that are impressive about this setup: The authors are able to collect data on thousands of individual flies of many different strain backgrounds, and they demonstrate a strong change in color preference after conditioning. This is nice, because in past papers, visual learning ability has been modest and difficult to study. To put a number on it, in this paper, animals on average don't show a color preference at the start of the assay, spending around 30% of their time in the one arm illuminated green, and the remaining time in the two arms illuminated blue. After conditioning, the average animal spends only 23% of its time in the green arm.

      The authors run 64 animals through the assay for each of 88 wild-type strains (maybe? see Major Point 1 below) and see considerable strain-specific (genetic) variation in the change in time spent in the shocked color after conditioning. Some strains show no learning, while others spend <10% of their time in the shocked color after conditioning. They also, I believe, see that some strains have more variability across individuals, which would suggest that some strains have stronger canalization at the development or circuit function level than others, i.e., some genotypes produce more consistent copies of the individual, others less consistent copies. (Or, some genotypes produce robust circuits, and others produce noisy circuits.)

      Finally, the authors argue statistically that learning itself increases variability in individual performance. This makes a lot of sense to me intuitively. Learning changes the physical/chemical properties of circuits in the brain, and because it evolves over time and interacts with environmental variables, it seems like it should send different animals down different channels. Or, at a conceptual level, if I learn to play the piano and my sister doesn't (because of some genetic difference between us or something stochastic), this learning experience will cause all sorts of other differences in our behavior as time passes. I also think the authors do have enough data to be able to make this finding. However, the presentation of the argument in this portion of the paper is hard for me to understand, and I am not an expert in statistics, so the strength of the result is difficult for me to evaluate.

      Major points

      (1) It's difficult to track through the paper the number of animals tested for different assays. At the beginning, it says N=5632, which works out to 64 flies for each of the 88 DGRP strains. 64 happens to be the number of parallel Y arenas they have. Later in the methods, there's a description of more variation within the set of 64 for each strain, two different parent sets per strain, different sexes, conditioned and unconditioned. And, while the results text focuses on the color learning, the methods discuss additional assays (place learning, multi-day learning).

      Given the numbers, does each run of the 64 mazes include all the tested flies of one strain, or are flies of many strains included in each batch? Do different flies do different assays (color, place, multi-day), or do they all do all the assays? Perhaps there is a table including this information already in the supplement, but I recommend making it much clearer in the main results text and methods. While the dataset is large, if it is split over many conditions and/or if batch and genotype confound each other, this will affect the robustness of the results and how strong the conclusions can be.

      (2) The data presentation in Figure 1 is elegant and easy to follow, but getting into Figure 2 and subsequently, I get lost in the statistics and have trouble understanding what is being measured. My understanding of the big picture is that while genetics and individual randomness contribute a lot to behavior, the evidence for learning as an amplifier of individuality is that variance in behavior among animals of the same strain increases over time in the conditioned group (i.e., the group that is doing the most learning, or a specific kind of learning), but not in the control group. This idea is illustrated in the flattening distributions in the cartoons in Figure 1A. The authors should include graphs of the real data that use the same format as in that cartoon. Instead, the graphs present "residuals," and I don't know what those are. I suspect it's "variation left over after accounting for effects of strain and individual stochasticity." I see the residuals being tracked per strain over time in Figure 2H, but I don't see the change over time in other graphs. I'm looking for something simple like, "variation within the strain at the beginning of learning and at later time points in learning." (But I'm not sure exactly what instantaneous measurement would be the focus in longitudinal analyses of learning behavior.)

      (3) Figure 3 is a cool stab at tracking down the precise mechanism by which a stochastic environment interacts with learning to send individuals along different behavioral routes. But again, like in Figure 2, I don't have the sophisticated understanding of statistics to understand exactly what the graphs are telling me, or how they relate to the underlying measurements. I'm relying on the results text alone to reach a conceptual understanding, and just taking the graphs on trust.

      So, overall, the authors have a very nice body of work here, with the potential to add a new facet to our understanding of the origins of diversity in animal behavior. In addition to the interpretations they focus on here, this dataset also represents an advance in studying visual associative learning in general, and quite an amazing ability to make longitudinal measurements of many behavioral decisions within the same animals. Improving the data presentation to make it easier to follow for a larger swathe of researchers, especially in Figures 2 and 3, will increase its potential impact.

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to test the extent to which differences in learning capacity and experience contribute to behavioural variation in a genetically identical population under identical environmental conditions.

      Strengths:

      The authors developed and used a scaled-up version of a simple two-choice behavioural paradigm, allowing them to test thousands of individuals across multiple genotypes. They then deployed clever and powerful statistical analysis methods and provided compelling evidence for a role of variability in learning in the expression of behavioural variation.

      Weaknesses:

      There are no major weaknesses, although some level of longitudinal analysis to strengthen the evidence for a strict definition of individuality would be a welcome extension of a future study. In addition, it would have been very interesting, although understandably beyond the current scope, to delineate a potential source of learning variability in the brain.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents a toolkit for the transformation of Blastocystis. The authors have screened a number of selectable agents, promoters and reporter genes and present their findings. This resource will be of immense use to those in the Blastocystis field, as well as those seeking to establish transformation tools in other species where such tools do not yet exist. Establishing new transformation tools is extremely challenging, and the authors have done an excellent job.

      Strengths:

      The authors have carried out a systematic screen of promoters, reporter genes and selectable agents. They have screened numerous candidates for each, and all the data is presented. It is good to see when things did not work as well as when things did, so this data set is extremely useful indeed.

      Weaknesses:

      The findings are reported by reporter gene assay (microscopy). No evidence is given using genetics. The authors claim that the DNA is maintained episomally. However, could it be possible that there is integration? No PCRs/RT-PCRs are shown (although it can safely be assumed that the DNA/RNA is present where the transformation was successful), nor are any Western blots. These would have been useful to show that the P2A ribosomal skipping had occurred, and that proteins were expressed individually rather than as a polyprotein.

    2. Reviewer #2 (Public review):

      This manuscript presents a substantial technical advance for the genetic manipulation of Blastocystis by establishing an integrated workflow for stable episomal transgenesis, antibiotic selection, clonal recovery, and reporter-based imaging in the ST7-B subtype. The study is particularly valuable because it combines multiple previously fragmented approaches into a coherent and practically applicable toolkit, including endogenous regulatory elements, optimized electroporation conditions, selectable markers, and anaerobic-compatible fluorescent reporters. This methodological work greatly expands the molecular toolbox, and future studies focused on both basic and infection biology can now build on the ability to express and localize proteins in fixed as well as live cells.

      The microscopy data are convincing and clearly demonstrate functional reporter expression and successful recovery of stable transgenic lines. Nevertheless, because this is primarily a methodological paper, the study would be further strengthened by the inclusion of Western blot validation of reporter expression and bicistronic constructs. In particular, biochemical analysis of the P2A-containing constructs would help assess the efficiency of ribosomal skipping and exclude the possible presence of uncleaved fusion proteins, thereby providing stronger support for the interpretation of the imaging data and the functionality of the expression system.

    3. Reviewer #3 (Public review):

      Summary:

      The primary objective of this study was to establish a practical and functional framework for the propagation of stable transgenic cell lines of Blastocystis, a common animal gut microeukaryote. Although the work focused on Blastocystis ST7-B, a subtype with relatively low prevalence in humans, this choice is justified by its association with more frequent negative health effects. Beyond their relevance to the medical field, the methodological advances described here have the potential to also expand cell biology studies of this anaerobic organism, including its unusual mitochondria and redox metabolism.

      Strengths:

      Prior to this work, genetic tools for Blastocystis were very limited, relying on a single strong promoter-terminator combination. The authors successfully expanded the available promoter set across a range of expression strengths by testing two dozen variants in luciferase-based assays. Critically, they developed an integrated workflow from a modular transgenic construct design, to an expanded inventory of molecular components (promoters, reporters), optimized DNA delivery, stepwise antibiotic resistance-mediated clonal selection and propagation, and to reporter validation. The evaluation of several anaerobiosis-compatible labeling strategies for live (and fixed) cell optical imaging will be particularly useful, with the SNAP-tag system appearing especially promising for Blastocystis.

      Weaknesses:

      The presented data generally provide solid support for the conclusions that the work reached, but clarification of reasoning and several inconsistencies, as well as amendments to the visual presentation of the data, would be highly beneficial, as detailed below.

      (1) Episomal persistence of the construct:
      The manuscript repeatedly assumes, including in its title, that constructs persist in Blastocystis in their episomal form, but no direct evidence is provided. Although this interpretation is plausible, it should be identified more clearly as provisional. Nuclear genomic integration (e.g., via NHEJ) remains a possible explanation unless supporting evidence or rationale is provided to exclude it. Testing whether the phenotype persists without drug-mediated selection in the generated transgenic cell lines would help strengthen the case for episomal maintenance.

      (2) Promoters and terminators:
      2.1) There is a discrepancy between the claimed number of loci (14), from which promoters used to drive luciferase expression were derived, and those detailed as having been actually generated in Table 1 (11). This inconsistency should be corrected or explained, as it creates uncertainty around the accuracy of the dataset.
      2.2) Based on the presented evidence, constructs benchmarked in bioluminescence assays differed only in their promoter composition. Although terminator selection is mentioned in the Methods section, no additional details are provided; for instance, Table 1 and Figure 2 only list 23 promoters in total. Figure 2A likewise shows only promoter-dependent variation. If the terminator was held constant (LeguP1?), this should be stated explicitly. The authors may then consider revising the wording of having tested "23 promoter-terminator pairs" to better reflect that only promoters varied.
      2.3) Promoter benchmarking was done with a plasmid lacking a selection marker, so it is unclear how the maintenance of the luciferase construct was ensured. Without selection, the observed reporter intensity could reflect differential or stochastic plasmid retention rather than promoter strength alone. The luminescence assay was performed 16-18 hours after transfection, but the rationale for this particular timeframe should be explained. In this context, the authors should explicitly state whether the experiments shown in Fig.2A represent biological triplicates or technical triplicates from a single transfection.

      (3) Figure 2:
      3.1) Several aspects of the current design may lead to ambiguity for the reader. The boxplots are colour-coded, but it is unclear whether the colours carry meaning or are purely decorative. Because the data are already spatially separated into bins, additional random colouring is redundant and may suggest distinctions that are not intended. In addition, part A of Figure 2 is split into two panels, with the scale for the left panel shown in the right panel and some of the boxplot colours falling in the range of the scale, but not in line with their counterparts in the left panel. Because the colour use is not consistent, it is difficult to tell whether the same scale should be applied to both panels or how it should be interpreted.
      3.2) The left panel of part A uses a diverging blue-white-red colour scheme, which is most appropriate when the midpoint represents a meaningful central value such as zero. Because the values shown in this graph are only positive, a non-diverging 2-colour scale or a colour palette such as 'viridis' would make the plot easier to interpret.
      3.3) A black background should be avoided: 'B' and 'C' labels are invisible, and it draws attention to a distracting design feature rather than the data themselves.

      (4) Figure 3:
      4.1) Individual snapshots should be separated more clearly, either by using a white background or by adding visible borders to make the overall composition clearer. As currently displayed, some boundaries between fluorescent channels resemble image artifacts rather than intentional panel divisions.
      4.2) In parts B-D, the legend should explain more clearly what each image shows, and the figure itself would benefit from annotations. There seem to be three sub-panels in each 'condition' of part B (as well as C and D): while the middle and rightmost panel can be easily inferred to represent the fluorescent protein and bright-field image, what the leftmost panels represent is not specified. If DAPI was used to dye DNA, an explanation why mostly multiple labelled regions are visible should be provided.
      4.3) Cell morphology and appearance differ markedly between UnaG/smURFP and SNAP-tag images, which should be explained. A microscope issue is mentioned in the main text, but if that was the cause, the authors should consider replacing the images, as the current distortions complicate interpretation.

    1. Joint Public Review:

      In this manuscript, the authors proposed an approach to systematically characterise how heterogeneity in a protein signalling network affects its emergent dynamics, with particular emphasis on drug-response signalling dynamics in cancer treatments. They named this approach Meta Dynamic Network (MDN) modelling, as it aims to consider the potential dynamic responses globally, varying both initial conditions (i.e., expression levels) and biophysical parameters (i.e., protein interaction parameters). By characterising the "meta" response of the network, the authors propose that the method can provide insights not only into the possible dynamic behaviours of the system of interest but also into the likelihood and frequency of observing these dynamic behaviours in the natural system.

      The authors study the Early Cell Cycle (ECC) network as a proof of concept, focusing on pathways involving PI3K, EGFR, and CDK4/6 with the aim of identifying mechanisms that may underlie resistance to CDK4/6 inhibition in cancer. The biochemical reaction model comprises 50 state variables and 94 kinetic parameters, implemented in SBML and simulated in Matlab. A central component of the study is the generation of large ensembles of model instances, including 100,000 randomly sampled parameter sets intended to represent intra-tumour heterogeneity. On the basis of these simulations, the authors conclude that heterogeneity in kinetic rate parameters plays a stronger role in driving adaptive resistance than variation in baseline protein expression levels, and that resistance emerges as a network-level property rather than from individual components alone. The revised manuscript provides additional clarification regarding aspects of the simulation and filtering procedures and frames the comparison with experimental data as qualitative. Nonetheless, the study is best interpreted as a theoretical and exploratory analysis of the model's behaviour under heterogeneous conditions. Consequently, questions remain regarding the biological grounding of the sampled parameter regimes and the extent to which the reported frequencies of resistance-associated behaviours can be directly interpreted in physiological terms.

      While the authors propose a potentially useful computational framework to explore how heterogeneity shapes dynamic responses to drug perturbation, a number of important conceptual and methodological concerns remain to be addressed:

      (1) The sampling of kinetic parameters constitutes the backbone of the manuscript, yet important concerns remain regarding its biological grounding and transparency. Although the revised version provides additional clarification on the exploration of "model instances", it is still not sufficiently clear how parameter values and initial conditions are generated, nor how the chosen ranges relate to biological measurements. The kinetic rates are sampled over broad intervals without explicit justification in terms of experimentally measured bounds or inferred distributions. As a consequence, it remains uncertain whether the ensemble of simulated behaviours reflects physiologically plausible cellular regimes or primarily the properties of the assumed parameter space. In this context, the large-scale sampling (100,000 parameter sets) resembles a Monte Carlo exploration of the model rather than a biologically calibrated representation of tumour heterogeneity.

      Furthermore, the adequacy of the sampling strategy in such a high-dimensional space (94 free parameters) remains open to question. In the absence of biologically informed constraints, the combinatorial space of possible parameter configurations is vast, and it is unclear to what extent the sampled ensembles can be considered representative. This issue is particularly relevant because the manuscript interprets the frequency of resistance-associated behaviours as indicative of their likelihood.
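
      To give a rough sense of scale for this concern, the following minimal Python sketch compares the 100,000 sampled parameter sets with the number of corners of a 94-dimensional parameter box, i.e., the coarsest possible grid with only a low and a high value per parameter. The two-level grid is an illustrative assumption and not the sampling scheme used in the manuscript.

```python
# Illustrative arithmetic only: two levels (low/high) per parameter is an
# assumed, deliberately coarse grid, not the manuscript's sampling scheme.
n_parameters = 94      # kinetic parameters reported for the model
n_samples = 100_000    # sampled parameter sets reported in the manuscript

# Number of corners of a 94-dimensional box (one low and one high value each).
n_corners = 2 ** n_parameters

# Upper bound on the fraction of corners the ensemble could visit.
covered_fraction = n_samples / n_corners

print(f"corners of the parameter box: {n_corners:.3e}")
print(f"fraction coverable by the ensemble: {covered_fraction:.3e}")
```

      Under this assumption the ensemble could touch only a vanishing fraction of even the coarsest grid, which is why frequency-based statements depend heavily on how the sampling ranges and distributions are chosen.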

      The validation presented in Figure 7 does not fully resolve these concerns. The comparison with experimental data is qualitative, and the simulations are performed in arbitrary time units, which complicates direct interpretation alongside time-resolved experimental measurements. Moreover, certain qualitative discrepancies between simulated and experimental trends (e.g., persistent versus decreasing CDK4/6 activity) are not thoroughly discussed. As this figure represents the primary empirical reference point in the manuscript, the extent to which the model captures experimentally observed dynamics remains uncertain.

      Finally, aspects of presentation continue to limit transparency. Parameter ranges are described at different points in the manuscript but are not consolidated clearly in the Methods, and the definition of initial conditions remains ambiguous - particularly whether these correspond to conserved quantities or to the dynamic variables used to initialise simulations. In addition, the exact number of model instances underlying specific analyses and figures is not always explicit. Greater clarity on these issues is essential for assessing reproducibility and for interpreting the quantitative claims of the study.

      (2) A central conclusion of the manuscript is that heterogeneity in protein-protein interaction kinetics is a stronger driver of adaptive resistance than heterogeneity in protein expression levels. To assess the latter, the authors fix a nominal set of kinetic parameters and generate 100,000 random initial concentrations for the 50 model species. However, according to the simulation protocol described in the manuscript, each trajectory includes three phases: (i) simulation under starvation conditions to equilibrium, (ii) mitogenic stimulation to a second ("fed") equilibrium, and (iii) application of drug treatment. The equilibrium concentrations reached in phases (i) and (ii) are determined by the kinetic parameters of the model and are independent of the initial concentrations, provided the system converges to a stable steady state. In dynamical systems terms, stable equilibria are defined by the parameter set and attract all initial conditions within their basin of attraction. Since the kinetic parameters are fixed in this experiment, the pre-treatment equilibrium that serves as the starting point for drug application should likewise be fixed. Under these conditions, it is therefore not unexpected that sampling a large number of initial concentrations has limited influence on the treated dynamics.

      This raises conceptual questions about the interpretation of the comparison between kinetic and expression heterogeneity. If the system converges to a unique stable steady state prior to treatment, then variability in initial concentrations does not propagate into variability in drug response, and the observed dominance of kinetic heterogeneity may partly reflect this structural property of the model rather than a biological principle. Clarification is needed regarding whether multiple steady states exist under the nominal parameter set, and if so, how basins of attraction are explored.
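
      The structural point can be illustrated with a minimal Python sketch, assuming a hypothetical two-species production/degradation cascade (not the authors' ECC model) with fixed kinetic parameters. The rate constants, the sampling range for initial concentrations, and the function name are all illustrative assumptions.

```python
import random

def simulate_to_steady_state(x0, y0, k, dt=0.01, n_steps=20_000):
    """Integrate a toy two-species cascade with forward Euler; return the final state."""
    k_syn, k_deg_x, k_act, k_deg_y = k
    x, y = x0, y0
    for _ in range(n_steps):
        dx = k_syn - k_deg_x * x      # constant synthesis, first-order degradation
        dy = k_act * x - k_deg_y * y  # x drives production of y
        x += dt * dx
        y += dt * dy
    return x, y

# Hypothetical "nominal" kinetic parameters, held fixed for every run.
k_nominal = (2.0, 0.5, 1.0, 0.25)

# Randomly sampled initial concentrations (the range is an assumption).
rng = random.Random(0)
finals = [
    simulate_to_steady_state(rng.uniform(0, 100), rng.uniform(0, 100), k_nominal)
    for _ in range(20)
]

# Spread of the pre-treatment state across all sampled initial conditions.
spread_x = max(f[0] for f in finals) - min(f[0] for f in finals)
spread_y = max(f[1] for f in finals) - min(f[1] for f in finals)
print(finals[0], spread_x, spread_y)
```

      In this toy system every run ends at the same state (x = k_syn / k_deg_x, y = k_act * x / k_deg_y), so variation in the initial concentrations is erased before any perturbation is applied. The sketch contains no conservation laws; if some initial conditions instead set conserved totals, they would act as parameters and could shift the steady state, which is why the definition of "initial conditions" matters.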

      More broadly, it remains unclear why initial protein concentrations can be sampled independently of the kinetic parameters. In biological systems, steady-state expression levels are typically determined by the underlying kinetic rates. A more consistent approach might require constraining initial concentrations to correspond to equilibrium states of the chosen parameter set, thereby introducing relationships between at least some of the 50 initial conditions and the 94 kinetic parameters. Finally, the manuscript employs a non-standard terminology regarding "initial conditions," which may further obscure interpretation of these results and would benefit from clarification.

      (3) The technical implementation of the modelling and simulation framework remains difficult to evaluate due to insufficient methodological detail. Although the authors state that kinetic parameters are randomly sampled, the manuscript does not specify the distributions from which parameters are drawn, nor whether potential correlations between parameters are considered or explicitly ignored. Without this information, it is not possible to assess how implicit modelling assumptions shape the ensemble of simulated behaviours. Given that the conclusions rely on frequency-based interpretations across sampled parameter sets, greater transparency regarding the sampling procedure is essential.
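
      As an illustration of why the sampling distribution needs to be stated, the following Python sketch draws a single rate constant from the same broad interval using either a uniform or a log-uniform distribution. The interval [1e-3, 1e3] and the helper names are assumptions made for the example; the manuscript does not specify them.

```python
import math
import random

def sample_uniform(rng, lo, hi):
    """Draw uniformly on a linear scale."""
    return rng.uniform(lo, hi)

def sample_log_uniform(rng, lo, hi):
    """Draw uniformly on a log10 scale, so every decade is equally likely."""
    return 10 ** rng.uniform(math.log10(lo), math.log10(hi))

rng = random.Random(1)
lo, hi = 1e-3, 1e3   # assumed six-decade range for one rate constant
n_draws = 20_000

uniform_draws = [sample_uniform(rng, lo, hi) for _ in range(n_draws)]
log_draws = [sample_log_uniform(rng, lo, hi) for _ in range(n_draws)]

# Fraction of draws that fall in the lower three decades (values below 1).
frac_small_uniform = sum(v < 1.0 for v in uniform_draws) / n_draws
frac_small_log = sum(v < 1.0 for v in log_draws) / n_draws
print(frac_small_uniform, frac_small_log)
```

      With linear-uniform sampling almost all draws land in the top decade, whereas log-uniform sampling spreads them evenly across decades. The two choices therefore produce very different ensembles from the same nominal range, and the reported frequencies of resistance-associated behaviours would be expected to differ accordingly.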

      A further concern relates to the parameter filtering step. The authors report that the "vast majority" of sampled parameter sets produced systems that were "too stiff," and that these were excluded on the grounds that stiff dynamics are not biologically plausible. However, the manuscript does not clearly define how stiffness is assessed, nor why stiffness is interpreted as biologically unrealistic rather than as a numerical property of the formulation. In standard practice, stiff systems are typically handled using appropriate implicit solvers rather than being discarded. Similarly, parameter sets that produce negative state values are excluded, yet such behaviour may arise from numerical artefacts rather than from intrinsic model inconsistency. The rationale for excluding these parameter sets, rather than adapting the numerical scheme, is not sufficiently justified.
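
      The distinction between a stiff model and an unsuitable numerical scheme can be shown with a minimal Python sketch, assuming a single fast first-order decay as a stand-in for a stiff reaction (the rate constant and step size are illustrative). The explicit scheme produces negative concentrations purely as a numerical artefact, while an implicit update on the same equation remains positive and stable.

```python
def forward_euler(x0, k, dt, n_steps):
    """Explicit update for dx/dt = -k * x."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * (-k * xs[-1]))
    return xs

def backward_euler(x0, k, dt, n_steps):
    """Implicit update: x_next = x + dt * (-k * x_next), solved for x_next."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] / (1.0 + k * dt))
    return xs

k_fast = 1000.0   # assumed fast rate constant; the exact solution is always positive
dt = 0.01         # step size that is too large for the explicit scheme

explicit = forward_euler(1.0, k_fast, dt, 50)
implicit = backward_euler(1.0, k_fast, dt, 50)

print("explicit min:", min(explicit))   # negative values: a numerical artefact
print("implicit min:", min(implicit))   # stays positive and decays towards zero
```

      In practice one would use an established stiff solver rather than a hand-written scheme, for example MATLAB's `ode15s` or SciPy's `solve_ivp` with `method="BDF"` or `method="Radau"`. The sketch only illustrates that negative states and apparent instability can originate from the integrator rather than from the parameter set.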

      The reported rejection rate - approximately 90% of sampled parameter sets - is substantial and raises questions regarding the interplay between model structure, parameter ranges, and numerical methods. As currently described, the filtering step appears to select parameter sets based primarily on computational tractability rather than on experimentally motivated biological criteria. The manuscript would be strengthened by clarifying whether the retained parameter sets are representative of biologically meaningful regimes, and by distinguishing clearly between exclusions based on biological plausibility and those arising from numerical considerations.

      Finally, important aspects of the simulation protocol require clarification. The model is simulated under "fasted" and "fed" conditions until equilibrium is reached, yet the criterion used to determine convergence is not specified. It would be important to describe how equilibrium is assessed (e.g., based on the norm of the time derivatives). Additionally, it remains unclear whether the mitogenic stimulus applied in the "fed" phase is assumed to be constant over time and, if so, how this assumption relates to biological experimental conditions. Greater detail on these implementation choices is necessary to ensure interpretability and reproducibility.
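
      One possible way to make the equilibrium criterion explicit is sketched below in Python: integration stops once the Euclidean norm of the time derivatives falls below a tolerance. The toy right-hand side, the tolerance, and the function names are assumptions for illustration and are not taken from the manuscript.

```python
import math

def rhs(state, k):
    """Time derivatives of a toy two-species cascade (hypothetical model)."""
    x, y = state
    k_syn, k_deg_x, k_act, k_deg_y = k
    return [k_syn - k_deg_x * x, k_act * x - k_deg_y * y]

def run_to_equilibrium(state, k, dt=0.01, tol=1e-8, max_steps=1_000_000):
    """Forward-Euler integration until ||dx/dt|| < tol or max_steps is reached.

    Returns (final_state, elapsed_time, converged_flag).
    """
    for step in range(max_steps):
        deriv = rhs(state, k)
        norm = math.sqrt(sum(d * d for d in deriv))
        if norm < tol:
            return state, step * dt, True
        state = [s + dt * d for s, d in zip(state, deriv)]
    return state, max_steps * dt, False

final_state, t_eq, converged = run_to_equilibrium([0.0, 0.0], (2.0, 0.5, 1.0, 0.25))
print(final_state, t_eq, converged)
```

      Reporting the tolerance, the norm used, and the fraction of runs that hit the step limit without converging would allow readers to judge whether "equilibrium" was actually reached before drug application.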

      (4) The manuscript states that the modelling conclusions are strongly supported by existing literature; however, the validation presented does not fully substantiate this claim. As noted above, the comparison with CDK2 and CDK4/6 experimental data remains qualitative, and the use of arbitrary simulation time units complicates interpretation of temporal agreement. The extent to which the model quantitatively or mechanistically recapitulates experimentally observed dynamics therefore remains uncertain.

      The claim that the model reproduces known resistance mechanisms is also difficult to assess in light of Figure S10, where a large fraction of network nodes (~80%) appear implicated in resistance under some conditions. If most components of the network can, in at least some parameter regimes, be associated with resistance phenotypes, the resulting lack of selectivity weakens the strength of model-based validation. It becomes challenging to distinguish specific mechanistic insights from generic consequences of network connectivity.

      In addition, the Supplementary Information notes that certain components of the mitogenic and cell-cycle pathways were abstracted or excluded in order to maintain computational tractability. While such abstraction is understandable in a large ODE framework, it raises interpretative questions. Proteins identified as potential resistance drivers within the model may, in some cases, represent aggregated or simplified pathway effects. Clarifying in the main text how such abstractions may influence the attribution of resistance mechanisms would strengthen the biological interpretation of the results.

      Drug inhibition is central to the manuscript's conclusions. The revised version clarifies that inhibition is implemented as a fixed fractional modification of specific kinetic rate laws. This abstraction is appropriate for exploring network-level responses, but it represents a stylised perturbation rather than a pharmacologically calibrated model of drug action. For full interpretability and reproducibility, the mathematical form of the modified rate laws, as well as the timing of inhibition relative to network equilibration, should be specified unambiguously. The biological implications of the findings depend critically on understanding this modelling choice.

      The one-at-a-time perturbation analysis presented in Figure 5 provides an interpretable ranking of first-order control points across the ensemble and offers mechanistic insight into primary sensitivities of the network. However, many targeted therapies act on multiple components, and resistance frequently arises through combinatorial mechanisms. The reported rankings should therefore be interpreted as identifying primary influences under isolated perturbations, rather than as a comprehensive account of multi-target drug behaviour.

      Overall, the manuscript succeeds in presenting a conceptual and exploratory framework for analysing how signalling network topology can shape the qualitative landscape of adaptive responses under heterogeneous kinetic conditions. Its principal contribution lies in establishing a systematic platform for large-scale in silico exploration. At the same time, the current limitations in biological calibration, parameter grounding, and validation constrain the extent to which the conclusions can be interpreted as predictive or quantitatively representative of specific tumour contexts. Addressing these issues would further strengthen the connection between the theoretical landscape described here and experimentally observed resistance dynamics.

    1. Reviewer #2 (Public review):

      Summary:

      This paper attempts to examine how rare, extreme events impact decision-making in rats. The paper used an extensive behavioural study with rats to evaluate how the probability and magnitude of outcomes impact preference. The paper, however, provides limited evidence for the conclusions because the design did not allow for the isolation of the rare, extreme events in choice. There are many confounding factors, including the outcome variance and the presence of less-rare, less-extreme outcomes in the same conditions.

      Strengths:

      (1) The major strength of the paper is the significant volume of behavioural data with a reasonable sample size of 20 rats.

      (2) The paper attempts to examine losses with rats (a notoriously tricky problem with non-human animals) by substituting time-outs as a proxy for losses. This allows for mixed gambles that have both gains and losses as possible outcomes.

      (3) The paper integrates both a behavioural and a modelling approach to get at the factors that drive decision-making.

      (4) The paper takes seriously the question of what it means for an event to be rare, pushing to less frequent outcomes than usually used with non-human animals.

      Weaknesses:

      (1) The primary issue with this work is that the primary experimental manipulation fails to isolate the rare, extreme events in choice. As I understand the task, in all the conditions with a rare extreme event (e.g., 80 pellets with probability epsilon), there is also a less-rare, less-extreme event (e.g., 12 pellets with probability 5). In addition, the variance differs between the two conditions. So, any impact attributable to the rare, extreme event could be due to the less-rare event or due to differences in the variance (or other statistical moments, like skew or kurtosis). That the distributions can be shown to be different under specific assumptions about value-maximizing agents (e.g., with Jensen Gaps and Table 2) is not really relevant to what rats are sensitive to and what drives their behaviour. The design here does not support the conclusions. Finally, by deliberately confounding rarity and extremity, the design does not allow for assessing the impact of either aspect on rat behaviour.

      (2) The RL modelling work also fails to show a specific impact of the rare extreme event. As best as I can understand Eq 2, the model provides a free parameter that adds a bonus to the value of either the two options with high-variance gains (A and V in the paper) or to the two options with high-variance losses (F and V in the paper). Or equivalently to the ones with "Jackpots" vs the ones with "Black Swans" (see Point 1 above as to how these different aspects are all confounded in this design). This parameter seems to depend only on whether this option could have possibly yielded the rare, extreme outcome (i.e., based on the generative probability) and was not connected to its actual appearance. [This point is unclear as the text says this, but the rebuttal states otherwise; plus some options never received the REE, see Table S11]. That makes it a free parameter that just bumps up (or down) the probability of selecting a pair of options. That may be due to the presence of the REE or the other rare event or just the variance difference. Moreover, in the case of the "black swan" or high-variance loss conditions, this seems very much like a loss aversion parameter, but an additive one instead of a multiplicative one. Is there a theoretical claim here that "extreme losses" need an additive loss-aversion parameter?

      (3) The paper presented the methods and results with lots of neologisms and fairly obscure jargon (e.g., fragility, total REE sensitivity). That makes it very hard to decipher exactly what was done and what was found. For example, on p. 4, the use of concave and convex was very hard to decipher; the text even has to repeat itself 3 times (i.e., "to repeat" and "in other words") and is still not clear. It would be much clearer (and probably more accurate) to say that the options varied along the variance dimension, separately for gains and losses. Option A was low-variance gains and losses. Option B was low-variance losses and high-variance gains. Option C was high-variance losses and low-variance gains, and Option D was high-variance losses and gains. That tells much more clearly what the animals experienced without the reader having to master a set of new terminologies around fragility and robustness, which brings a set of theoretical assumptions unnecessarily into the description of the experimental design. Alternatively, if the authors are wary of using the term "variance" because other moments of the distribution also differ, they could use "high-value gains" or "high-value losses" or something else which does not obscure the experimental design with jargon. Again, this goes back to point 1 above, whereby the different options differ on so many dimensions (as is made even more apparent in the rebuttal) that the design cannot isolate the impact of the variables of interest.

      (4) Were the probabilities shuffled or truly random (they seem to be fixed sequences, so neither)? What were the experienced probabilities? Given the fixed sequences, these experienced ("ex-post") probabilities could differ tremendously from the scheduled ("ex ante") probabilities. It's quite possible that an animal never experienced the rare, extreme event for a specific option. From Table S11, that is guaranteed to have happened in that 4 animals only ever experienced the "black swan" outcome once. It's even possible (if they only picked a specific option on the 10th/60th choices by chance), that they only ever experienced that rare extreme event. This point still cannot be known given the information provided, which does not break down outcomes by options. The Supplemental in Table S11 only gives overall numbers but does not indicate what the rats experienced for each choice/option, which is what matters here. A simple table that indicates, for each of the 4 options, how often they were selected and how often the animals experienced each of the 6-8 possible outcomes would make it much clearer how closely the experience matched the planned outcomes. In addition, by restricting the rare outcome to either the 10th or 60th activations in a session, these are not random. Did the animals learn this association? The text states that they did not, but no evidence is provided.
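
      The kind of per-option table requested here could be produced with a few lines of code. The Python sketch below assumes a hypothetical trial log of (option, outcome) pairs for one animal; the option labels, outcome values, and function name are invented for illustration and do not come from the dataset.

```python
from collections import Counter, defaultdict

def experienced_frequencies(trials):
    """Tabulate, per option, how often it was chosen and the ex-post outcome frequencies."""
    counts = defaultdict(Counter)
    for option, outcome in trials:
        counts[option][outcome] += 1
    table = {}
    for option, outcome_counts in counts.items():
        n_choices = sum(outcome_counts.values())
        table[option] = {
            "n_choices": n_choices,
            "freq": {o: c / n_choices for o, c in outcome_counts.items()},
        }
    return table

# Hypothetical log: (chosen option, outcome in pellets) for one animal.
trials = [("A", 1), ("A", 1), ("A", 3), ("B", 1), ("B", 80), ("A", 1)]

table = experienced_frequencies(trials)
for option in sorted(table):
    print(option, table[option])
```

      Placing these experienced frequencies next to the scheduled probabilities, per animal and per option, would show directly whether a given rat ever encountered the rare, extreme outcome on the option in question.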

      (5) The choice data are generally presented in an overprocessed fashion with a sum and a difference (in both figures and tables). The basic datum (probability/frequency of selecting each of the 4 options) is not provided directly in the main text, even if it can theoretically be inferred from the sum and the difference. The new right side of Table S4 is probably the most valuable piece in terms of explaining what rats did and should be highlighted a lot more. Inspection of that table reveals some interesting (and potentially worrying) results. Most notably, the vast majority of responding happens on the "anti-fragile" and "robust" options, often totalling around 90% of all selections, especially amongst the most common blue rats. Alas, those were the two options that were deliberately assigned to the two most preferred holes in the training phase (see p. 26). Does this reflect genuine preference for reward distributions or does this reflect a spatial hole bias? The assignment strategy makes this impossible to tell apart.

      (6) There is insufficient detail provided on the inferential statistical tests (e.g., no degrees of freedom or effect sizes), and only limited information on exactly what tests were run and how (bootstrapping, but little detail). Without code or data (only summary information is provided in the supplement), this is difficult to evaluate. In addition, the studies seem not to have been pre-registered in any way, leaving many research degrees of freedom. Not all studies need to be pre-registered and sometimes discovery of new things requires exploratory work, but preregistration does provide additional safeguards against overemphasizing post-hoc detected patterns, a serious issue in behavioural science. Moreover, this promotes transparency in reporting results and analyses, allowing for a better assessment of the strength of evidence for a claim. For example, here, were any alternative analysis pipelines attempted? Also, there were many sub-groupings of the animals and subsequent comparisons between them which all seemed post-hoc. On what grounds were these divisions made, and were other divisions examined as well?

      (7) On p. 12 (Fig 4), there is an attempt to look at the impact of a rare, extreme event by plotting a measure of preference for the 10 trials before/after the rare, extreme event. In the human literature, the main impact of experiencing a rare, extreme event is what is known as the wavy recency effect (See Plonsky et al. 2015 in Psych Review for example, now cited). What this means is that there tends to be some immediate negative recency (e.g., avoiding a rare gain) followed by positive recency (e.g., chasing the rare gain). Typically, this refers to the specific option that yielded that outcome. First, as the other analyses do, the current analysis combines choice of the option that yielded the rare outcome with choice of other options, so that cannot directly assess the impact of the rare, extreme event on choice. Also, using a 10-trial window would thus obscure any impact of this rare, extreme event. There is mention of the very next trial, but an analysis that looks at the 10-trial time course trial-by-trial could reveal any impact that might be predicted from the human literature.
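
      A trial-by-trial, option-specific version of this analysis could look like the Python sketch below. It assumes a hypothetical list of chosen options per trial and the trial indices at which a rare, extreme event occurred; the data layout and the function name are illustrative assumptions, not the authors' pipeline.

```python
def event_aligned_choice(choices, ree_trials, window=10):
    """For each lag around an REE, return the fraction of events on which the
    animal chose the same option that yielded the REE (lag 0 is excluded)."""
    lags = [lag for lag in range(-window, window + 1) if lag != 0]
    hits = dict.fromkeys(lags, 0)
    totals = dict.fromkeys(lags, 0)
    for t in ree_trials:
        ree_option = choices[t]          # option that produced the rare outcome
        for lag in lags:
            i = t + lag
            if 0 <= i < len(choices):    # skip lags that fall outside the session
                totals[lag] += 1
                hits[lag] += int(choices[i] == ree_option)
    return {lag: hits[lag] / totals[lag] for lag in lags if totals[lag] > 0}

# Hypothetical session: the REE occurred on trial index 3 (option "A").
choices = ["A", "A", "B", "A", "B", "B", "A"]
curve = event_aligned_choice(choices, ree_trials=[3], window=2)
print(curve)
```

      Averaging such curves across events and animals, one lag at a time, would show whether choice of the REE-yielding option dips immediately after the event and then rebounds, which is the pattern the wavy recency account would lead one to look for.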

      (8) As I understood the method (p. 31), the assignment of options to physical locations was not random or counterbalanced, but deliberately biased to have one of the options in the preferred location. This would seem to create a bias towards a particular option and a bias away from the other options, which confounds the preference data in subsequent analyses. Table S4 reinforces this concern, as the vast majority of responses are clustered in the two most preferred options from training.

      (9) Are delays really losses? This is a big assumption. Magnitude and delay are different aspects of experience, which are not necessarily commensurable and can be manipulated independently. And, for the model, how were these delays transformed into outcomes? Eq 1 skips over that. Is there an assumption of linearity? In addition, I was not wholly clear if the delays meant fewer trials in a session or if the delays merely extended the session and meant longer delays until the next choice period.

      Other points:

      (1) I think the authors still misunderstand the concept of "hot-stove effects". The idea is that the experience of a very bad outcome can lead to avoiding the situation again (i.e., not sampling that option) and can provide the appearance of oversensitivity to that bad outcome. Here, that might be better thought of as "black-swan avoidance". Imagine that, to the rat, all options are equal in value; some initial bad luck in encountering the black swan might then make the animal avoid that option, even though, with enough experience, it would have proven equal in value.

      (2) I am still not convinced that the Jensen inequalities add to this paper in terms of understanding the rat behaviour. That may be more suited for a different paper about the statistical and mathematical properties of certain generative distributions, but not here given what rats actually choose and experience.

      (3) Providing the data open access is very good. The code, however, should be equally available and not just upon request. Code needs to be available for assessment during peer review and for reproducibility checks. There are substantial enough problems with reproducibility in the field that code availability should be a minimum criterion for publication (see Miske et al., 2026 in Nature for the most recent large-scale evaluation of this problem).

      (4) The paper still somewhat mischaracterizes the literature on rare events, posing it as a series of "exceptions", rather than recognizing that a huge chunk of the literature uses rare events rarer than 10%. Also, there is even existing terminology in that literature for exactly the situation that is being created here-rare treasures (aka jackpots here) and rare disasters (aka Black Swans here).

      (5) Defining the observed behaviour in terms of convexity, instead of stating choices more plainly, obscures what is done/found. This is especially the case here because convex and concave mean different things when applied to gains/losses in terms of whether or not that option can lead to the REE. The use of the terms obscures rather than clarifies and probably is best left for the discussion (and maybe the intro) when mapping from theoretical distributions to the experiment at hand. In the paper, even the bottom of p.5 seems to incorrectly define "Total Sensitivity" as the combined proportion of selecting convex options in either domain, which does not map onto how convex is defined in Fig 1B or elsewhere in the text.

      (6) Fig 1C is baffling. Why are probabilities drawn moving away from the origin? The standard scientific plotting convention is for numbers to grow when moving away from the origin. That would be vastly clearer. Also, the color coding is confusing. Green-red maps onto convex-concave, but that would naturally seem to indicate gains vs losses, not convex vs concave. And why are probabilities growing larger in both directions from the origin? A standard plot of magnitude vs probability would likely communicate the procedure far more clearly.

      (7) Discussion: I think the main difference between the human situations discussed and this experiment is that humans have not experienced those rare "black swan" outcomes. Rather, they hear about the disasters that are possible and do not incorporate that information, as discussed in the description-experience literature already cited in this paper (though not in that context).
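
      The hot-stove mechanism raised in point (1) can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not taken from the manuscript: a greedy delta-rule learner, a "safe" option that always pays 0, a "risky" option that pays +1 with probability 0.95 and -19 with probability 0.05 (so both options have the same expected value), and a learning rate of 0.3. Under these assumptions, one early encounter with the rare loss pushes the value estimate of the risky option below that of the safe option, the agent stops sampling it, and the estimate is never corrected.

```python
import random


def run_agent(rng, n_trials=200, p_rare=0.05, gain=1.0, loss=-19.0, alpha=0.3):
    """Simulate one greedy learner; return the share of risky choices.

    All parameters are hypothetical. Expected value of the risky option is
    (1 - p_rare) * gain + p_rare * loss = 0, the same as the safe option.
    """
    q = {"safe": 0.0, "risky": 0.0}  # running value estimates
    risky_choices = 0
    for _ in range(n_trials):
        # Greedy choice; ties are broken at random.
        if q["risky"] > q["safe"]:
            choice = "risky"
        elif q["risky"] < q["safe"]:
            choice = "safe"
        else:
            choice = rng.choice(["safe", "risky"])

        if choice == "risky":
            risky_choices += 1
            outcome = loss if rng.random() < p_rare else gain
        else:
            outcome = 0.0

        # Delta-rule update: only the chosen option is updated, so an
        # avoided option keeps its (possibly too pessimistic) estimate.
        q[choice] += alpha * (outcome - q[choice])
    return risky_choices / n_trials


def mean_risky_share(n_agents=1000, seed=0):
    """Average share of risky choices across many simulated agents."""
    rng = random.Random(seed)
    return sum(run_agent(rng) for _ in range(n_agents)) / n_agents


if __name__ == "__main__":
    print(f"mean share of risky choices: {mean_risky_share():.3f}")
```

      Although the two options are equal in value, the simulated agents choose the risky option well under half of the time. This shows how apparent oversensitivity to the rare bad outcome can arise from biased sampling alone. An agent that keeps exploring would recover more of the true value, which is why the amount of forced or free sampling matters for interpreting such data.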

    1. Reviewer #1 (Public review):

      I read this paper with great interest based on my experience in insect sciences. Previous concerns:

      (1) The paper has an original biological question that is overly broad and mechanistically ambitious. The central biological question, namely how CLas infection enhances fecundity of Diaphorina citri via dopamine signaling, is clearly stated and well motivated by previous literature. However, my advice to the authors is that, while the general question is clear, the manuscript attempts to answer multiple mechanistic layers simultaneously. As a result, I feel that the biological narrative becomes diffuse, especially in later sections where DA, miRNA regulation, AKH signaling, and JH signaling are all proposed as parts of a single linear cascade. In summary, my key concern is that the paper often moves from correlation to causal hierarchy without fully disentangling whether these pathways act sequentially, in parallel, or redundantly. A more explicitly framed primary hypothesis (e.g., "DA-DcDop2 is necessary and sufficient for CLas-induced fecundity") may improve conceptual clarity.

      (2) On the novelty of the data, I feel they are moderately novel, with substantial confirmatory components. If I am correct, the novel contributions include the identification of DcDop2 as the DA receptor responsive to CLas infection in D. citri, the discovery that miR-31a directly targets DcDop2, which is supported by luciferase assays and RIP, and thirdly, the integration of dopamine signaling into the already-described CLas-AKH-JH-fecundity framework. My advice to the authors is to focus more on the manuscript's novelty, which lies more in pathway integration than in discovering fundamentally new biological phenomena. This is appropriate for a mechanistic paper, but should be framed as an extension of existing models rather than a paradigm shift.

      (3) On the conclusions, I recommend that the authors moderate their statements somewhat. I feel that there are some overstated or insufficiently supported claims. First, the assertion that CLas "hijacks" the DA-DcDop2-miR-31a-AKH-JH cascade implies direct pathogen manipulation, but no CLas-derived effector or mechanism is identified. Second, the model suggests a linear signaling hierarchy, but the data largely show correlation and partial dependency rather than strict epistasis. Third, the term "mutualistic interaction" may be too strong, as host fitness costs outside fecundity (e.g., longevity, immunity) are not evaluated. In conclusion, I confirm that the data support a functional association, but mechanistic causality and evolutionary interpretation are somewhat overstated.

      Comments on revised version:

      The authors provided a satisfactory revision.

    2. Reviewer #2 (Public review):

      Summary:

      Nian and colleagues comprehensively apply metabolomics, molecular, and genetic approaches to demonstrate that CLas hijacks the DA/DcDop2-miR-31a-AKH-JH signaling cascade to enhance lipid metabolism and fecundity in D. citri, while concurrently promoting its own replication.

      Strengths:

      These findings provide solid evidence of a mutualistic interaction between CLas proliferation and ovarian development in the insect host. This insight significantly advances our understanding of the molecular interplay between plant pathogens and vector insects and offers novel targets and strategies for HLB field management.

      Weaknesses:

      While the article investigates the involvement of dopamine signaling and specific microRNAs in enhancing fecundity and pathogen proliferation, it still needs to provide a detailed mechanistic understanding of these interactions. The precise molecular pathways and feedback mechanisms by which CLas manipulates dopamine signaling in Diaphorina citri remain unclear.

    1. Reviewer #1 (Public review):

      (1) In this study, the authors aimed to characterize Huntington's disease (HD)-related microstructural abnormalities in the basal ganglia and thalami as revealed using Soma and Neurite Density Imaging (SANDI) indices (apparent soma density, apparent soma size, extracellular water signal fraction, extracellular diffusivity, apparent neurite density, fractional anisotropy and mean diffusivity).

      (2) The study implements a novel biophysical diffusion model that extends up-to-date methodologies and shows significant potential for quantifying neurodegenerative processes of the grey matter of the human brain in vivo. The authors comment on the usefulness of this technique in other pathologies, but they give only multiple sclerosis as an example. Further evidence supporting this broader usefulness should be provided.

      (3) The study found that HD-related neurodegeneration in the striatum accounted significantly for striatal atrophy and correlated with motor impairments. HD was associated with reduced soma density, increased apparent soma size and extracellular signal fraction in the basal ganglia, but not in the thalami. Additionally, these effects were larger at the manifest stage.

      (4) The results of this work demonstrate the impact of HD on basal ganglia and thalami which can be further explored as a non-invasive biomarker of disease progression. Additionally, the study shows that SANDI can be used to explore grey matter microstructure in a variety of neurological conditions.

      Comments on revised version.

      I have no further comments. Thank you

    2. Reviewer #3 (Public review):

      Summary:

      Ioakeimidis and colleagues studied microstructural abnormalities in N=56 Huntington's disease (HD) patients compared to N=57 normative controls. The authors used a powerful MRI Connectom scanner and applied the SANDI model to estimate the soma size, neurite size, soma density, and extracellular fraction in key subcortical nuclei related to HD. In the striatum, they found decreased soma density and increased soma size, which also seemed to become more pronounced in advanced HD individuals in the final exploratory analyses. The authors conducted important analyses to determine whether the SANDI measures correlate with clinical scores (i.e., QMotor) and whether the variance of the striatal volume is explained by the SANDI measures. They found a relationship of SANDI measures to both.

      Strengths:

      The study is both innovative and of high interest for the HD community. The authors provide a rich pool of statistical analyses and results which anticipate the questions that may emerge in the HD research community. Statistics are carefully chosen and image processing is done with state-of-the-art methods and tools. The sample size gives sufficient credibility to the findings. Altogether, I think this study sets a milestone in the attempts of the HD community to understand neuropathological processes with non-invasive methods, and extends the current knowledge of microstructural anomalies identified in HD with diffusion MRI. More importantly, the newly identified anomalies in soma size and soma density open new avenues for studying these biological effects further, and perhaps develop these biomarkers for use in clinical trials.

      Weaknesses:

      (1) An important question is whether the SANDI measures, which require an expensive scanner and elaborate processing, are better biomarkers than the more traditional DTI measures. Can the authors compare the effect sizes of FA/MD with those of the SANDI measures? In some of the plots and tables, FA/MD seem to have comparable, if not higher, correlations with QMotor or CAP scores. In the same vein, it is unclear whether DTI measures were included in the hierarchical stepwise regression. I wonder whether the stepwise models would have picked up FA/MD instead of SANDI measures had they been given the chance. Overall, I hope the authors can also discuss their findings in light of the cost vs. benefit of adopting SANDI in future studies, which is an important topic for clinical trials.

      (2) Similar to the above point, it is very important to consider how strong the biomarker signal from the SANDI measures is compared to the good old striatal volume. Some plots seem to indicate that volumes still have the highest correlation with QMotor, and the highest effect size in group comparisons. It would be helpful for the community to know where the new SANDI measures stand, in terms of effect size, compared to the most typically used volumes.

      (3) The diffusion measures are inevitably correlated to some degree. Please provide a correlation matrix in the supplementary material including all DWI measures, so that readers can better understand how similar the SANDI measures are to each other and to the DTI measures. Adding volumes to this correlation matrix may also provide a good future reference.

      (4) ISS stages:

      (a) The online ISS calculator requires cut-offs derived from the longitudinal Freesurfer pipeline, while the authors do not have longitudinal data. Thus, the ISS classification might be inaccurate to some degree if the authors used the FS cross-sectional pipeline. Please review this issue and see if updated cut-offs should be used to classify participants.

      (b) Were there really no participants with ISS 0 among the 56 HD individuals? Please clarify in the manuscript.

      (c) A note on terminology that might be confusing to some readers. According to the creators of ISS, the ISS stages are created for research only; they are not used or applied in the clinic. On the other hand, the terms "premanifest" and "manifest" have a clinical meaning, typically based on the diagnostic confidence level. The assignment of ISS0-1 to premanifest and ISS2-3 to manifest may create some non-trivial confusion, if not opposition, in some segments of the HD community. The authors can keep their current terminology but will need to at least clarify to the reader that this assignment is speculative, does not fully match the clinically-based categories, and should not be confused with similarly named groups in the previous literature.
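
      The correlation matrix requested in point (3) could be produced along the following lines. This is a minimal sketch only: the measure names are shorthand chosen here, and the numbers are made-up placeholder values, not data from the study. In practice, the per-participant regional values would be substituted for the toy dictionary.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(measures):
    """Return a nested dict: matrix[a][b] is the correlation of a with b."""
    names = list(measures)
    return {a: {b: pearson(measures[a], measures[b]) for b in names}
            for a in names}


# Placeholder values for six hypothetical participants (NOT study data).
toy = {
    "soma_density": [0.42, 0.38, 0.35, 0.31, 0.29, 0.25],
    "FA":           [0.30, 0.28, 0.29, 0.25, 0.24, 0.22],
    "MD":           [0.80, 0.84, 0.83, 0.90, 0.93, 0.97],
    "volume":       [9.8, 9.1, 8.9, 8.0, 7.6, 7.1],
}

matrix = correlation_matrix(toy)

if __name__ == "__main__":
    names = list(toy)
    print(" " * 14 + "".join(f"{n:>14}" for n in names))
    for a in names:
        print(f"{a:>14}" + "".join(f"{matrix[a][b]:>14.2f}" for b in names))
```

      Reporting the full matrix, with volumes included, would let readers judge at a glance how much independent information each SANDI measure carries beyond FA/MD and volume.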

      Comments on revised version.

      The authors have moved to address many points from reviewers. The manuscript has indeed become more objective, transparent, and to the point. The amount of information and analyses is large, which perhaps is inevitable when new methods are being tested for the first time in a neurodegenerative disease.

    1. Reviewer #1 (Public review):

      Integrating large-field stimulation with a retinotopic atlas, this study introduces an fMRI-based method for measuring contrast sensitivity across the visual field. Retinotopy was assessed using pRF mapping and a calibrated Benson atlas. The authors validate their method by replicating known patterns of contrast sensitivity across eccentricities and visual field quadrants in healthy subjects, and demonstrate its potential clinical utility through case studies of both simulated and real visual field loss.

      Comments on revisions:

      I appreciate the addition of the quadrant-scotoma condition and the authors' clarification that the goal is to demonstrate individual-level detection sensitivity. The 95% CI argument is reasonable, and I am satisfied with framing the simulated-scotoma work as proof-of-concept.

    2. Reviewer #2 (Public review):

      Summary

      This study uses functional MRI to evaluate visual contrast sensitivity across the visual field at the level of the visual cortex, testing the method as a proof of principle in a small group of normally sighted individuals, modelling both normal vision and simulated vision loss, as well as in a patient with independently verified vision loss. The results suggest a promising technique that measures vision objectively across the visual field and overcomes the requirement for careful fixation, which is often challenging for those with low vision or sight loss.

      Strengths

      • Objective measure of central vision: The proposed method may provide a more comprehensive and objective assessment of residual visual function in individuals with sight loss. This may be particularly useful for those with central visual field loss without the requirement of stable fixation or subjective motor responses.

      • More sensitive measure: The use of slope to calculate contrast sensitivity across a range of contrasts within the brain is clever and likely more sensitive than single threshold measurements or standard clinical measures of visual acuity using letter charts. Standard supra-threshold (high contrast) tests are not ideal for capturing residual vision or partial vision loss.

      • Good agreement with standard atlas: The Benson atlas provides a good estimate of visual field maps within V1 based on anatomical landmarks, and the authors take steps to refine this informed by cortical magnification and V1 surface area (brain size) for each individual participant. This could allow the technique to be generalised without the need to collect lengthy individual mapping data from every participant.

      • Within-subject reproducibility: The measurements appear to be sensitive and reproducible, particularly in those with normal vision, and are consistent with known features of visual sensitivity differences in different parts of the visual field.

      • Potential tool to measure visual field sensitivity in controls: Even if the proposed methods are not ideal for widespread clinical translation, they do offer an exciting tool to test hypotheses about visual field differences in healthy controls. For example, there seems to be an increase in sensitivity on either side of the simulated ring scotoma (Fig 6 - perhaps due to the release of lateral inhibition?). Reliability measures suggest that individual differences are consistent in healthy controls (although not tested statistically, perhaps due to the small sample size?). Whether they reflect behaviourally meaningful differences in visual field sensitivity could be tested in individuals by comparing them to behavioural measures across the visual field.

      • Potential tool to test novel treatments: The proposed techniques could be used to test within-subject changes in visual function in environments that are equipped to measure and analyse fMRI data, including clinical trials aimed at determining the success of novel treatments. Preliminary testing in healthy controls with eye movements also suggests that the method is suitable for testing low vision patients with unstable fixation (e.g., nystagmus), and the authors have modelled the effects of varying amounts and types of eye movements on functional outcome measures.

      Weaknesses

      • Questionable sensitivity to differences in patients. The variability in heat maps across healthy control participants is somewhat surprising, and it is uncertain whether they represent actual visual sensitivity differences or an artifact of the measurement technique, e.g., due to signal-to-noise differences introduced by local variations in brain anatomy. Thus, it is uncertain whether the substantial variance across controls will allow for a sufficiently stable baseline to detect meaningful differences in individual patients. Also, as the authors rightly point out, the Benson atlas does not model differences along meridians, so upper/lower field differences might not be detectable. However, the authors acknowledge that this is a pilot study, and further testing of a wider range of scotoma types, both in patients and simulated in controls, will only improve the methods. Furthermore, the ability to capture visual field representations in human visual cortex is also likely to improve with computational advances, making the use of atlases more feasible and obviating the need for individualised population receptive field mapping.

      • Potential for clinical translation. Although it is a sensitive measure, functional MRI is costly, is not available in all clinical settings, requires significant post-processing analyses, and may be contraindicated in some individuals due to safety (e.g., metallic implants) or other concerns (e.g., claustrophobia). These could present significant barriers to widespread clinical translation, if this were the ultimate goal of the study.

      • Limited range of spatial frequencies. The spatial frequencies tested were still quite low (0.3 and 3 cpd) compared to measures such as visual acuity. Extending the measurements to higher spatial frequencies could allow better characterization of central vision, although not necessarily of peripheral vision. However, this may depend on the typical visual abilities of the patient population of interest.

      Appraisal and Impact:

      The authors used appropriate and robust methods to assess and model known features of visual sensitivity differences across the visual field in sighted controls. In addition, the assessment technique successfully captured sensitivity changes due to simulated and actual partial field loss and was also fairly resilient to the eye movements and fixation instability typical of patients with sight loss. Although currently providing a proof of principle, the method is likely to improve with further testing and increasing normative sample sizes, and as computational methods continue to advance visual field map predictions. Although it may not be adopted widely as a standard clinical assessment technique due to the expense and other obstacles, it would provide a valuable tool for assessing clinical populations, for example in the context of clinical trials to assess suitability for treatment interventions or monitor treatment outcomes.

    3. Reviewer #3 (Public review):

      Summary:

      Chow-Wing-Bom et al. introduce an innovative wide-field visual stimulation setup for 3T experiments that enables stimulation up to a diameter of 40° visual angle while allowing continuous gaze tracking. Using this setup, the authors systematically investigate contrast sensitivity across the visual field by presenting subjects with sinusoidal gratings varying in contrast and spatial frequency. Their findings confirm the expected organization of contrast sensitivity, demonstrating a preference for high spatial frequencies in the central field and lower frequencies in the periphery. They also extend these measurements to eccentricities up to 20°, which exceeds previous fMRI-based reports. Moreover, the study explores the potential of using contrast sensitivity calculations as a method for detecting visual field defects, demonstrated in a healthy subject with simulated ring-shaped and upper-right-quadrant scotomas, and in a patient with LHON. The revised version additionally characterises the robustness of the approach to varying degrees of fixation instability.

      Strengths:

      - The manuscript is well written and provides comprehensive methodological details, ensuring high transparency and reproducibility.

      - The visual stimulation setup represents a significant technical advance by enabling wide-field stimulation with continuous eye tracking, which is crucial for both research and potential clinical applications.

      - The study confirms established findings regarding the organization of contrast sensitivity while extending them to a larger eccentricity range.

      - The effort to establish a measure for visual field losses aligns with current efforts to develop objective alternatives to conventional perimetry.

      - The revised manuscript includes an empirical assessment of how varying levels of eye movement affect cortical contrast sensitivity estimates, providing useful guidance on the tolerance of the approach to fixation instability.

      Weaknesses:

      - The original version left certain methodological aspects unclear, particularly the correction of eccentricity values from the Benson atlas and the V1 masks used in each analysis branch. The authors have added a dedicated figure illustrating the eccentricity correction procedure and now explicitly state that a manually delineated V1 mask was used for the pRF-based analyses while the Benson V1 label was used for the atlas-based analyses, together with a discussion of how this difference may influence the comparison.

      - Minor inconsistencies in reporting, such as the introduction of a second session in the Results section, have been corrected.

      The conclusion that high-contrast patterns, as used in pRF mapping, are not optimal for testing subtle but potentially clinically relevant changes in visual field coverage is valid. Contrast sensitivity may therefore be a well-suited parameter for estimating visual field losses. The presented work is an interesting starting point, and the proposed method of using contrast sensitivity as a measure of partial vision loss should be further explored.

      Comments on revisions:

      The authors have thoroughly addressed all points raised in my original review, and I have no further concerns.