Oct 2025
www.biorxiv.org

Foveated metamers of the early visual system

1. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 This is an interesting study on the nature of representations across the visual field. The question of how peripheral vision differs from foveal vision is a fascinating and important one. The majority of our visual field is extra-foveal, yet our sensory and perceptual capabilities decline in pronounced and well-documented ways away from the fovea. Part of the decline is thought to be due to spatial averaging ('pooling') of features. Here, the authors contrast two models of such feature pooling with human judgments of image content. They use much larger visual stimuli than in most previous studies, and some sophisticated image synthesis methods to tease apart the predictions of the distinct models.
 
 More importantly, in so doing, the researchers thoroughly explore the general approach of probing visual representations through metamers: stimuli that are physically distinct but perceptually indistinguishable. The work is embedded within a rigorous and general mathematical framework for expressing equivalence classes of images and how visual representations influence these. They describe how image-computable models can be used to make predictions about metamers, which can then be compared to make inferences about the underlying sensory representations. The main merit of the work lies in providing a formal framework for reasoning about metamers and their implications, for comparing models of sensory processing in terms of the metamers that they predict, and for mapping such models onto physiology. Importantly, they also consider the limits of what can be inferred about sensory processing from metamers derived from different models.
 
 Overall, the work is of a very high standard and represents a significant advance over our current understanding of perceptual representations of image structure at different locations across the visual field. The authors do a good job of capturing the limits of their approach, and I particularly appreciated the detailed and thoughtful Discussion section and the suggestion to extend the metamer-based approach described in the MS with observer models. The work will have an impact on researchers studying many different aspects of visual function including texture perception, crowding, natural image statistics and the physiology of low- and mid-level vision.
 
 The main weaknesses of the original submission relate to the writing. A clearer motivation could have been provided for the specific models that they consider, and the text could have been written in a more didactic and easy to follow manner. The authors could also have been more explicit about the assumptions that they make.
 
 Comments following re-submission:
 
 Overall, I think the authors have done a satisfactory job of addressing most of the points I raised.
 
 There's one final issue which I think still needs better discussion.
 
 I think reviewer 2 articulated better than I have the point I was concerned about: the relationship between JNDs and metamers as depicted in the schematics and indeed in the whole conceptualization.
 
 I think the issue here is that there seems to be a conflation of two concepts, 'subthreshold' and 'metamer', and I'm not convinced it is entirely unproblematic. It's true that two stimuli that cannot be discriminated from one another due to the physical differences being too small to detect reliably by the visual system are a form of metamer in the strict definition 'physically different, but perceptually the same'. However, I don't think this is the scientifically substantial notion of metamer that enabled insights into trichromacy. That form of metamerism is due to the principle of univariance in feature encoding, and involves conditions in which physically very different stimuli are mapped to one and the same point in sensory encoding space whether or not there is any noise in the system. When I say 'physically very different' I mean different by a large enough amount that they would be far above threshold, potentially orders of magnitude larger than a JND, if the system's noise properties were identical but the system used a different sensory basis set to measure them. This seems to be a very different kind of 'physically different, but perceptually the same'.
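 The distinction drawn above can be made concrete with a small numerical sketch. It is illustrative only: the three Gaussian sensitivity curves, the wavelength grid, and the names `sensitivities`, `encode`, `spectrum_a`, and `spectrum_b` are assumptions made for this example, not real cone fundamentals and not anything taken from the manuscript. The sketch shows a noise-free encoder that maps two physically very different spectra to the same point in its three-dimensional encoding space.

```python
import numpy as np

# Hypothetical sensor sensitivities: three Gaussian curves sampled at 31 wavelengths.
# These are placeholders for illustration, not measured cone fundamentals.
wavelengths = np.linspace(400.0, 700.0, 31)
peaks = np.array([440.0, 540.0, 570.0])
sensitivities = np.exp(-0.5 * ((wavelengths[None, :] - peaks[:, None]) / 40.0) ** 2)

def encode(spectrum):
    """Noise-free linear encoding: one number per sensor class."""
    return sensitivities @ spectrum

# A flat reference spectrum.
spectrum_a = np.ones(wavelengths.size)

# The rows of vt beyond the rank of `sensitivities` span its null space, so
# adding such a direction changes the stimulus without changing the encoding.
_, _, vt = np.linalg.svd(sensitivities)
null_direction = vt[-1] / np.abs(vt[-1]).max()

# A large physical change (up to 90% of the reference level) that is invisible
# to this encoder regardless of its noise level.
spectrum_b = spectrum_a + 0.9 * null_direction

print("max physical difference:", np.abs(spectrum_b - spectrum_a).max())
print("max encoding difference:", np.abs(encode(spectrum_b) - encode(spectrum_a)).max())
```

 In this toy setting the two spectra are metamers because of the projection itself, not because their difference falls below a noise-limited threshold, which is the sense of metamerism contrasted with 'subthreshold' above.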
 
 I do think the notion of metamerism can obviously be very usefully extended beyond photoreceptors and photon absorptions. In the interesting case of texture metamers, what I think is meant is that stimuli would be discriminable if scrutinised in the fovea, but because they have the same statistics they are treated as equivalent. I think the discussion of this could still be more clearly articulated in the manuscript. It would benefit from a more thorough discussion of the difference between metamerism and subthreshold, especially in the context of the Voronoi diagrams at the beginning.
 
 It needs to be made clear to the reader why it is that two stimuli that are physically similar (e.g., just spanning one of the edges in the diagram) can be discriminable, while at the same time, two stimuli that are very different (e.g., at opposite ends of a cell) can't.
 
 Do the cells include BOTH those sets of stimuli that cannot be discriminated just because of internal noise AND those that can't be discriminated because they are projected to literally the same point in the sensory encoding space? What are the strengths and limits of models that involve the strict binarization of sensory representations, and how can they be integrated with models dealing with continuous differences? These seem like important background concepts that ought to be included in either the introduction or discussion sections. In this context it might also be helpful to refer to the notion of 'visual equivalence' as described by:
 
 Ramanarayanan, G., Ferwerda, J., Walter, B., & Bala, K. (2007). Visual equivalence: towards a new standard for image fidelity. ACM Transactions on Graphics (TOG), 26(3), 76-es.
 
 Other than that, I congratulate the authors on a very interesting study, and look forward to reading the final version.
 
2. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The authors have improved clarity overall and have spoken to most of the issues raised by the reviewers. There are still two outstanding problems, however, where issues raised during the review were inappropriately dismissed in the manuscript. These should be explicitly addressed, the first as a limitation of the results presented (no eye tracking) and the second as an early pilot experiment that informed the experiments as presented (pink noise), rather than brushed off as 'unnecessary' and 'would be uninformative'.
 
 Eye tracking:
 
 It is generally accepted that experiments testing stimuli presented at specific locations in peripheral vision require eye tracking to ensure that the stimulus is presented as expected, in particular, in the correct location. As I stated in the previous round of review, while a stimulus presentation time of 200ms does help eliminate some saccades, it does not eliminate the possibility that subjects were not fixating well during stimulus onset. I am also unclear what the authors mean by 'trained observer' in this context, though the authors state that an author subject in a different portion of the paper is an 'expert observer'. Does this mean the 'trained observers' are non-expert recruited subjects? Given that the conditions tested differ from previous work (Freeman & Simoncelli, 2011), which DID include eye tracking in a subset of subjects (*these differences are a main contribution of the paper!*), it is entirely possible to get similar results to this work in the context of stimulus presentation that is not controlled by eye tracking. The reasons now given in the manuscript do not justify eye tracking being 'considered unnecessary'.
 
 I appreciate that the authors now state the lack of eye tracking explicitly, but believe the paper needs to at least state that this is a limitation of the results reported. Describing eye tracking as 'considered unnecessary' is neither reasonable nor a norm in this subfield.
 
 N=1: The authors now state clearly the limitations of a single subject in the manuscript, and state the expertise level of this subject.
 
 Large number of trials: The authors now address this and include an enumeration of the large number of trials.
 
 Simple Models / Physiology comparison: I support the choice to reduce claims regarding tight connections to physiology, and appreciate the explanation of the luminance model.
 
 Previous Work: I appreciate the authors' changes to the introduction, both in discussing previous work and citation fixes.
 
 Blurred White, Pink Noise: While the authors now address pink noise, the explanation for such stimuli being expected to be uninformative is confusing to me. The manuscript now first states that pink noise is a natural choice, then claims it would be uninformative, while also stating in the rebuttal (not the manuscript) that they tried it and it indeed reduced the artifacts they note. The logic of the experiments indeed relies on finding the smallest critical scaling value, which is measured by subjects determining if a synthesis is similar or different to a target or second synth. A synthesis free from artifacts would surely affect the subjects' responses and the smallest critical scaling measured.
 
 The statement that the authors experimented with pink noise early on and found this able to address the artifacts should be stated in the manuscript itself, not just in the rebuttal, and the blanket statement that this experiment would be 'uninformative' is incorrect. Surely this early pilot the authors mention in the rebuttal was informative to designing the experiments that appear in the final paper, and would be an informative experiment to include.
 
3. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public Review):
 
 This is an interesting study of the nature of representations across the visual field. The question of how peripheral vision differs from foveal vision is a fascinating and important one. The majority of our visual field is extra-foveal, yet our sensory and perceptual capabilities decline in pronounced and well-documented ways away from the fovea. Part of the decline is thought to be due to spatial averaging (’pooling’) of features. Here, the authors contrast two models of such feature pooling with human judgments of image content. They use much larger visual stimuli than in most previous studies, and some sophisticated image synthesis methods to tease apart the predictions of the distinct models.
 
 More importantly, in so doing, the researchers thoroughly explore the general approach of probing visual representations through metamers: stimuli that are physically distinct but perceptually indistinguishable. The work is embedded within a rigorous and general mathematical framework for expressing equivalence classes of images and how visual representations influence these. They describe how image-computable models can be used to make predictions about metamers, which can then be compared to make inferences about the underlying sensory representations. The main merit of the work lies in providing a formal framework for reasoning about metamers and their implications, for comparing models of sensory processing in terms of the metamers that they predict, and for mapping such models onto physiology. Importantly, they also consider the limits of what can be inferred about sensory processing from metamers derived from different models.
 
 Overall, the work is of a very high standard and represents a significant advance over our current understanding of perceptual representations of image structure at different locations across the visual field. The authors do a good job of capturing the limits of their approach and I particularly appreciated the detailed and thoughtful Discussion section and the suggestion to extend the metamer-based approach described in the MS with observer models. The work will have an impact on researchers studying many different aspects of visual function including texture perception, crowding, natural image statistics, and the physiology of low- and mid-level vision.
 
 The main weaknesses of the original submission relate to the writing. A clearer motivation could have been provided for the specific models that they consider, and the text could have been written in a more didactic and easy-to-follow manner. The authors could also have been more explicit about the assumptions that they make.
 
 Thank you for the summary. We appreciate the positives noted above. We address the weaknesses point by point below.
 
 Reviewer #2 (Public Review):
 
 Summary
 
 This paper expands on the literature on spatial metamers, evaluating different aspects of spatial metamers including the effect of different models and initialization conditions, as well as the relationship between metamers of the human visual system and metamers for a model. The authors conduct psychophysics experiments testing variations of metamer synthesis parameters including type of target image, scaling factor, and initialization parameters, and also compare two different metamer models (luminance vs energy). An additional contribution is doing this for a field of view larger than has been explored previously.
 
 General Comments
 
 Overall, this paper addresses some important outstanding questions regarding comparing original to synthesized images in metamer experiments and begins to explore the effect of noise vs image seed on the resulting syntheses. While the paper tests some model classes that could be better motivated, and the results are not particularly groundbreaking, the contributions are convincing and undoubtedly important to the field. The paper includes an interesting Voronoi-like schematic of how to think about perceptual metamers, which I found helpful, but for which I do have some questions and suggestions. I also have some major concerns regarding incomplete psychophysical methodology including lack of eye-tracking, results inferred from a single subject, and a huge number of trials. I have only minor typographical criticisms and suggestions to improve clarity. The authors also use very good data reproducibility practices.
 
 Thank you for the summary. We appreciate the positives noted above. We address the weaknesses point by point below.
 
 Specific Comments
 
 Experimental Setup
 
 Firstly, the experiments do not appear to utilize an eye tracker to monitor fixation. Without eye tracking or another manipulation to ensure fixation, we cannot ensure the subjects were fixating the center of the image, and viewing the metamer as intended. While the short stimulus time (200ms) can help minimize eye movements, this does not guarantee that subjects began the trial with correct fixation, especially in such a long experiment. While Covid-19 did at one point limit in-person eye-tracked experiments, the paper reports no such restrictions that would have made the addition of eye-tracking impossible. While such a large-scale experiment may be difficult to repeat with the addition of eye tracking, the paper would be greatly improved with, at a minimum, an explanation as to why eye tracking was not included.
 
 Addressed on pg. 25, starting on line 658.
 
 Secondly, many of the comparisons later in the paper (Figures 9, 10) are made from a single subject. N=1 is not typically accepted as sufficient to draw conclusions in such a psychophysics experiment. Again, if there were restrictions limiting this it should be discussed. Also (P11), is subject sub-00 an author? Another expert? A naive subject? The subject’s expertise in viewing metamers will likely affect their performance.
 
 Addressed on pg. 14, starting on line 308.
 
 Finally, the number of trials per subject is quite large. 13,000 over 9 sessions is much larger than most human experiments in this area. The reason for this should be justified.
 
 In general, we needed a large number of trials to fit full psychometric functions for stimuli derived for both models, with both types of comparison, both initializations, and over many target images. We could have eliminated some of these, but feel that having a consistent dataset across all these conditions is a strength of the paper.
 
 In addition to the sentence on pg. 14, line 318, a full enumeration of trials is now described on pg. 23, starting on line 580.
 
 Model
 
 For the main experiment, the authors compare the results of two models: a ’luminance model’ that spatially pools mean luminance values, and an ’energy model’ that spatially pools energy calculated from a multi-scale pyramid decomposition. They show that these models create metamers that result in different thresholds for human performance, and therefore different critical scaling parameters, with the basic luminance pooling model producing a scaling factor 1/4 that of the energy model. While this is certain to be true, due to the luminance model being so much simpler, the motivation for the simple luminance-based model as a comparison is unclear.
 
 The use of simple models is now addressed on pg. 3, starting on line 98, as well as the sentence starting on pg. 4 line 148: the luminance model is intended as the simplest possible pooling model.
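 To illustrate what 'the simplest possible pooling model' can mean, here is a deliberately reduced, one-dimensional sketch. It is not the authors' implementation: the manuscript's model operates on two-dimensional stimuli, and the function names `pooling_windows` and `luminance_representation`, the non-overlapping hard-edged windows, and the pixel-shuffling shortcut are all assumptions made for this example. The only ideas carried over from the text are that the model pools mean luminance and that window size grows in proportion to eccentricity, with the proportionality constant called the scaling.

```python
import numpy as np

def pooling_windows(n_pixels, scaling, min_width=1):
    """Tile a 1-D 'radius' with non-overlapping windows whose width is
    proportional to eccentricity (distance in pixels from fixation at index 0)."""
    windows = []
    start = 0
    while start < n_pixels:
        width = max(min_width, int(round(scaling * start)))
        stop = min(n_pixels, start + width)
        windows.append((start, stop))
        start = stop
    return windows

def luminance_representation(signal, windows):
    """Hypothetical luminance-model response: mean luminance in each window."""
    return np.array([signal[a:b].mean() for a, b in windows])

rng = np.random.default_rng(0)
target = rng.uniform(0.0, 1.0, size=256)
windows = pooling_windows(target.size, scaling=0.25)

# Shortcut for building a model metamer: shuffling pixels inside each window
# changes the stimulus but leaves every window mean unchanged. This stands in
# for, and is much cruder than, an optimization-based synthesis procedure.
metamer = target.copy()
for a, b in windows:
    metamer[a:b] = rng.permutation(metamer[a:b])

print("number of windows:", len(windows))
print("stimuli identical:", np.array_equal(target, metamer))
print("representations match:",
      np.allclose(luminance_representation(target, windows),
                  luminance_representation(metamer, windows)))
```

 In this sketch, a larger scaling value yields fewer and wider windows, so the model discards more information and its metamers are free to differ more from the target.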
 
 The authors claim that this luminance model captures the response of retinal ganglion cells, often modeled as a center-surround operation (Rodieck, 1964). I am unclear in what aspect(s) the authors claim these center-surround neurons mimic a simple mean luminance, especially in the context of evidence supporting a much more complex role of RGCs in vision (Atick & Redlich, 1992). Why do the authors not compare the energy model to a model that captures center-surround responses instead? Do the authors mean to claim that the luminance model captures only the pooling aspects of an RGC model? This is particularly confusing as Figures 6 and 9 show the luminance and energy models for original vs synth aligning with the scaling of Midget and Parasol RGCs, respectively. These claims should be more clearly stated, and citations included to motivate this. Similarly, with the energy model, the physiological evidence is very loosely connected to the model discussed.
 
 We have removed the bars showing potential scaling values measured by electrophysiology in the primate visual system and attempted to clarify our language around the relationship between these models and physiology. Our metamer models are only loosely connected to the physiology, and we’ve decided in revision not to imply any direct connection between the model parameters and physiological measurements. The models should instead be understood as loosely inspired by physiology, but not as a tool to localize the representation (as was done in the Freeman paper).
 
 The physiological scaling values are still used as the mean of the priors on the critical scaling value for model fitting, as described on pg. 27, starting on line 698.
 
 Prior Work:
 
 While the explorations in this paper clearly have value, it does not present any particularly groundbreaking results, and those reported are consistent with previous literature. The explorations around critical eccentricity measurement have been done for texture models (Figure 11) in multiple papers (Freeman 2011, Wallis 2019, Balas 2009). In particular, Freeman 2011 demonstrated that simpler models, representing measurements presumed to occur earlier in visual processing, need smaller pooling regions to achieve metamerism. This work’s measurements for the simpler models tested here are consistent with those results, though the model details are different. In addition, Brown 2023 (which is miscited) also used an extended field of view (though not as large as in this work). Both Brown 2023 and Wallis 2019 performed an exploration of the effect of the target image. Also, much of the more recent previous work uses color images, while the authors’ exploration is only done for greyscale.
 
 We were pleased to find consistency of our results with previous studies, given the (many) differences in stimuli and experimental conditions (especially viewing angle), while also extending to new results with the luminance model, and the effects of initialization. Note that only one of the previous studies (Freeman and Simoncelli, 2011) used a pooled spectral energy model. Moreover, of the previous studies, only one (Brown et al., 2023) used color images (we have corrected that citation - thanks for catching the error).
 
 Discussion of Prior Work:
 
 The prior work on testing metamerism between original vs. synthesized and synthesized vs. synthesized images is presented in a misleading way. Wallis et al.’s prior work on this should not be a minor remark in the post-experiment discussion. Rather, it was surely a motivation for the experiment. The text should make this clear; a discussion of Wallis et al. should appear at the start of that section. The authors similarly cite much of the most relevant literature in this area as a minor remark at the end of the introduction (P3L72).
 
 The large differences we observed between comparison types (original vs synthesized, compared to synthesized vs synthesized) surprised us. Understanding such differences was not a primary motivation for the work, but it is certainly an important component of our results. In the introduction, we thought it best to lay out the basic logic of the metamer paradigm for foveated vision before mentioning the complications that are introduced in both the Wallis and Brown papers (paragraph beginning p. 3, line 109). Our results confirm and bolster the results of both of those earlier works, which are now discussed more fully in the Introduction (lines 109 and following).
 
 White Noise: The authors make an analogy to the inability of humans to distinguish samples of white noise. It is unclear, however, that human difficulty distinguishing samples of white noise is a perceptual issue; it could instead be due to cognitive/memory limitations. If one concentrates on an individual patch one can usually tell apart two samples. The authors should provide support for these difficulties emerging from perceptual limitations, discuss the possibility that the limitations are more cognitive, or employ a different analogy.
 
 We now note the possibility of cognitive limits on pg. 8, starting on line 243, as well as pg. 22, line 571. The ability of observers to distinguish samples of white noise is highly dependent on display conditions. A small patch of noise (i.e., large pixels, not too many) can be distinguished, but a larger patch cannot, especially when presented in the periphery. This is more generally true for textures (as shown in Ziemba and Simoncelli (2021)). Samples of white noise at the resolution used in our study are indistinguishable.
 
 Relatedly, in Figure 14, the authors do not explain why the white noise seeds would be more likely to produce syntheses that end up in different human equivalence classes.
 
 In figure 14, we claim that white noise seeds are more likely to end up in the same human equivalence classes than natural image seeds. The explanation as to why we think this may be the case is now addressed on pg. 19, starting on line 423.
 
 It would be nice to see the effect of pink noise seeds, which mirror the power spectrum of natural images, but do not contain the same structure as natural images - this may address the artifacts noted in Figure 9b.
 
 The lack of pink noise seeds is now addressed on pg. 19, starting on line 429.
 
 Finally, the authors note high-frequency artifacts in Figure 4 & P5L135, that remain after syntheses from the luminance model. They hypothesize that this is due to a lack of constraints on frequencies above that defined by the pooling region size. Could these be addressed with a white noise image seed that is pre-blurred with a low pass filter removing the frequencies above the spatial frequency constrained at the given eccentricity?
 
 The explanation for this is similar to the lack of pink noise seeds in the previous point: the goal of metamer synthesis is model testing, and so for a given model, we want to find model metamers that result in the smallest possible critical scaling value. Taking white noise seed images and blurring them will almost certainly remove the high frequencies visible in luminance metamers in figure 4 and thus result in a larger critical scaling value, as the reviewer points out. However, the logic of the experiments requires finding the smallest critical scaling value, and so these model metamers would be uninformative. In an early stage of the project, we did indeed synthesize model metamers using pink noise seeds, and observed that the high frequency artifacts were less prominent.
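 For readers unfamiliar with the seed types under discussion, the following sketch generates a white noise image and a pink noise image and compares how much of their power lies at high spatial frequencies. It is an illustration under stated assumptions, not the authors' pilot code: the function names `pink_noise_image` and `high_frequency_fraction`, the 128-pixel size, the amplitude exponent of 1, and the 0.25 cycles-per-pixel cutoff are all choices made for this example.

```python
import numpy as np

def pink_noise_image(size, rng, exponent=1.0):
    """Shape white Gaussian noise so its amplitude spectrum falls as 1/f**exponent."""
    white = rng.standard_normal((size, size))
    spectrum = np.fft.fft2(white)
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0          # placeholder to avoid dividing by zero at DC
    shaped = spectrum / radius ** exponent
    shaped[0, 0] = 0.0          # remove the DC term (zero-mean image)
    image = np.real(np.fft.ifft2(shaped))
    image -= image.min()        # rescale to the range [0, 1]
    image /= image.max()
    return image

def high_frequency_fraction(image, cutoff=0.25):
    """Fraction of (mean-removed) power above `cutoff` cycles per pixel."""
    power = np.abs(np.fft.fft2(image - image.mean())) ** 2
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return power[radius > cutoff].sum() / power.sum()

rng = np.random.default_rng(0)
white_seed = rng.standard_normal((128, 128))
pink_seed = pink_noise_image(128, rng)

print("white noise high-frequency fraction:", high_frequency_fraction(white_seed))
print("pink noise high-frequency fraction: ", high_frequency_fraction(pink_seed))
```

 A seed with less high-frequency content starts the synthesis with less energy at frequencies that a coarse pooling model does not constrain, which is one way to read the observation that artifacts were less prominent with pink noise seeds.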
 
 Schematic of metamerism: Figures 1, 2, 12, and 13 show a visual schematic of the state space of images, and their relationship to both model and human metamers. This is depicted as a Voronoi diagram, with individual images near the center of each shape, and other images that fall at different locations within the same cell producing the same human visual system response. I felt this conceptualization was helpful. However, implicitly it seems to make a distinction between metamerism and JND (just noticeable difference). I felt this would be better made explicit. In the case of JND, neighboring points, despite having different visual system responses, might not be distinguishable to a human observer.
 
 Thanks for noting this – in general, metamers are subthreshold, and for the purpose of the diagram, we had to discretize the space showing metameric regions (Voronoi regions) around a set of stimuli. We’ve rewritten the captions to explain this better. We address the binary subthreshold nature of the metamer paradigm in the discussion section (pg. 19, line 438).
 
 In these diagrams and throughout the paper, the phrase ’visual stimulus’ rather than ’image’ would improve clarity, because the location of the stimulus in relation to the fovea matters whereas the image can be interpreted as the pixels displayed on the computer.
 
 We agree and have tried to make this change, describing this choice on pg. 3 line 73.
 
 Other
 
 The authors show good reproducibility practices with links to relevant code, datasets, and figures.
 
 Reviewer #1 (Recommendations For The Authors):
 
 In its current form, I found the introduction to be too cursory. I felt that the article would benefit from a clearer motivation for the two models that are considered as the reader is left unclear why these particular models are of special scientific significance. The luminance model is intended to capture some aspects of retinal ganglion cells response characteristics and the spectral energy model is intended to capture some aspects of the primary visual cortex. However, one can easily imagine models that include the pooling of other kinds of features, and it would be helpful to get an idea of why these are not considered. Which aspects of processing in the retina and V1 are being considered and which are being left out, and why? Why not consider representations that capture even higher-order statistical structure than those covered by the spectral energy model (or even semantics)? I think a bit of rewriting with this in mind could improve the introduction.
 
 Along similar lines, I would have appreciated having the logic of the study explained more explicitly and didactically: which overarching research question is being asked, how it is operationalised in the models and experiments, and what are the predictions of the different models. Figures 2 and 3 are certainly helpful, but I felt further explanations would have made it easier for the reader to follow. Throughout, the writing could be improved by a careful re-reading with a view to making it easier to understand. For example, where results are presented, a sentence or two expanding on the implications would be helpful.
 
 I think the authors could also be more explicit about the assumptions they make. While these are obviously (tacitly) included in the description of the models themselves, it would be helpful to state them more openly. To give one example, when introducing the notion of critical scaling, on p.6 the authors state as if it is a self-evident fact that "metamers can be achieved with windows whose size is matched to that of the underlying visual neurons". This presumably is true only under particular conditions, or when specific assumptions about readout from populations of neurons are invoked. It would be good to identify and state such assumptions more directly (this is partly covered in the Discussion section ’The linking proposition underlying the metamer paradigm’, but this should be anticipated or moved earlier in the text).
 
 We agree that our introduction was too cursory and have reworked it. We have also backed off of the direct comparison to physiology and clarified that we chose these two as the simplest possible pooling models. We have also added sentences at the end of each result section attempting to summarize the implication (before discussing them fully in the discussion). Hopefully the logic and assumptions are now clearer.
 
 There are also some findings that warrant a more extensive discussion. For example, what is the broader implication of the finding that original vs. synthesised and synthesised vs. synthesised comparisons exhibit very different scaling values? Does this tell us something about internal visual representations, or is it simply capturing something about the stimuli?
 
 We believe this difference is a result of the stimuli that are used in the experiment and thus the synthesis procedure itself, which interacts with the model’s pooled image feature. We have attempted to update the relevant figures and discussions to clarify this, in the sections starting on pg 17 line 396 and pg. 19 line 417.
 
 At some points in the paper, a third model (’texture model’) creeps into the discussion, without much explanation. I assume that this refers to models that consider joint (rather than marginal) statistics of wavelet responses, as in the famous Portilla & Simoncelli texture model. However, it would be helpful to the reader if the authors could explain this.
 
 Addressed on pg. 3, starting on line 94.
 
 Minor corrections.
 
 Caption of Figure 3: ’top’ and ’bottom’ should be ’left’ and ’right’
 
 Line 177: ’smallest tested scaling values tested’. Remove one instance of ’tested’
 
 Line 212: ’the images-specific psychometric functions’ -> ’image-specific’
 
 Line 215: ’cloud-like pink noise’. It’s not literally pink noise, so I would drop this.
 
 Line 236: ’Importantly, these results cannot be predicted from the model, which gives no specific insight as to why some pairs are more discriminable than others’. The authors should specify what we do learn from the model if it fails to provide insight into why some image pairs are more discriminable than others.
 
 Figure 9: it might be helpful to include small insets with the ’highway’ and ’tiles’ source images to aid the reader in understanding how the images in 9B were generated.
 
 Table 1 placement should be after it is first referred to on line 258.
 
 In the Discussion section "Why does critical scaling depend on the comparison being performed", it would be helpful to consider the case where the two model metamers *are* distinguishable from each other even though each is indistinguishable from the target image. I would assume that this is possible (e.g., if the target image is at the midpoint between the two model images in image space and each of the stimuli is just below 1 JND away from the target). Or is this not possible for some reason?
 
 Regarding line 236: this specific line has been removed, and the discussion about this issue has all been consolidated in the final section of the discussion, starting on pg. 19 line 438.
 
 Regarding the final comment: this is addressed in the paragraph starting on pg. 16 line 386. To expand upon that: the situation laid out by the reviewer is not possible in our conceptualization, in which metamerism is transitive and image discriminability is binary. In order to investigate situations like the one laid out by the reviewer, one needs models whose representations have metric properties, i.e., which allow you to measure and reason about perceptual distance, which we refer to in the paragraph starting on pg. 20 line 460. We also note that this situation has not been observed in this or any other pooling model metamer study that we are aware of. All other minor changes have been addressed.
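
 The contrast drawn in this response can be made concrete with a small worked example. The one-dimensional perceptual axis, the coordinates, and the 1 JND threshold below are illustrative assumptions, not quantities from the manuscript; M(·) stands for a generic model representation.

```latex
% Hypothetical 1-D perceptual coordinates in JND units; threshold = 1 JND.
\[
p(A) = -0.9, \qquad p(T) = 0, \qquad p(B) = +0.9
\]
\[
|p(A) - p(T)| = 0.9 < 1, \qquad |p(B) - p(T)| = 0.9 < 1, \qquad |p(A) - p(B)| = 1.8 > 1
\]
% Indistinguishability defined by a distance threshold is therefore not transitive.
% If metamerism is instead defined by equality of model responses,
\[
M(A) = M(T) \ \text{and} \ M(T) = M(B) \;\Longrightarrow\; M(A) = M(B),
\]
% it is an equivalence relation and hence transitive.
```

 Under the first definition the reviewer's scenario can occur; under the second it cannot, which is why reasoning about it requires a representation with metric properties.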
 
 Reviewer #2 (Recommendations For The Authors):
 
 Original image T should be marked in the Voronoi diagrams.
 
 Brown et al. is miscited as 2021; it should be ACM Transactions on Applied Perception, 2023.
 
 Figure 3 caption: models are left and right, not top and bottom.
 
 Thanks, all of the above have been addressed.
 
 References
 
 Brown R, Dutell V, Walter B, Rosenholtz R, Shirley P, McGuire M, Luebke D. Efficient Dataflow Modeling of Peripheral Encoding in the Human Visual System. ACM Transactions on Applied Perception. 2023 Jan; 20(1):1–22. http://dx.doi.org/10.1145/3564605, doi: 10.1145/3564605.
 
 Freeman J, Simoncelli EP. Metamers of the ventral stream. Nature Neuroscience. 2011 aug; 14(9):1195–1201. doi: 10.1038/nn.2889.
 
 Ziemba CM, Simoncelli EP. Opposing Effects of Selectivity and Invariance in Peripheral Vision. Nature Communications. 2021 jul; 12(1). https://doi.org/10.1038/s41467-021-24880-5, doi: 10.1038/s41467-021-24880-5.
 
 AuthorResponse

URL

biorxiv.org/content/10.1101/2023.05.18.541306v7
www.biorxiv.org www.biorxiv.org

Brain-wide arousal signals are segregated from movement planning in the superior colliculus

5
1. Public_Reviews 14 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This study presents a valuable finding relating to how the state of arousal is represented within the superior colliculus, a principal visuo-oculomotor structure. The main conclusion that the representation of arousal is segregated, and thus influences visual activity but not motor output, is incompletely supported by the evidence, but could be stronger if a specific concern relating to an alternative explanation for the dichotomy was addressed. The work will be of interest to sensory, motor, and cognitive neuroscientists.
 
 Summary
2. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 Johnston and Smith used linear electrode arrays to record from small populations of neurons in the superior colliculus (SC) of monkeys performing a memory-guided saccade (MGS) task. Dimensionality reduction (PCA) was used to reveal low-dimensional subspaces of population activity reflecting the slow drift of neuronal signals during the delay period across a recording session (similar to what they reported for parts of cortex: Cowley et al., 2020). This SC drift was correlated with a similar slow-drift subspace recorded from the prefrontal cortex, and both slow-drift subspaces tended to be associated with changes in arousal (pupil size). These relationships were driven primarily by neurons in superficial layers of the SC, where saccade sensitivity/selectivity is typically reduced. Accordingly, delay-period modulations of both spiking activity and pupil size were independent of saccade-related activity, which was most prevalent in deeper layers of the SC. The authors suggest that these findings provide evidence of a separation of arousal- and motor-related signals. The analysis techniques expand upon the group's previous work and provide useful insight into the power of large-scale neural recordings paired with dimensionality reduction. This is particularly important with the advent of recording technologies which allow for the measurement of spiking activity across hundreds of neurons simultaneously. Together, these results provide a useful framework for comparing how different populations encode signals related to cognition, arousal, and motor output in potentially different subspaces.
 
 Comments on revised manuscript:
 
 The authors have done a very good job of responding to all of the reviewers' concerns.
 
 Review 1
3. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 Neurons in motor-related areas have increasingly been shown to also carry other, non-motoric signals. This creates the problem of avoiding interference between the motor and non-motor-related signals. This is a significant problem that likely affects many brain areas. The specific example studied here is interference between saccade-related activity and slow-changing arousal signals in the superior colliculus. The authors identify neuronal activity related to saccades and arousal. Identifying saccade-related activity is straightforward, but arousal-related activity is harder to identify. The authors first identify a potential neuronal correlate of arousal using PCA to identify a component in the population activity corresponding to slow drift over the recording session. Next, they link this component to arousal by showing that the component is present across different brain areas (SC and PFC), and that it is correlated with pupil size, an external marker of arousal. Having identified an arousal-related component in SC, the authors show next that SC neurons with strong motor-related activity are less strongly affected by this arousal component (both SC and PFC). Lastly, they show that SC population activity patterns related to saccades and pupil size form orthogonal subspaces in the SC population.
 
 Strengths:
 
 A great strength of this research is the clear description of the problem, its relationship with the performed analysis and the interpretation of the results. The paper is very well written and easy to follow. An additional strength is the use of fairly sophisticated analysis using population activity.
 
 Weaknesses:
 
 (1) The greatest weakness in the present research is the fact that arousal is a functionally less important non-motoric variable. The authors themselves introduce the problem with a discussion of attention, which is without any doubt the most important cognitive process that needs to be functionally isolated from oculomotor processes. Given this introduction, one cannot help but wonder, why the authors did not design an experiment, in which spatial attention and oculomotor control are differentiated. Absent such an experiment, the authors should spend more time on explaining the importance of arousal and how it could interfere with oculomotor behavior.
 
 (2) In this context, it is particularly puzzling that one actually would expect effects of arousal on oculomotor behavior. Specifically, saccade reaction time, accuracy, and speed could be influenced by arousal. The authors should include an analysis of such effects. They should also discuss the absence or presence of such effects and how they affect their other results.
 
 (3) The authors use the analysis shown in Figure 6D to argue that across recording sessions the activity components capturing variance in pupil size and saccade tuning are uncorrelated. However, the distribution (green) seems to be non-uniform with a peak at very low and very high correlation specifically. The authors should test if such an interpretation is correct. If yes, where are the low and high correlations respectively? Are there potentially two functional areas in SC?
 
 Comments on revised manuscript:
 
 I remain somewhat concerned that the authors jump immediately into an analysis of the 'arousal-related' effects on SC activity. Before that, I would like to see a more detailed discussion justifying the use of pupil size alone (i.e., w/o other indicators such as RT) as indicative of fluctuations in general arousal that are causal to concomitant changes in SC activity. Instead, in its current form, the authors find changes in SC activity and describe them immediately as 'arousal-related'.
 
 Other than this conceptual issue, I do not have major problems with the analysis per se.
 
 Review 2
4. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 This study looked at slow changes in neuronal activity (on the order of minutes to hours) in the superior colliculus (SC) and prefrontal cortex (PFC) of two monkeys. They found that SC activity shows slow drift in neuronal activity like in the cortex. They then computed a motor index in SC neurons. By definition, this index is low if the neuron has stronger visual responses than motor responses, and it is high if the neuron has weaker visual responses and stronger motor responses. The authors found that the slow drift in neuronal activity was more prevalent in the low motor index SC neurons and less prevalent in the high motor index neurons. In addition, the authors measured pupil diameter and found it to correlate with slow drifts in neuronal activity, but only in the neurons with lower motor index of the SC. They concluded that arousal signals affecting slow drifts in neuronal modulations are brain-wide. They also concluded that these signals are not present in the deepest SC layers, and they interpreted this to mean that this minimizes the impact of arousal on unwanted eye movements.
 
 Strengths:
 
 The paper is clear and well-written.
 
 Showing slow drifts in the SC activity is important to demonstrate that cortical slow drifts could be brain-wide.
 
 Weaknesses:
 
 The authors find that the SC cells with the low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual sensitivity. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in the most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC.
 
 Of course, the general conclusion is that the motor neurons will not have the arousal signal. It's just the interpretation that is different in the sense that the lack of the arousal signal is due to a lack of visual sensitivity in the motor neurons.
 
 I think that it is important to consider the alternative caveat of different amounts of light entering the system. Changes in light level caused by pupil diameter variations can be quite large. Please also note that I do not mean the luminance transient associated with the target onset. I mean the luminance of the gray display. It is a source of light. If the pupil diameter changes, then the amount of light reaching the visually sensitive neurons also changes.
 
 Comments on revised manuscript:
 
 The authors have addressed my first primary comment. For the light comment, I'm still not sure they addressed it. At the very least, they should explicitly state the possibility that the amount of light entering from the gray background can matter greatly, and it is not resolved by simply changing the analysis interval to the baseline pre-stimulus epoch. I provide more clear details below:
 
 In line 194 of the redlined version of the article (in the Introduction), the citation to Baumann et al., PNAS, 2023 is missing near the citation of Jagadisan and Gandhi, 2022. Besides replicating Jagadisan and Gandhi, 2022, this other study actually showed that the subspaces for the visual and motor epochs are orthogonal to each other.
 
 Line 683 (and around) of the redlined version of the article (in the Results): I'm very confused here. When I mentioned visual modulation by changed pupil diameter, I did not mean the transient changes associated with the brief onset of the cue in the memory-guided saccade task. I meant the gray background of the display itself. This is a strong source of light. If the pupil diameter changes across trials, then the amount of light entering the eye also changes from the gray background. Thus, visually-responsive neurons will have different amounts of light driving them. This will also happen in the baseline interval containing only a fixation spot. The arguments made by the authors here do not address this point at all. So, please modify the text to explicitly state the possibility that the global luminance of the display (as filtered by the pupil diameter) alters the amount of light driving the visually-responsive neurons and could contribute to the higher effects seen in the more visual neurons.
 
 The figures (everywhere, including the responses to reviewers) are very low resolution and all equations in methods are missing.
 
 I'm very confused by Fig. 2 - supplement 2. Panel B shows a firing rate burst aligned to *microsaccade* onset. Does that mean you were in the foveal SC? I.e., how can neurons have a motor burst to the target of the memory-guided saccade and also for microsaccades? And which microsaccade directions caused such a burst? And what does it mean to compute the motor index and spike count for microsaccades in panel C? If you were in the proper SC location for the saccade target, then shouldn't you *not* get any microsaccade-related burst at all? This is very confusing to me and needs to be clarified.
 
 Review 3
5. Public_Reviews 14 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public Review):
 
 (1) The authors make fairly strong claims that "arousal-related fluctuations are isolated from neurons in the deep layers of the SC" (emphasis added). This conclusion is based on comparisons between a "slow drift axis", a low-dimensional representation of neuronal drift, and other measures of arousal (Figures 2C, 3) and motor output sensitivity (Figures 2B, 3B). However, the metrics used to compare the slow-drift axis and motor activity were computed during separate task epochs: the delay period (600-1100 ms) and a perisaccade epoch (25 ms before and after saccade initiation), respectively. As the authors reference, deep-layer SC neurons are typically active only around the time of a saccade. Therefore, it is not clear if the lack of arousal-related modulations reported for deep-layer SC neurons is because those neurons are truly insensitive to those modulations, or if the modulations were not apparent because they were assessed in an epoch in which the neurons were not active. A potentially more valuable comparison would be to calculate a slow-drift axis aligned to saccade onset.
 
 The reviewer makes an important point that the calculation of an axis can depend critically on the time window of neuronal response. We find when considering this that the slow drift axis is less sensitive to this issue because it is calculated on time-averaged activity over multiple trials. In previous work we found that slow drift calculated on the stimulus evoked response in V4 was very well aligned to slow drift calculated on pre-stimulus spontaneous activity (Cowley et al, Neuron, 2020, Supplemental Figure 3A and 3B). To address this issue in the present data, we compared the axis computed for an example session for neural activity during the delay period and neural activity aligned to saccade onset. As shown in the new Figure 2 – figure supplement 1 in the revised manuscript, we found a similar lack of arousal-related modulations for deep-layer SC neurons when slow drift was computed using the saccade epoch (25ms before to 25ms after the onset of the saccade). Figure 2 – figure supplement 1A shows loadings for the SC slow drift axis when it was computed using spiking responses during the delay period (as in the main manuscript analysis). In contrast, Figure 2 – figure supplement 1B shows loadings from the same session when the SC slow drift axis was computed using spiking responses during the saccade epoch. The plots are highly similar and in both cases the loadings were weaker for neurons recorded from channels at the bottom of the probe which have a higher motor index. Finally, we found that projections onto the SC slow drift axis for this session were strongly correlated when the slow drift axis was computed using spiking responses during the delay period and the saccade epoch (r = 0.66, p < 0.001, Figure 1C). Taken together, these results suggest that arousal-related modulations are less evident in deep-layer SC neurons irrespective of whether slow drift was computed during the delay or saccade epoch (see also Public Reviews, Reviewer 1, Point 2).
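
 For readers unfamiliar with this kind of comparison, the following is a minimal sketch of how a slow drift axis could be estimated from two task epochs and how the resulting projections could be correlated. It is not the authors' pipeline: the simulated data, the bin size, the `slow_drift_axis` helper, and the choice of four weakly drifting "high motor index" channels are all assumptions made for illustration.

```python
import numpy as np

def slow_drift_axis(counts, bin_size):
    """counts: (n_trials, n_neurons) spike counts from one task epoch.
    Returns the first principal axis of the bin-averaged activity and
    the projection of each time bin onto that axis."""
    n_bins = counts.shape[0] // bin_size
    trimmed = counts[: n_bins * bin_size]
    # Average over blocks of consecutive trials to suppress fast variability.
    binned = trimmed.reshape(n_bins, bin_size, -1).mean(axis=1)
    centered = binned - binned.mean(axis=0)
    # Rows of vt are principal axes, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    return axis, centered @ axis

rng = np.random.default_rng(0)
n_trials, n_neurons = 2000, 16
drift = np.sin(np.linspace(0, 3 * np.pi, n_trials))  # shared slow fluctuation
weights = rng.uniform(0.5, 1.5, n_neurons)
weights[-4:] = 0.05  # hypothetical channels that barely follow the drift

def simulate_epoch(base_rate):
    # Poisson counts whose mean follows the shared drift (illustrative only).
    rate = base_rate + 2.0 * np.outer(drift, weights)
    return rng.poisson(np.clip(rate, 0.1, None))

delay_axis, delay_proj = slow_drift_axis(simulate_epoch(8.0), bin_size=100)
sac_axis, sac_proj = slow_drift_axis(simulate_epoch(12.0), bin_size=100)

# The sign of a principal axis is arbitrary, so compare |r| across epochs.
r = np.corrcoef(delay_proj, sac_proj)[0, 1]
print(abs(r))
```

 In this toy setting the two epochs recover nearly the same time course, and the channels with weak drift receive small loadings, which is the qualitative pattern the response describes.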
 
 (2) More generally, arousal-related signals may persist throughout multiple different epochs of the task. It would be worthwhile to determine whether similar "slow-drift" dynamics are observed for baseline, sensory-evoked, and saccade-related activity. Although it may not be possible to examine pupil responses during a saccade, there may be systematic relationships between baseline and evoked responses.
 
 Similar to the point above, slow drift dynamics tend to be similar across different response epochs because they are averaged across many trials and seem to tap into responsivity trends that are robust across epochs. As shown in Author response image 1 below, and in Figure 2 – figure supplement 1 in the revised manuscript, similar dynamics were observed when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade epochs. We did not investigate differences between baseline and evoked pupil responses in the current paper. However, these effects were characterized in one of our previous papers that focused exclusively on the relationship between slow drift and eye-related metrics (Johnston et al., 2022, Cereb. Cortex, Figure 6). In this previous work, we found a negative correlation between baseline and evoked pupil size. Both variables were significantly correlated with slow drift, the only difference being the sign of the correlation.
 
 Author response image 1.
 
 (A-C) Dynamics of slow drift for three example sessions when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade epochs. Baseline = 100ms before the onset of the target stimulus; Delay = 600 to 1100ms after the offset of the target stimulus; Stim = 25ms to 125ms after the onset of the target stimulus; Sac = 25ms before to 25ms after the onset of the saccade.
 
 Johnston R, Snyder AC, Khanna SB, Issar D, Smith MA (2022) The eyes reflect an internal cognitive state hidden in the population activity of cortical neurons. Cereb Cortex 32:3331–3346.
 
 (3) The relationships between changes in SC activity and pupil size are quite small (Figures 2C & 5C). Although the distribution across sessions (Figure 2C) is greater than chance, they are nearly 1/4 of the size compared to the PFC-SC axis comparisons. Likewise, the distribution of r² values relating pupil size and spiking activity directly (Figure 5) is quite low. We remain skeptical that these drifts are truly due to arousal and cannot be accounted for by other factors. For example, does the relationship persist if accounting for a very simple, monotonic (e.g., linear) drift in pupil size and overall firing rate over the course of an individual session?
 
 Firstly, it is important to note that the strength of the relationship between projections onto the SC slow drift axis and pupil size (r² = 0.06) is within the range reported by Joshi et al. (2016, Neuron, Figure 3). They investigated the median variance explained between the spiking responses of individual SC neurons and pupil size and found it to be approximately 0.02 across sessions. Secondly, our statistical approach of testing the actual distribution of r² values against a shuffled distribution was specifically designed to rule out the possibility that the relationship between SC spiking responses and pupil size occurred due to linear drifts. The shuffled distribution in Figure 2C of the main manuscript represents the variance that can be explained by one session’s slow drift correlated with another session’s pupil, which would contain effects that occurred due to linear drifts alone. That the actual proportion of variance explained was significantly greater than this distribution suggests that the relationship between projections onto the SC slow drift axis and pupil size reflects changes in arousal rather than other factors related to linear drifts.
 
 Joshi S, Li Y, Kalwani RM, Gold JI (2016) Relationships between Pupil Diameter and Neuronal Activity in the Locus Coeruleus, Colliculi, and Cingulate Cortex. Neuron 89:221–234.
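
 The cross-session shuffle control described in this response can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the synthetic sessions, the linear trend, the random-walk "arousal" term, and the `r_squared` helper (which truncates to the shorter session) are hypothetical.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation over the overlapping portion of two series."""
    n = min(len(x), len(y))  # sessions may differ in length
    return np.corrcoef(x[:n], y[:n])[0, 1] ** 2

rng = np.random.default_rng(1)
sessions = []
for _ in range(12):
    n_bins = int(rng.integers(40, 60))
    t = np.linspace(0.0, 1.0, n_bins)
    trend = t * rng.choice([-1.0, 1.0])                  # generic linear drift
    arousal = 0.3 * np.cumsum(rng.normal(size=n_bins))   # session-specific state
    slow_drift = trend + arousal + 0.2 * rng.normal(size=n_bins)
    pupil = trend + arousal + 0.2 * rng.normal(size=n_bins)
    sessions.append((slow_drift, pupil))

# Actual: each session's slow drift against its own pupil trace.
actual = [r_squared(d, p) for d, p in sessions]

# Shuffled: slow drift from one session against pupil from every other session.
# Any variance explained here comes from generic slow trends, not shared state.
shuffled = [
    r_squared(sessions[i][0], sessions[j][1])
    for i in range(len(sessions))
    for j in range(len(sessions))
    if i != j
]

print(np.median(actual), np.median(shuffled))
```

 The shuffled values are typically well above zero because slowly varying signals correlate spuriously, which is why the comparison is made against this distribution rather than against zero.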
 
 (4) It is not clear how the final analysis (Figure 6) contributes to the authors' conclusions. The authors perform PCA on: (i) residual spiking responses during the delay period binned according to pupil size, and (ii) spiking responses in the saccade epoch binned according to target location (i.e., the saccade tuning curve). The corresponding PCs are the spike-pupil axis and the saccade tuning axis, respectively. Unsurprisingly, the spike-pupil axis that captures variance associated with arousal (and removes variance associated with saccade direction) was not correlated with a saccade-tuning axis that captures variance associated with saccade direction and omits arousal. Had these measures been related it would imply a unique association between a neuron's preferred saccade direction and pupil control, which seems unlikely. The separation of these axes thus seems trivial and does not provide evidence of a "mechanism...in the SC to prevent arousal-related signals interfering with the motor output." It remains unknown whether, for example, arousal-related signals may impact trial-by-trial changes in neuronal gain near the time of a saccade, or alter saccade dynamics such as acceleration, precision, and reaction time.
 
 The reviewer makes a good point, and we agree that more evidence is needed to determine if the separation of the pupil size axis and saccade tuning axis is the mechanism through which cognitive and arousal-related signals can be intermixed in the SC. In the revised manuscript (lines 679-682), we have raised this as a possible explanation that necessitates further study rather than stating definitively that it is the exact mechanism through which these signals are kept separate. Our analysis here is similar to the one from Smoulder et al (2024, Neuron, Fig. 2F), in which the interactions between reward signals and target tuning in M1 were examined (and found to be orthogonal). While we agree with the reviewer that it may seem “trivial” for these axes to be orthogonal, it does not have to be so. If, for example, neural tuning curves shifted with changes in pupil size through gain changes that revealed tuning or affected tuning curve shape, there could be projections of the pupil axis onto the target tuning axis. Thus, while we agree with the reviewer that it appears sensible for these two axes to be orthogonal, our result is nonetheless a novel finding. We have edited the text in our revised manuscript, however, to make sure the nuance of this point is conveyed to the reader.
 
 Smoulder AL, Marino PJ, Oby ER, Snyder SE, Miyata H, Pavlovsky NP, Bishop WE, Yu BM, Chase SM, Batista AP. A neural basis of choking under pressure. Neuron. 2024 Oct 23;112(20):3424-33.
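
 To make the notion of axis alignment in this exchange concrete, here is a minimal sketch that extracts two axes by PCA and measures their alignment as a squared cosine, compared against randomly oriented axes. The simulated pupil-binned responses, the cosine tuning curves, and the helper names are assumptions for illustration and do not reproduce the analysis in Figure 6.

```python
import numpy as np

def first_pc(matrix):
    """matrix: (n_conditions, n_neurons). Returns the unit-norm first principal axis."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def alignment(u, v):
    """Squared cosine between two unit vectors: 0 is orthogonal, 1 is identical."""
    return float(np.dot(u, v) ** 2)

rng = np.random.default_rng(2)
n_neurons = 30

# Hypothetical responses binned by pupil size (8 bins): a per-neuron gain term.
gain = rng.normal(size=n_neurons)
pupil_binned = np.outer(np.linspace(-1.0, 1.0, 8), gain)
pupil_binned += 0.05 * rng.normal(size=pupil_binned.shape)

# Hypothetical saccade tuning curves (8 targets) with random preferred directions.
preferred = rng.uniform(0.0, 2.0 * np.pi, n_neurons)
targets = np.arange(8) * np.pi / 4
tuning = np.cos(targets[:, None] - preferred[None, :])
tuning += 0.05 * rng.normal(size=tuning.shape)

pupil_axis = first_pc(pupil_binned)
tuning_axis = first_pc(tuning)
observed = alignment(pupil_axis, tuning_axis)

# Chance level: alignment of the pupil axis with randomly oriented unit vectors.
random_axes = rng.normal(size=(5000, n_neurons))
random_axes /= np.linalg.norm(random_axes, axis=1, keepdims=True)
chance = (random_axes @ pupil_axis) ** 2

print(observed, chance.mean(), 1.0 / n_neurons)
```

 For random directions in an n-dimensional space the expected squared cosine is 1/n, so small alignment values are the default in high-dimensional populations; an informative result needs to be judged against a chance distribution of this kind.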
 
 Reviewer #2 (Public Review):
 
 (1) The greatest weakness in the present research is the fact that arousal is a functionally less important non-motoric variable. The authors themselves introduce the problem with a discussion of attention, which is without any doubt the most important cognitive process that needs to be functionally isolated from oculomotor processes. Given this introduction, one cannot help but wonder, why the authors did not design an experiment, in which spatial attention and oculomotor control are differentiated. Absent such an experiment, the authors should spend more time explaining the importance of arousal and how it could interfere with oculomotor behavior.
 
 Although attention does represent an important cognitive process, we did not design an experiment in which attention and oculomotor control are differentiated because attention does not appear to be related to slow drift. In our first paper that reported on this phenomenon, we investigated the effects of spatial attention on slow fluctuations in neural activity by cueing the monkeys to attend to a stimulus in the left or right visual field in a block-wise manner. Each block lasted ~20 minutes and we found that slow drift did not covary with the timing of cued blocks (see Figure 4A, Cowley et al., 2020, Neuron). Furthermore, there is a large body of work showing that arousal also impacts motor behavior leading to changes in a range of eye-related metrics (e.g., pupil size, microsaccade rate and saccadic reaction time - for review, see Di Stasi et al. 2013, Neurosci. Biobehav. Rev.). We also note that the terms attention and arousal are often used in nonspecific and overlapping ways in the literature, adding to some potential confusion here. Nonetheless, pupil-linked arousal is an important variable that impacts motor performance. This has now been stated clearly in the Introduction of the revised manuscript (lines 108-114) to address the reviewer’s concerns and highlight the importance of studying how precise fixation and eye movements are maintained even in the presence of signals related to ongoing changes in brain state.
 
 Cowley BR, Snyder AC, Acar K, Williamson RC, Yu BM, Smith MA (2020) Slow Drift of Neural Activity as a Signature of Impulsivity in Macaque Visual and Prefrontal Cortex. Neuron 108:551-567.e8.
 
 (2) In this context, it is particularly puzzling that one actually would expect effects of arousal on oculomotor behavior. Specifically, saccade reaction time, accuracy, and speed could be influenced by arousal. The authors should include an analysis of such effects. They should also discuss the absence or presence of such effects and how they affect their other results.
 
 As described above, several studies across species have demonstrated that arousal impacts motor behavior e.g., saccade reaction time, saccade velocity and microsaccade rate (for review, see Di Stasi et al. 2013, Neurosci. Biobehav. Rev.). This has been clarified in the Introduction of the revised manuscript to address the reviewer's concerns (lines 108-114). Our prior work (Johnston et al, Cerebral Cortex, 2022) shows that slow drift impacts several types of oculomotor behavior. Overall, these studies highlight the impact of arousal on eye movements as a robust effect, and support the present investigation into arousal and oculomotor control signals. While we agree reaction time, accuracy, and speed all can be influenced by arousal depending on task demands, the present study is focused on the connection between slow fluctuations in neural activity, linked to arousal, and different subpopulations of SC neurons.
 
 Di Stasi LL, Catena A, Cañas JJ, Macknik SL, Martinez-Conde S (2013) Saccadic velocity as an arousal index in naturalistic tasks. Neurosci Biobehav Rev 37:968–975.
 
 Johnston R, Snyder AC, Khanna SB, Issar D, Smith MA (2022) The eyes reflect an internal cognitive state hidden in the population activity of cortical neurons. Cereb Cortex 32:3331–3346.
 
 (3) The authors use the analysis shown in Figure 6D to argue that across recording sessions the activity components capturing variance in pupil size and saccade tuning are uncorrelated. However, the distribution (green) seems to be non-uniform with a peak at very low and very high correlation specifically. The authors should test if such an interpretation is correct. If yes, where are the low and high correlations respectively? Are there potentially two functional areas in SC?
 
 We agree with the reviewer that our actual data distribution was non-uniform. We examined individual sessions with high and low variance explained and did not find notable differences. One source of this variation has to do with session length. Longer sessions in principle should have a chance distribution of variance explained closer to zero because they contained more time bins. Given that we had no specific hypothesis for a non-uniform distribution, we have simply displayed the full distribution of values in our figure and the statistical result of a comparison to a shuffled distribution.
 
 Reviewer #3 (Public Review):
 
 (1) However, I am concerned about two main points: First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC? In other words, it seems important to show distributions of encountered neurons (regardless of the motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of Figure Supplement 1. I elaborate more on these points in the detailed comments below.
 
 The reviewer makes a good point about the efferent signals from SC. It is true that electrical thresholds are often lowest in intermediate layers, though deep layers do project to the oculomotor nuclei (Sparks, 1986; Sparks & Hartwich-Young, 1989) and often intermediate and deep layers are considered to function together to control eye movements (Wurtz & Albano, 1980). As suggested by the reviewer, we have edited the text throughout the manuscript to say that slow drift was less evident in SC neurons with a higher motor index, as well as included the above references and points about the intermediate and deep layers (Lines 73-81). Aside from the question of which layers of the SC function as the “motor output”, the reviewer raises a separate and important question: are our deep recordings still in SC? Here, we can say definitively that they are. We removed neurons if they did not exhibit elevated (above baseline) firing rates during the visual or saccade epochs of the MGS task (see Methods section on “Exclusion criteria”). All included neurons possessed a visual, visuomotor or motor response, consistent with the response properties of neurons in the SC. In addition, we found a number of neurons well above the bottom of the probe with strong motor responses and minimal loadings onto the slow drift axis (see Figure 2 – figure supplement 1A), consistent with the reviewer’s comment that intermediate layer neurons are tuned for movement and play a role in saccade production.
 
 Mohler CW, Wurtz RH. Organization of monkey superior colliculus: intermediate layer cells discharging before eye movements. Journal of neurophysiology. 1976 Jul 1;39(4):722-44.
 
 Sparks DL. Translation of sensory signals into commands for control of saccadic eye movements: role of primate superior colliculus. Physiol Rev. 1986 Jan;66(1):118-71. doi: 10.1152/physrev.1986.66.1.118. PMID: 3511480.
 
 Sparks DL, Hartwich-Young R. The deep layers of the superior colliculus. Reviews of oculomotor research. 1989 Jan 1;3:213-55.
 
 Wurtz RH, Albano JE. Visual-motor function of the primate superior colliculus. Annu Rev Neurosci. 1980;3:189-226. doi: 10.1146/annurev.ne.03.030180.001201. PMID: 6774653.
 
 (2) Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons simply do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC.
 
 The reviewer makes an important point about the SC’s visual responses. Neurons with a low motor index are, conversely, likely to have a stronger visual response index. However, we do not believe that changes in luminance can explain why the correlation between SC spiking response and pupil size is weaker for neurons with a lower motor index. Firstly, the changes in pupil size observed in the current paper and our previous work are slow and occur on a timescale of minutes (Cowley et al., 2020, Neuron) and are correlated with eye movement measures such as reaction time and microsaccade rate (Johnston et al., 2022, Cerebral Cortex). This is in stark contrast to luminance-evoked changes in pupil size that occur on a timescale of less than a second. Secondly, as shown in the new Figure 5 – figure supplement 1 in the revised manuscript, very similar results were found when SC spiking responses were correlated with pupil size during the baseline period, when only the fixation point was on the screen. Although the luminance of the small peripheral target stimulus can result in small luminance-evoked changes in pupil size, no changes in luminance occurred during the baseline period, which was defined as 100ms before the onset of the target stimulus. In Figure 2 – figure supplement 1 and Author response image 1 above, we show that slow drift is the same whether calculated on the baseline response, delay period, or peri-saccadic epoch. Thus, the measurement of slow drift is insensitive to the precise timing of the selection of both the window for the spiking response and the window for the pupil measurement. If luminance were the explanation for the slow changes in firing observed in visually responsive SC neurons, it would require those neurons to exhibit robust, sustained tuned responses to the small changes in retinal illuminance induced by the relatively small fluctuations in pupil size we observed from minute to minute. We are aware of no reports of such behavior in visually-responsive neurons in SC. We have included these analyses and this reasoning in the revised manuscript on lines 478-495.
 
 Reviewer#1 (Recommendations for the author):
 
 (1) It would be useful to provide line numbers in subsequent manuscripts for reviewers.
 
 Line numbers have been added in the revised version of the manuscript.
 
 (2) Page #6; last sentence: "...even impact processing at the early to mid stages of the visuomotor transformation, without leading to unwanted changes in motor output." I do not believe the authors have provided evidence that arousal levels were not associated with changes in motor output.
 
 As suggested by Reviewer 3 (see Public Reviews, Reviewer 3, Point 2), we have edited the text throughout the manuscript to say that slow drift was less evident in SC neurons with a higher motor index. This sentence in the revised manuscript now reads:
 
 “This provides a potential mechanism through which signals related to cognition and arousal can exist in the SC, and even impact processing at the early to mid stages of the visuomotor transformation, without leading to unwanted changes in SC neurons that are linked to saccade execution.”
 
 (3) Page #8; last paragraph: Although deep-layer SC neurons may not have been obtained during every recording session, a summary of the motor index scores observed along the probe across sessions would be useful to confirm their assumptions.
 
 See Author response image 2 below, which shows the motor index of each recorded SC neuron on the x-axis and session number on the y-axis. The points are colored according to the squared factor loading, which represents the variance explained between the response of a neuron and the slow drift axis (see Figure 3B of the main manuscript). You can see from this plot that neurons with a stronger component loading (shown in teal to yellow) typically have a lower motor index whereas the opposite is true for neurons with a weaker component loading (shown in dark blue).
 
 Author response image 2.
 
 Scatter plot showing the motor index of each recorded neuron along with the session number in which it was recorded. The points are colored according to the squared factor loading for each neuron along the slow drift axis. Note that loadings above 0.5 (33 data points in total) have been thresholded at 0.5 so that we could effectively use the color range to show all of the slow drift axis loadings.
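
 The thresholding mentioned in this caption can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `clip_for_display` and the example values are hypothetical, and only the 0.5 ceiling is taken from the caption.

```python
def clip_for_display(squared_loadings, ceiling=0.5):
    """Cap values at `ceiling` for color mapping and count how many were capped."""
    clipped = [min(value, ceiling) for value in squared_loadings]
    n_capped = sum(1 for value in squared_loadings if value > ceiling)
    return clipped, n_capped


# Hypothetical squared loadings for four neurons (not data from the study).
example = [0.05, 0.30, 0.62, 0.91]
display_values, n_capped = clip_for_display(example)
```

 Capping only the displayed values leaves the underlying loadings unchanged while keeping a few large values from compressing the color scale for the rest.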
 
 (4) Page #10; first paragraph: The authors should state the time window of the delay period used, since it may be distinct from the pupil analysis (first 200ms of delay).
 
 This has been stated in the revised version of the manuscript. The sentence now reads:
 
 “We first asked if arousal-related fluctuations are present in the SC. As in previous studies that recorded from neurons in the cortex (Cowley et al., 2020), we found that the mean spiking responses of individual SC neurons during the delay period (chosen at random on each trial from a uniform distribution spanning 600-1100ms, see Methods) fluctuated over the course of a session while the monkeys performed the MGS task (Figure 2A, left).”
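
 As a purely illustrative aside, drawing a per-trial delay from a uniform distribution over 600-1100ms could look like the Python sketch below. The function name `sample_delay_ms`, the fixed seed and the number of trials are assumptions made for the example and do not come from the manuscript.

```python
import random


def sample_delay_ms(rng, low_ms=600.0, high_ms=1100.0):
    """Draw one delay duration (in ms) uniformly between low_ms and high_ms."""
    return rng.uniform(low_ms, high_ms)


# A seeded generator keeps this example reproducible (the seed is arbitrary).
rng = random.Random(0)
delays = [sample_delay_ms(rng) for _ in range(1000)]
```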
 
 (5) Page #10; second paragraph: Extra period at the end of a sentence: " most variance in the data..".
 
 Fixed in the revised version of the manuscript.
 
 (6) Page #12: "between projections onto the SC slow drift axis and mean pupil size during the first 200ms of the delay period when a task-related pupil response could be observed." What criteria was used to determine whether a task-related pupil response was observed?
 
 This was chosen based on the results of a previous study in our lab that used the same memory-guided saccade task to investigate the relationship between slow drift and changes in baseline and evoked pupil size (see Johnston et al., 2022, Cereb. Cortex, Figure 6B). The period was chosen based on plotting the average pupil size aligned on different trial epochs. As we show in Figure 5-figure supplement 3 above, the pupil interactions with slow drift did not depend on the particular time window of the pupil we chose.
 
 (7) Page #14; Figure 2A: The axes for the individual channels are strangely floating and quite different from all other figures. Please label the channel in the figure legend that was used as an example of the projected values onto the slow drift axis.
 
 The figure has been changed in the revised version of the manuscript so that the tick mark denoting zero residual spikes per second is on the top layer of each plot. A scale bar was chosen instead of individual axes to reduce clutter in the figure as it was used to demonstrate how slow drift was computed. Residual spiking responses from all neurons were projected on the slow drift axis to generate the scatter plot in the bottom right-hand corner of Figure 2A. There is no single neuron to label.
 
 (8) Page #16: "These results demonstrate that even though arousal-related fluctuations are present in the SC, they are isolated from deep-layer neurons that elicit a strong saccadic response and presumably reside closer to the motor output." In line with our major comments, lack of arousal-related activity during the delay period is meaningless for deep-layer SC neurons that are generally inactive during this time. It does not imply that there is no arousal signal!
 
 Addressed in Public Reviews, Reviewer 1, Point 1 & 2. We found a similar lack of arousal-related modulations reported for deep-layer SC neurons when slow drift was computed using the saccade epoch (Figure 1 above). In addition, similar dynamics were observed when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade period (Figure 2).
 
 (9) Page #18: "These findings provide additional support for the hypothesis that arousal-related fluctuations are isolated from neurons in the deep layers of the SC." The same criticism from above applies.
 
 Addressed in Public Reviews, Reviewer 1, Point 1 & 2.
 
 (10) Page #20; paragraph 3: "Taken together, the findings outlined above..." Would be useful to be more specific when referring to "activity" ; e.g., "...these neurons did not exhibit large fluctuations in delay-period activity over time".
 
 This sentence has been changed in the revised manuscript in light of the reviewer’s comments. It now reads:
 
 “In addition to being more weakly correlated with pupil size, the spiking responses of these neurons did not exhibit large fluctuations over time (Figure 2), and when considering the neuronal population as a whole, explained less variance in the slow drift axis when it was computed using population activity in the SC (Figure 3) and PFC (Figure 4).”
 
 Reviewer #3 (Recommendations for the author):
 
 The paper is clear and well-written. However, I am concerned about two main points:
 
 (1) First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC. In other words, it seems important to show distributions of encountered neurons (regardless of motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of the figure supplement 1. I elaborate more on these points in the detailed comments below.
 
 Addressed in Public Reviews, Reviewer 3, Point 1.
 
 (2) Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons simply do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC.
 
 Addressed in Public Reviews, Reviewer 3, Point 2.
 
 (3) I think that a remedy to the first point above is to change the text to make it a bit more descriptive and less interpretive. For example, just say that the slow drifts were less evident among the neurons with high motor index.
 
 We thank the reviewer for this suggestion (see Public Reviews, Reviewer 3, Point 1).
 
 (4) For the second point, I think that it is important to consider the alternative caveat of different amounts of light entering the system. Changes in light level caused by pupil diameter variations can be quite large.
 
 We thank the reviewer for this suggestion (see Public Reviews, Reviewer 3, Point 2).
 
 (5) Line 31: I'm a bit underwhelmed by this kind of statement. i.e. we already know that cognitive processes and brain states do alter eye movements, so why is it "critical" that high precision fixation and eye movements are maintained? And, isn't the next sentence already nulling this idea of criticality because it does show that the brain state alters the SC neurons? In fact, cognitive processes are already known to be most prevalent in the intermediate and deep layers of the SC.
 
 It seems clear that while cognitive state does affect eye movements, it is desirable to have some separation between cognitive state and eye movement control. Covert attention, for instance, is precisely a situation where eye movement control is maintained to avoid overt saccades to the attended stimulus, and yet there are clear indications of attention’s impact on microsaccades and fixation. We stand by our statement that an important goal of vision is to have precise fixation and movements of the eye, and yet at the same time the eyes are subject to numerous influences by cognitive state.
 
 (6) Line 65: it is better to clarify that these are "functional layers" because there are actually more anatomical layers.
 
 We have edited this sentence in the revised version of the manuscript so that it now reads:
 
 “The role of these projections in the visuomotor transformation depends on the functional layer of the SC in which they terminate”.
 
 (7) Line 73: this makes it sound like only the deepest layers are topographically organized, which is not true. Also, as early as Mohler & Wurtz, 1972, it was suggested that the intermediate layers have the biggest impacts downstream of the SC. This is also consistent with electrical microstimulation current thresholds for evoking saccades from the SC.
 
 We have addressed the reviewers’ comments about the intermediate layers having the biggest impact downstream of the SC in Public Reviews, Reviewer 3, Point 1. Furthermore, line 73 has been changed in the revised manuscript so that it now reads:
 
 “As is the case for neurons in the superficial and intermediate layers, they [SC motor neurons] form a topographically organized map of visual space (White et al. 2017; Robinson 1972; Katnani and Gandhi 2011)”.
 
 (8) Line 100: there is an analogous literature regarding the question of why unwanted muscle contractions do not happen. Specifically, in the context of why SC visual bursts do not automatically cause saccades (which is a similar problem to the ones you mention about cognitive signals interfering by generating unwanted eye movements), both Jagadisan & Gandhi, Curr Bio, 2022 and Baumann et al, PNAS, 2023 also showed that SC population activity not only has different temporal structure (Jagadisan & Gandhi) but also occupy different subspaces (Baumann et al) under these two different conditions (visual burst versus saccade burst). This is conceptually similar to the idea that you are mentioning here with respect to arousal. So, it is worth it to mention these studies here and again in the discussion.
 
 We are grateful to the reviewer for these suggestions and have included text in the Introduction (Lines 125-128) and Discussion (Lines 678-682) of the revised manuscript along with the references cited above.
 
 (9) Line 147: as mentioned above, it is now generally accepted that there are quite a few "pure" motor neurons in the SC. This is consistent with what you find. E.g. Baumann et al., 2023. And, again see Mohler and Wurtz in the 1970's. So, I wonder how useful it is to go too much into this idea of the deeper motor neurons (e.g. the correlations in the other panels of the Figure 1 supplement).
 
 This is related to the reviewer’s comment that the output of the SC might be in the intermediate layers. This concern has been addressed in Public Reviews, Reviewer 3, Point 1.
 
 (10) Figure 1 should say where the RF was for the shown spike rasters. i.e. were these the same saccade target across trials? And where was that location relative to the RF? It would help also in the text to say whether the saccade was always to the RF center or whether you were randomizing the target location.
 
 We centered the array of saccade targets using the microstimulation-evoked eye movement for SC (see Methods section “Memory-guided saccade task”) to find the evoked eccentricity, and then used saccade targets with equal spacing of 45 degrees starting at zero (rightward saccade target). We did not do extensive RF mapping beyond this microstimulation centering. In Figure 1, the spike rasters are shown for a target that was visually identified to be within the neuron’s RF based on assessing responses to all 8 target angles. We have added information about this to the figure caption.
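
 The target layout described in this reply can be sketched in Python as follows. This is an illustration under stated assumptions and not the task code: the function name `target_positions`, the example eccentricity of 10 degrees and the counterclockwise ordering of angles are hypothetical. Only the eight targets, the 45-degree spacing and the rightward starting target come from the reply.

```python
import math


def target_positions(eccentricity_deg, n_targets=8, spacing_deg=45.0):
    """Return (angle_deg, x_deg, y_deg) for targets spaced evenly around fixation.

    Angle 0 is the rightward target. Angles are assumed to increase
    counterclockwise, which the reply does not specify.
    """
    positions = []
    for i in range(n_targets):
        angle = i * spacing_deg
        rad = math.radians(angle)
        x = eccentricity_deg * math.cos(rad)
        y = eccentricity_deg * math.sin(rad)
        positions.append((angle, x, y))
    return positions


# Hypothetical eccentricity; in the study it was set from the
# microstimulation-evoked eye movement.
positions = target_positions(10.0)
```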
 
 (11) Line 218: but were there changes in the eye movement statistics? For example, the slow drift eye movements during fixation? Or even the microsaccades?
 
 Addressed in Public Reviews, Reviewer 2, Point 2.
 
 (12) Line 248: shuffling what exactly? I think that more explanation would be needed here.
 
 Addressed in Public Reviews, Reviewer 1, Point 3.
 
 (13) Line 263: but isn't this reflecting a sensory transient in the pupil diameter, since the target just disappeared?
 
 Addressed in Public Reviews, Reviewer 3, Point 2.
 
 (14) Line 271: I suspect that slow drift eye movements (in between microsaccades) would show higher correlations. Not sure how well you can analyze those with a video-based eye tracker.
 
 We agree that fixational drift would be a worthwhile metric, but it is not one we have focused on here and to our knowledge does require higher precision tracking.
 
 (15) Line 286: again, see above about similar demonstrations with respect to the visual and motor burst intervals, which clearly cause the same problem (even stronger) as the one studied here.
 
 See reply, including Figure 2.
 
 (16) Line 330: again, I'm not sure deeper necessarily automatically means closer to the output. For example, current thresholds for evoked saccades grow higher as you go deeper. Maybe the authors can ask their colleague Neeraj Gandhi about this point specifically, just to be safe. Maybe the safest would be to remain descriptive about the data, and just say something like: arousal-related fluctuations were absent in our deepest recorded sites.
 
 Addressed in Public Reviews, Reviewer 3, Point 1.
 
 (17) Line 332: likewise, statements like this one here would be qualified if the output was the intermediate layers......anyway if I understand what I read so far in the paper, the signal will be anyway orthogonal to the motor burst population subspace. So, maybe there's no need to emphasize that it goes away in the very deepest layers.
 
 See reply above, Public Reviews, Reviewer 1, Point 4.
 
 (18) Figure 3A: related to the above, I think one issue could be that the deeper contacts might already be out of the SC. Maybe some cell count distribution from each channel should help in this regard. i.e. were you finding way fewer saccade-related neurons in the deepest channels (even though the few that you found were with high motor index)? If so, then wouldn't this just mean that the channel was too deep? I think there needs to be an analysis like this, to convince readers that the channels were still in the SC. Ideally, electrical stimulation current thresholds for evoking saccades at different depths would be tested, but I understand that this can be difficult at this stage.
 
 Addressed in Public Reviews, Reviewer 3, Point 1.
 
 (19) I keep repeating this because in general, cognitive effects are stronger in the intermediate/deeper layers than in the superficial layers. If these interfere with eye movements like arousal, then why should arousal be different?
 
 Few studies have investigated the effects of attention on “pure” movement SC neurons that only discharge during a saccade. One study, which we cited in Introduction (Ignashchenkova et al., 2004, Nat. Neurosci.), found significant differences in spiking responses between trials with and without attentional cueing for visual and visuomotor neurons. No significant difference was found for motor neurons, consistent with our hypothesis that signals related to cognition and arousal are kept separate from saccade-related signals in the SC.
 
 (20) The problem with Figure 5 and its related text is that the neurons with low motor index are additionally visual. So, of course, they can be modulated if the pupil diameter changes!
 
 Addressed in Public Reviews, Reviewer 3, Point 2.
 
 (21) I had a hard time understanding Figure 6.
 
 See reply above, Public Reviews, Reviewer 1, Point 4.
 
 (22) Line 586: these cells have more visual responses and will be affected by the amount of light entering the eye.
 
 Addressed in Public Reviews, Reviewer 3, Point 2.
 
 AuthorResponse
Tags

Summary

Review 1

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.04.26.591284v2
www.biorxiv.org www.biorxiv.org

Glycogen Engineering Improves the Starvation Resistance of Mesenchymal stem cells and their Therapeutic Efficacy in Pulmonary Fibrosis

4
1. Public_Reviews 13 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This important study presents a novel approach to enhance the therapeutic potential of mesenchymal stromal cells (MSCs) by genetically modifying their glycogen synthesis pathway, resulting in increased glycogen accumulation and improved cell survival under starvation conditions, particularly in the context of experimental pulmonary fibrosis. The methods and findings are generally solid and could be strengthened in the future by investigating the kinetics of persistence, the immunomodulatory effects, and the underlying improved mechanism of action of MSCs in this pulmonary fibrosis model. If confirmed, this approach could suggest potential methods to improve the therapeutic functionality of MSCs in cell therapy strategies.
 
 Summary
2. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This study provides the first evidence that glucose availability, previously shown to support cell survival in other models, is also a key determinant for post-implantation MSC survival in the specific context of pulmonary fibrosis. To address glucose depletion in this context, the authors propose an original, elegant, and rational strategy: enhancing intracellular glycogen stores to provide transplanted MSCs with an internal energy reserve. This approach aims to prolong their viability and therapeutic functionality after implantation.
 
 Strengths:
 
 The efficacy of this metabolic engineering strategy is robustly demonstrated both in vitro and in an orthotopic mouse model of pulmonary fibrosis.
 
 Review 1
3. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 In this article, the authors investigate enhancing the therapeutic and regenerative properties of mesenchymal stem cells (MSCs) through genetic modification, specifically by overexpressing genes involved in the glycogen synthesis pathway. By creating a non-phosphorylatable mutant form of glycogen synthase (GYSmut), the authors successfully increased glycogen accumulation in MSCs, leading to significantly improved cell survival under starvation conditions. The study highlights the potential of glycogen engineering to improve MSC function, especially in inflammatory or energy-deficient environments. However, critical gaps in the study's design, including the lack of validation of key findings, limited differentiation assessments, and missing data on MSC-GYSmut resistance to reactive oxygen species (ROS), necessitate further exploration.
 
 Strengths:
 
 (1) Novel Approach: The study introduces an innovative method of enhancing MSC function by manipulating glycogen metabolism.
 
 (2) Increased Glycogen Storage: The genetic modification of GYS1, resulting in GYSmut, significantly increased glycogen accumulation, leading to improved MSC survival under starvation, which has strong implications for enhancing MSC therapeutic properties in energy-deficient environments.
 
 (3) Potential Therapeutic Impact: The findings suggest significant therapeutic potential for MSCs in conditions that require improved survival, persistence, and immunomodulation, especially in inflammatory or energy-limited settings.
 
 (4) In Vivo Validation: The in vivo murine model of pulmonary fibrosis demonstrated the improved survival and persistence of MSC-GYSmut, supporting the translational potential of the approach.
 
 Weaknesses:
 
 (1) Lack of Differentiation Assessments: The study did not evaluate key MSC differentiation pathways, including chondrogenic and osteogenic differentiation. The absence of analysis of classical MSC surface markers and multipotency limits the understanding of the full potential of MSC-GYSmut.
 
 (2) Missing Validation of RNA Sequencing Data: Although RNA sequencing data revealed promising transcriptomic changes in chondrogenesis and metabolic pathways, these findings were not experimentally validated, limiting confidence.
 
 (3) Lack of ROS Resistance Analysis: Resistance to reactive oxygen species (ROS), an important feature for MSCs under regenerative conditions, was not assessed, leaving out a critical aspect of MSC function.
 
 (4) Limited Exploration of Immunosuppressive Properties: The study did not address the immunosuppressive functions of MSC-GYSmut, which are critical for MSC-based therapies in clinical settings.
 
 Conclusion:
 
 The study presents an exciting new direction for enhancing MSC function through glycogen metabolism engineering. While the results show promise, key experiments and validations are missing, and several areas, such as differentiation capacity, ROS resistance, and immunosuppressive properties, require further investigation. Addressing these gaps would solidify the conclusions and strengthen the potential clinical applications of MSC-GYSmut in regenerative medicine.
 
 Review 2
4. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public Review):
 
 (1) Glycogen biosynthesis typically involves several enzymes. In this context, could the authors comment on the effect of overexpressing a single enzyme - especially a mutant version - on the structure or quality of the glycogen synthesized?
 
 While quantitative molecular weight analysis of synthesized glycogen was not performed, we documented changes in glycogen particle morphology. GYSmut overexpression resulted in significantly enlarged singular glycogen granules, suggesting potential high molecular mass, while GYS-GYG co-overexpression in MSCs (GYG being the essential enzyme for glycogen synthesis initiation) produced a diffuse glycogen distribution pattern rather than particulate structures. We have incorporated this result as new Figure S2C.
 
 These results suggest that overexpression of specific glycogen-metabolizing enzymes significantly influences glycogen structure. Consequently, targeted modulation of glycogen architecture and properties through key enzymes represents a potential avenue for future investigation.
 
 (2) Regarding the in vitro starvation experiments (Figure 2C), what oxygen conditions (pO₂) were used? Are these conditions physiologically relevant and representative of the in vivo lung microenvironment?
 
 Our in vitro starvation experiments (Figure 3C) were conducted under normoxic conditions (21% oxygen). The oxygen concentration in human lungs is physiologically lower than atmospheric levels, with healthy individuals exhaling air containing approximately 16% oxygen (Thalakkotur Lazar Mathew, Diagnostics 2015). To our knowledge, direct measurements of alveolar oxygen concentration in pulmonary fibrosis are rare. Therefore, to evaluate the performance of GYSmut under hypoxic conditions, Figure S2 in the revised manuscript has been augmented to include an assessment of cell performance under combined hypoxia (oxygen concentration < 5%) and nutrient deprivation stress, which further corroborates the superiority of the GYSmut group over the control under different oxygen concentrations.
 
 (3) In the in vitro model, how many hours does it take for the intracellular glycogen reserve to be completely depleted under starvation conditions?
 
 While quantitative cell viability data were recorded up to 72 hours post-implantation (Fig 3C), we observed cell viability at approximately 96 hours. We noticed that the presence of glycogen particles exhibited a correlation with sustained cell viability. However, reliable quantitative assessment of glycogen became increasingly challenging upon significant depletion of viable cells, thereby limiting our measurements during later time points.
 
 (4) For the in vivo model, is there a quantitative analysis of the survival kinetics of the transplanted cells over time for each group? This would help to better assess the role and duration of glycogen stores as an energy buffer after implantation.
 
 We tracked the in vivo distribution and persistence of implanted MSCs using enzymatic activity quantification assays (using Gluc luciferase assay) and live animal imaging (using Akaluc luciferase). The revised manuscript includes quantitative analysis of the in vivo fluorescence imaging data, which has been supplemented as Figure S4. Glycogen-engineered MSCs and control cells were quantitatively assessed at three discrete time points post-implantation. This quantification revealed a transient divergence in cell viability between the experimental and control groups around day 7. However, fluorescence in both cohorts subsequently declined to similar levels over the extended observation period.
 
 (5) Finally, the study was performed in male mice only. Could sex differences exist in the efficacy or metabolism of the engineered MSCs? It would be helpful to discuss whether the approach could be expected to be similarly effective in female subjects.
 
 We appreciate the reviewer’s important question regarding potential sex differences. Our study used male mice based on three key considerations: 1) Clinical Relevance: Idiopathic pulmonary fibrosis (IPF) shows significant male predominance, with diagnosis rates 3.5-fold higher in men (37.8% vs 10.6%, p<0.0001) and greater diagnostic confidence (Assayag et al., Thorax 2020). 2) Model Consistency: The bleomycin model (our chosen method) demonstrates more consistent fibrotic responses in male mice (Gul et al., BMC Pulm Med 2023). 3) Biological Rationale: Estrogen’s protective effects in females may confound therapeutic assessments (cited in Assayag et al.).
 
 We fully acknowledge this limitation and will include female subjects in subsequent translational studies. The therapeutic principle should theoretically apply to both sexes, but we agree this requires experimental validation.
 
 (6) The number of mice for each group and time point should be specified.
 
 The manuscript text has been revised to enhance clarity, and the number of mice for each group and time point has been specified (lines 170 to 182).
 
 Reviewer #2 (Public Review):
 
 (4) Inconsistencies in In Vivo Data: There is a discrepancy between the number of animals shown in the figures and the graph (three individuals vs. five animals), as well as missing details on how luciferase signal intensity was quantified, requiring further clarification.
 
 To assess MSC survival in vivo, we employed two strategies utilizing distinct luciferases optimized for specific detection modalities. MSC viability was quantified ex vivo through Gaussia luciferase (Gluc) activity, leveraging its high sensitivity and established commercial assay kits (n = 3 mice per group per time point). For non-invasive longitudinal tracking within living animals, MSC distribution and viability were monitored via in vivo bioluminescence imaging using Akaluc luciferase, selected for its superior tissue penetration and sensitivity in situ (n = 5 mice per group). The manuscript text has been revised to enhance clarity, and the experimental protocols for luciferase signal detection and quantification have been added to the Methods.
 
 (1) (2) (3) (5):
 
 We fully agree that further investigation into the functional consequences of glycogen engineering in MSCs – encompassing core cellular functions, immunomodulatory properties, and associated signaling pathways – is important to fully elucidate the underlying mechanisms. Cellular metabolism is intrinsically intertwined with diverse physiological processes. Consequently, we believe that glycogen engineering exerts multifaceted effects on MSCs, likely extending beyond the modulation of any single specific pathway. Studying the metabolic perturbation induced by such engineering approaches in mammalian cells represents an interesting field. The exploration of these aspects remains a long-term research objective within our group.
 
 Reviewer #2 (Recommendations for the authors):
 
 (6) Clarification of Data in the Murine Model:
 
 In Figure 4B, there is a discrepancy between the number of animals shown in the image (five) and those represented in the graph (three). This discrepancy needs clarification. Additionally, the study lacks information regarding the intensity of the signal in the luciferase assays. It is unclear how luciferase expression in the mice was quantified, and providing this detail would enhance the understanding of the data presented.
 
 We sincerely appreciate these valuable suggestions. We have revised the relevant text for greater clarity. Figure 4B and Figure 4C present results from two distinct experimental approaches, each employing a different luciferase reporter and measurement methodology, and different numbers of mice were used in the two experiments.
 
 Quantitative data derived from the in vivo bioluminescence imaging have been added as Figure S4. The experimental protocols for luciferase signal detection and quantification have been added to the Methods.
 
 Regarding the other recommendations of Reviewer 2:
 
 We sincerely appreciate your valuable insights, which demonstrate your deep expertise. We fully agree that beyond nutrient availability, factors such as reactive oxygen species (ROS) and the immune microenvironment are also critical limitations affecting the survival and therapeutic efficacy of implanted MSCs.
 
 We propose that glycogen engineering exerts broad effects on MSCs. These effects manifest as changes in multiple cellular characteristics, including proliferation, differentiation, surface marker expression, antioxidant capacity, and immunomodulatory activity – all crucial factors for the therapeutic purpose of MSCs.
 
 We believe these changes likely involve complex networks of interconnected regulatory factors. The underlying mechanisms might be clarified through proteomic and metabolomic profiling.
 
 However, comprehensively investigating these interconnected aspects requires significant time and resources. Some components of this research extend beyond the current scope of our project. Nevertheless, exploring these mechanisms remains an important objective, and we will actively work to investigate them further in our ongoing studies.
 
 AuthorResponse
Tags

Summary

Review 1

AuthorResponse

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.02.21.639504v2
www.biorxiv.org www.biorxiv.org

Toward Robust Neuroanatomical Normative Models: Influence of Sample Size and Covariates Distributions

4
1. Public_Reviews 13 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important manuscript evaluates how sample size and demographic balance of reference cohorts affect the reliability of normative models. The evidence supporting the conclusions is convincing, although some additional analysis and clarifications could improve the generalisability of the conclusions. This work will be of interest to clinicians and scientists working with normative models.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  Overall, this is a well-designed and carefully executed study that delivers clear and actionable guidance on the sample size and representative demographic requirements for robust normative modelling in neuroimaging. The central claims are convincingly supported.
  
  Strengths:
  
  The study has multiple strengths. First, it offers a comprehensive and methodologically rigorous analysis of sample size and age distribution, supported by multiple complementary fit indices. Second, the learning-curve results are compelling and reproducible and will be of immediate utility to researchers planning normative modelling projects. Third, the study includes both replication in an independent dataset and an adaptive transfer analysis from UK Biobank, highlighting both the robustness of the results and the practical advantages of transfer learning for smaller clinical cohorts. Finally, the clinical validation ties the methodological work back to clinical application.
  
  Weaknesses:
  
  There are two minor points for consideration:
  
  (1) Calibration of percentile estimates could be shown for the main evaluation (similar to that done in Figure 4E). Because the clinical utility of normative models often hinges on identifying individuals outside the 5th or 95th percentiles, readers would benefit from visual overlays of model-derived percentile curves on the curves from the full training data and simple reporting of the proportion of healthy controls falling outside these bounds for the main analyses (i.e., 2.1. Model fit evaluation).
  
  (2) The larger negative effect of left-skewed sampling likely reflects a mismatch between the younger training set and the older test set; accounting explicitly for this mismatch would make the conclusions more generalisable.
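  
  The calibration check suggested in point (1) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the normative model outputs deviation z-scores for healthy controls, and the function name `tail_proportions` and the example values are hypothetical. Under a well-calibrated model, roughly 5% of controls would be expected in each tail.

```python
from statistics import NormalDist


def tail_proportions(z_scores, lower_pct=5.0, upper_pct=95.0):
    """Return the fractions of z-scores below the lower and above the upper percentile bound.

    Assumes deviations are expressed as standard-normal z-scores (an assumption
    of this sketch, not something stated in the review).
    """
    standard_normal = NormalDist()
    z_low = standard_normal.inv_cdf(lower_pct / 100.0)   # about -1.645 for the 5th percentile
    z_high = standard_normal.inv_cdf(upper_pct / 100.0)  # about +1.645 for the 95th percentile
    n = len(z_scores)
    below = sum(z < z_low for z in z_scores) / n
    above = sum(z > z_high for z in z_scores) / n
    return below, above


# Illustrative z-scores only; these are not data from the manuscript.
example_z = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0, 0.1, -0.3, 0.2, 1.7]
below, above = tail_proportions(example_z)
print(f"below 5th percentile: {below:.2f}, above 95th percentile: {above:.2f}")
```

  Reporting these two fractions per sample size and sampling strategy would show whether the extreme percentiles remain calibrated as the training set shrinks or becomes skewed.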
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors test how sample size and demographic balance of reference cohorts affect the reliability of normative models in ageing and Alzheimer's disease. Using OASIS-3 and replicating in AIBL, they change age and sex distributions and number of samples and show that age alignment is more important than overall sample size. They also demonstrate that models adapted from a large dataset (UK Biobank) can achieve stable performance with fewer samples. The results suggest that moderately sized but demographically well-balanced cohorts can provide robust performance.
  
  Strengths:
  
  The study is thorough and systematic, varying sample size, age, and sex distributions in a controlled way. Results are replicated in two independent datasets with relatively large sample sizes, thereby strengthening confidence in the findings. The analyses are clearly presented and use widely applied evaluation metrics. Clinical validation (outlier detection, classification) adds relevance beyond technical benchmarks. The comparison between within-cohort training and adaptation from a large dataset is valuable for real-world applications.
  
  The work convincingly shows that age alignment is crucial and that adapted models can reach good performance with fewer samples. However, some dataset-specific patterns (noted above) should be acknowledged more directly, and the practical guidance could be sharper.
  
  Weaknesses:
  
  The paper uses a simple regression framework, which is understandable for scalability, but limits generalization to multi-site settings where a hierarchical approach could better account for site differences. This limitation is acknowledged; a brief sensitivity analysis (or a clearer discussion) would help readers weigh trade-offs. Other than that, there are some points that are not fully explained in the paper:
  
  (1) The replication in AIBL does not fully match the OASIS results. In AIBL, left-skewed age sampling converges with other strategies as sample size grows, unlike in OASIS. This suggests that skew effects depend on where variability lies across the age span.
  
  (2) Sex imbalance effects are difficult to interpret, since sex is included only as a fixed effect, and residual age differences may drive some errors.
  
  (3) In Figure 3, performance drops around n≈300 across conditions. This consistent pattern raises the question of sensitivity to individual samples or sub-sampling strategy.
  
  (4) The total outlier count (tOC) analysis is interesting but hard to generalize. For example, in AIBL, left-skew sometimes performs slightly better despite a weaker model fit. Clearer guidance on how to weigh model fit versus outlier detection would strengthen the practical message.
  
  (5) The suggested plateau at n≈200 seems context-dependent. It may be better to frame sample size targets in relation to coverage across age bins rather than as an absolute number.
  
  Review 2
4. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Author response
  
  We would like to thank the editors and two reviewers for the assessment and the constructive feedback on our manuscript, “Toward Robust Neuroanatomical Normative Models: Influence of Sample Size and Covariates Distributions”. We appreciate the thorough reviews and believe the constructive suggestions will substantially strengthen the clarity and quality of our work. We plan to submit a revised version of the manuscript and a full point-by-point response addressing both the public reviews and the recommendations to the authors.
  
  Reviewer 1.
  
  In revision, we plan to address the reviewer’s comments by: (i) strengthening the interpretation of model fit through reporting the proportion of healthy controls within and outside the extreme percentile bounds; (ii) adding age-resolved overlays of model-derived percentile curves compared to those from the full reference cohort for key sample sizes and regions; (iii) quantifying age-distribution alignment between train and test set; and (iv) summarizing model performance as a joint function of age-distribution alignment and sample size.
  
  Reviewer 2.
  
  In the revised manuscript, we will (i) expand the Discussion to more clearly outline the trade-offs between simple regression frameworks and hierarchical models for normative modeling (e.g., scalability, handling of multi-site variation, computational considerations), and discuss alternative approaches and harmonization as important directions for multi-site settings; (ii) contextualize OASIS-3 vs AIBL differences by quantifying train-test age alignment across sampling strategies and emphasize that skewness should be interpreted relative to the target cohort’s alignment rather than absolute numbers; (iii) reassess sex-imbalance effects by reporting expected age distributions per condition and re-evaluating sex effects while controlling for age; (iv) investigate the apparent dip at n≈300 by increasing sub-sampling seeds, testing neighboring sample sizes, and using an alternative age-binning scheme to clarify the observed artifact; (v) clarify potential divergence between tOC separation and global fit under discrepancies in demographic distributions and relate tOC to age-alignment distance; (vi) reframe the sample-size guidance in terms of distributional alignment rather than an absolute n.
  
  AuthorResponse
Tags

Summary

Review 1

AuthorResponse

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.26.672402v2
www.biorxiv.org www.biorxiv.org

Degradation of LMO2 in T cell leukaemia results in collateral breakdown of transcription complex partners and causes LMO2-dependent apoptosis

3
1. Public_Reviews 13 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This important paper reports the development of proteins and small molecules that induce degradation of a clinically-relevant oncogenic transcription factor, LMO2. The findings provide a proof of concept that PROTAC-type chemicals can be developed against intrinsically disordered proteins. The methods provide a blueprint for rational design of PROTACs starting from intracellular antibody paratopes. Overall, the paper is supported by solid evidence and will be of interest to chemical biologists and cancer pharmacologists.
 
 Summary
2. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 Sereesongsaeng et al. aimed to develop degraders for LMO2, an intrinsically disordered transcription factor activated by chromosomal translocation in T-ALL. The authors first focused on developing biodegraders, which are fusions of an anti-LMO2 intracellular domain antibody (iDAb) with cereblon. Following demonstrations of degradation and collateral degradation of associated proteins with biodegraders, the authors proceeded to develop PROTACs using antibody paratopes (Abd) that recruit VHL (Abd-VHL) or cereblon (Abd-CRBN). The authors show dose-dependent degradation of LMO2 in LMO2+ T-ALL cell lines, as well as concomitant dose-dependent degradation of associated bHLH proteins in the DNA-binding complex. LMO2 degradation via Abd-VHL was also determined to inhibit proliferation and induce apoptosis in LMO2+ T-ALL cell lines.
 
 Strengths:
 
 The topic of degrader development for intrinsically disordered proteins is of high interest and the authors aimed to tackle a difficult drug target. The authors evaluated methods including the development of biodegraders, as well as PROTACs that recruit two different E3 ligases. The study includes important chemical control experiments, as well as proteomic profiling to evaluate selectivity.
 
 Weaknesses:
 
 Several weaknesses remain in this study:
 
 (1) The overall degradation achieved is not highly potent (although important proof-of-concept);
 
 (2) The mechanism of collateral degradation is not completely addressed. The authors acknowledge possible explanations, which would require mutagenesis and structural studies to further dissect;
 
 (3) The proteomics experiments do not detect LMO2, which the authors attribute to its size, making it difficult to interpret.
 
 Review 1
3. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The authors describe the degradation of an intrinsically disordered transcription factor (LMO2) via PROTACs (VHL and CRBN) in T-ALL cells. Given the challenges of drugging transcription factors, I find the work solid and a significant scientific contribution to the field.
 
 Strengths:
 
 (1) Validation of LMO2 degradation by starting with biodegraders, then progressing to chemical degraders.
 
 (2) Interrogation of the biology and downstream pathways upon LMO2 degradation (collateral degradation).
 
 (3) Cell line models that are dependent/overexpression of LMO2 vs LMO2 null cell lines.
 
 (4) CRBN and VHL-derived PROTACs were synthesized and evaluated.
 
 Weaknesses:
 
 (1) The conventional method used to characterize PROTACs in the literature is to calculate the DC50 and Dmax of the degraders; I did not find this information in the manuscript.
 
 As noted in the reply to the referee’s point 4 below, our first-generation compounds are not highly potent. The DC50 values have been computed from the Western blot data shown in Fig. 2. Supplementary Fig. S3 of the revised version shows the quantified Western blot data from a time course of treating KOPT-K1 cells with either Abd-CRBN or Abd-VHL (the 24-hour blot data are shown in Figure 2, G and E, and the data from each 24-hour treatment are quantified in Supplementary Fig. S3). From these data, the DC50 values (9 μM for Abd-CRBN and 15 μM for Abd-VHL) are included in the main text and in the legend of Supplementary Fig. S3.
 
 In addition, the loss of signal of the LMO2-Rluc reporter protein from PROTAC-treated cells shown in Fig. 2M has been used to calculate a half-point of degradation; although strictly not a DC50, as it measures a reporter protein, this yielded values of 10 μM for Abd-CRBN and 9 μM for Abd-VHL.
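 
 For readers unfamiliar with how a half-point of degradation can be read off quantified blot or reporter data, the following is a minimal sketch under stated assumptions. It is not the authors' method, which is not described in the response: the sketch takes the half-point as the concentration at which 50% of the protein remains relative to vehicle control and interpolates linearly in log10(concentration) between the two bracketing doses. The function name `dc50_by_interpolation` and the example values are hypothetical and are not data from the manuscript.

```python
import math


def dc50_by_interpolation(concs_um, remaining_frac):
    """Estimate the concentration (in uM) at which 50% of the protein remains.

    concs_um: tested concentrations in micromolar (all > 0).
    remaining_frac: protein remaining at each concentration, as a fraction of
        vehicle control (1.0 = no degradation).
    Returns None if the response never crosses 0.5 between adjacent doses.
    """
    pairs = sorted(zip(concs_um, remaining_frac))
    for (c0, r0), (c1, r1) in zip(pairs, pairs[1:]):
        if r0 >= 0.5 >= r1 and r0 != r1:
            # Fraction of the way from c0 to c1 (on a log scale) where 0.5 is crossed.
            t = (r0 - 0.5) / (r0 - r1)
            log_c = math.log10(c0) + t * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    return None


# Illustrative dose-response values only.
estimate = dc50_by_interpolation([1, 10, 100], [0.9, 0.5, 0.2])
print(estimate)
```

 A full characterization would normally fit a sigmoidal dose-response curve and report Dmax alongside DC50; interpolation is used here only because it needs no curve-fitting library.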
 
 (2) The proteomics data is not very convincing, and it is not clear why LMO2 does not show in the volcano plot (were higher concentrations of the PROTAC tested? and why only VHL was tested and not CRBN-based PROTAC?).
 
 Due to the relatively small size of the LMO2 protein, it is challenging to produce enough unique peptides for reliable identification, especially to distinguish some proteins in the LMO2 complex.
 
 (3) The correlation between degradation potency and cell growth is not well-established (compare Figure 4C: P12-Ichikawa blots show great degradation at 24 and 48 hrs, but it is unclear if the cell growth in this cell line is any better than in PF-382 or MOLT-16) - Can the authors comment on the correlation between degradation and cell growth?
 
 In this study (Fig. 4) we did not aim to compare the effect of LMO2 loss on cell growth among LMO2-positive cells. Rather, we aimed to evaluate the importance of LMO2 for cell growth in LMO2-expressing T-ALL cells compared to non-expressing cells and to correlate the loss of the protein with this effect on cell growth. In addition, treatment with the LMO2 compounds did not show an effect on LMO2-negative cells until at least 48 hours of treatment, indicating the low toxicity of our PROTAC compounds and providing a correlation between LMO2 loss and cell growth.
 
 (4) The PROTACs are not very potent (double-digit micromolar range?) - can the authors elaborate on any challenges in the optimization of the degradation potency?
 
 The Abd methodology, which uses intracellular domain antibodies to screen for compounds that bind to intrinsically disordered proteins such as the LMO2 transcription factor, offers a tractable approach to hard drug targets but, in so doing, creates challenges for improving potency that differ from those for targets with available structural data. LMO2 is an intrinsically disordered protein, for which soluble recombinant protein is not readily available to identify the binding pocket of compounds. The potency has so far been optimized solely based on the different moieties substituted in cell-based SAR studies (http://advances.sciencemag.org/cgi/content/full/7/15/eabg1950/DC1 ) and all new compounds were tested with BRET assays. Thus, optimization of the degradation potency (including properties such as improved solubility) for the LMO2-binding compounds currently relies on chemical modification of the three areas of the compounds indicated in Fig. 2 B,C.
 
 (5) The authors mentioned trying six iDAb-E3 ligase proteins; I would recommend listing the E3 ligases tried and commenting on the results in the main text.
 
 The six chimaeric iDAb-E3 ligase proteins involved one anti-LMO2 iDAb and three different E3 ligases, each fused at either the N- or the C-terminus of the VH (giving six protein formats). These six fusion proteins were described in the text referring to the degrader studies described in Supplementary Fig. 1.
 
 Reviewer #2 (Public review):
 
 Summary:
 
 Sereesongsaeng et al. aimed to develop degraders for LMO2, an intrinsically disordered transcription factor activated by chromosomal translocation in T-ALL. The authors first focused on developing biodegraders, which are fusions of an anti-LMO2 intracellular domain antibody (iDAb) with cereblon. Following demonstrations of degradation and collateral degradation of associated proteins with biodegraders, the authors proceeded to develop PROTACs using antibody paratopes (Abd) that recruit VHL (Abd-VHL) or cereblon (Abd-CRBN). The authors show dose-dependent degradation of LMO2 in LMO2+ T-ALL cell lines, as well as concomitant dose-dependent degradation of associated bHLH proteins in the DNA-binding complex. LMO2 degradation via Abd-VHL was also determined to inhibit proliferation and induce apoptosis in LMO2+ T-ALL cell lines.
 
 Strengths:
 
 The topic of degrader development for intrinsically disordered proteins is of high interest, and the authors aimed to tackle a difficult drug target. The authors evaluated methods, including the development of biodegraders, as well as PROTACs that recruit two different E3 ligases. The study includes important chemical control experiments, as well as proteomic profiling to evaluate selectivity.
 
 Weaknesses:
 
 The overall degradation is relatively weak, and the mechanism of potential collateral degradation is not thoroughly evaluated.
 
 The purpose of the study was to evaluate the effects of LMO2 degraders. The mechanism of the observed collateral degradation could not be investigated directly within the scope of our study. In the main text, we discussed two possible, not mutually exclusive, explanations. One is that our work (and previously published, cited work) indicates that the DNA-binding bHLH proteins have a relatively short half-life (Supplementary Fig. S12) and may therefore be subject to normal turnover when the LMO2, which is in the complex, turns over. Further, the known structure of the LMO2-bHLH interactions (from Omari et al, doi: 10.1016/j.celrep.2013.06.008) was also examined for the location of lysines in the TAL1 & E47 partners (Supplementary Fig. S11). It is possible that their local association with the LMO2-E3-ligase complex created by the PROTAC interaction could cause their concurrent degradation. Mutagenesis and structural analysis would be needed to establish this point.
 
 In addition, experiments comparing the authors' prior work with their anti-LMO2 iDAb or Abl-L are lacking, which would improve our understanding of the potential advantages of a degrader strategy for LMO2.
 
 A major motivation behind developing the Antibody-derived (Abd) method to select compounds, which are surrogates of the antibody paratope, is that using iDAbs directly as inhibitors requires the development of delivery technologies for these macromolecules, either as protein directly or as vectors or mRNA for their expression. Ultimately, high-affinity anti-LMO2 iDAbs should be used directly as tractable inhibitors when delivery methods are developed. In the meantime, Abd compounds were envisaged as surrogates suitable for development into reagents, and potentially drugs, by medicinal chemistry. We evaluated selected first-generation LMO2-binding Abd compounds previously, finding that they interfere with the LMO2-iDAb BRET signal with an ECmax of about 50%, but these compounds do not have the potency to affect the interaction of LMO2 with a non-mutated iDAb (nM affinity). These data indicated that efficacy improvement for the PROTACs was needed. In addition, in the current study, we observed viability effects in T-ALL lines at high concentrations (20 μM) irrespective of LMO2 expression (Supplementary Fig. S 2A, B). These data indicated that efficacy improvement was needed and that converting the compounds to degraders (PROTACs) could potentially add to in-cell potency. By adding the E3 ligase ligands, we found that the toxicity in non-LMO2-expressing Jurkat cells was significantly reduced (Supplementary Fig. S 2E, F).
 
 Reviewer #2 (Recommendations for the authors):
 
 Suggestions for additional experiments:
 
 (1) The data presented is primarily focused on demonstrating targeted degradation of LMO2, with a focus on phenotypes such as proliferation and apoptosis. In this manuscript, there are limited comparative evaluations of anti-LMO2 iDAb or Abl-L to show the potential benefits of a degrader approach to their previously described work, as well as why targeted degradation is in fact, advantageous. For example, the authors' previous work has shown that anti-LMO2 iDAb inhibits tumor growth in a mouse transplantation model. Comparisons in vitro would be supportive of the importance of continued degrader optimization/development.
 
 We have previously shown that an anti-LMO2 scFv inhibits tumour growth in a mouse model, but this work used an expressed scFv antibody that binds to LMO2 in the nM range. The Abd compounds have much lower potency than the antibody and, because recombinant LMO2 is difficult to work with, we could only evaluate interactions of compounds with LMO2 in cell-based assays like BRET (LMO2-iDAb BRET). In this cell-based assay, the first-generation Abd compounds do not have sufficient potency to block the LMO2-iDAb interaction unless the affinity of the iDAb is reduced to sub-μM. The justification for proceeding with the degrader approach rather than just using protein-protein interaction (PPI) inhibition was based largely on the low potency of the first-generation PPI compounds in cell assays and the expectation that incorporating protein degradation with PPI inhibition would enhance the efficacy.
 
 In addition, the viability experiments are also very short-term; is there a reason why the authors did not carry out these experiments for 3-5 days to fully understand the impacts on proliferation?
 
 In Supplementary Fig. S5, we did show assays up to 3 days. In KOPT-K1 (LMO2+), the LMO2 levels were reduced during the time course of this assay (from a single compound dose at time zero) (Supplementary Fig. S5A, B). We also show CellTiter-Glo assays up to 3 days and, with these second-generation compounds, we observed sustained effects on KOPT-K1 (LMO2+) but low non-DMSO toxicity in Jurkat (LMO2-) (revised version, Supplementary Fig. S5C, D).
 
 (2) The potential mechanism of collateral degradation is interesting and important in evaluating the on-target responses and consequences of degrading LMO2. At this time, the data supporting collateral degradation is limited and would be strengthened by showing that it is not due to a change in mRNA levels and not due to complex dissociation. Overall, the kinetics and depth of loss of complex members such as E47 in Figure 3 appear more substantial than LMO2 itself, and as presented, collateral degradation is not effectively demonstrated. In addition, to aid in the readers' assessments, additional background and references around the roles of TAL1 and E47 would be helpful. For example, structurally, where do they (and other associated proteins that are not degraded) fit in the complex?
 
 We have responded above in relation to the Public Review comments and note that a structure of the complex was in the submitted version (now Supplementary Fig. S11 in the revised version).
 
 (3) In Figure 1A, the blots show decreased levels of endogenous CRBN with iDAB-CRBN. Is this a known consequence of this approach in these cell lines? Does the partial recovery of endogenous CRBN in KOPTK1 cells have any indication of iDAB-CRBN levels?
 
 We cannot be sure why the endogenous level of CRBN decreases in doxycycline-treated cells. It has been shown (DOI:10.1371/journal.pone.0064561) that doxycycline (and its derivatives), as used in inducible expression systems such as the lentivirus we used, affects gene expression patterns and can either increase or decrease expression. Although the published study did not examine CRBN expression, this effect might explain why CRBN expression decreases on doxycycline addition and remains at the same level thereafter.
 
 (4) In Figure S7, the authors do not fully explain the results and why there is minimal rescue with epoxomicin (S7A) or MLN4924 (S7J). This could indicate an alternative mechanism of degradation and loss at play, given the lack of rescue. Can the authors comment on this discrepancy, and have they looked at autophagy inhibitors or other agents to achieve the chemical rescue?
 
 In experiments such as those in Supplementary Fig. S6 of the revised version, we used KOPT-K1 cells with a single concentration of the inhibitors, and the cells may be less susceptible to epoxomicin (0.8 μM), but lenalidomide and free thalidomide restored the LMO2 levels fully. In the main text Fig. 3D, we also showed that including epoxomicin and thalidomide with the Abd-CRBN in KOPT-K1 and CCRF-CEM restores LMO2 levels, supporting the conclusion that the main mechanism of degradation is through the ubiquitin-proteasome route.
 
 (5) For the proteomics data, it would be helpful to have the proteins in yellow highlighted to have them noted in 5D and 5E. In addition, can the authors comment on why LMO2 or their collateral targets are not confirmed in the table? Furthermore, 5C is difficult to interpret; if there are no significantly changing proteins in the Jurkat cells, why are there pathways that are identified?
 
 As mentioned in reply to referee 1, due to the relatively small size of the LMO2 protein, it is challenging to produce enough unique peptides for reliable identification, especially to distinguish some proteins in the LMO2 complex where expression levels are low.
 
 AuthorResponse
Tags

Summary

Review 1

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.12.09.627495v3
www.biorxiv.org www.biorxiv.org

Independent Validation of Transgenerational Inheritance of Learned Pathogen Avoidance in Caenorhabditis elegans

5
1. Public_Reviews 13 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This valuable study concerns a model for transgenerational epigenetic inheritance, the learned avoidance by C. elegans of the PA14 pathogenic strain of Pseudomonas aeruginosa. A recent study questioned whether transgenerational inheritance in this paradigm lacks robustness. The authors of this study have worked independently of the group that reported the original phenomenon and also independently of the group that challenged the original report. With solid data, this study independently validates findings previously reported by the Murphy group, confirming that the paradigm is reproducible elsewhere. The reviewers also appreciated the information on reagent sources used by different groups. The present study is therefore of broad interest to anyone studying genetics, epigenetics, or learned behavior.
 
 Summary
2. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The manuscript addresses the discordant reports of the Murphy (Moore et al., 2019; Kaletsky et al., 2020; Sengupta et al., 2024) and Hunter (Gainey et al., 2025) groups on the existence (or robustness) of transgenerational epigenetic inheritance (TEI) controlling learned avoidance of C. elegans to Pseudomonas aeruginosa. Several papers from Colleen Murphy's group describe and characterize C. elegans transgenerational inheritance of avoidance behaviour. In the hands of the Murphy group, the learned avoidance is maintained for up to four generations; however, Gainey et al. (2025) reported an inability to observe inheritance of learned avoidance beyond the F1 generation. Of note, Gainey et al. used a modified assay to measure avoidance, rather than the standard assay used by the Murphy lab. A response from the Murphy group suggested that procedural differences explained the inability of Gainey et al. (2025) to observe TEI. They found two sources of variability that could explain the discrepancy between studies: the modified avoidance assay and bacterial growth conditions (Kaletsky et al., 2025). The standard avoidance assay uses azide as a paralytic to capture worms in their initial decision, while the assay used by the Hunter group does not capture the worm's initial decision but rather uses cold to capture the location of the population at one point in time.
 
 In this short report, Akinosho, Alexander, and colleagues provide independent validation of transgenerational epigenetic inheritance (TEI) of learned avoidance to P. aeruginosa as described by the Murphy group by demonstrating learned avoidance in the F2 generation. These experiments used the protocol described by the Murphy group, demonstrating reproducibility and robustness.
 
 Strengths:
 
 Despite the extensive analyses carried out by the Murphy lab, doubt may remain for those who have not read the publications or for those who are unfamiliar with the data, which is why this report from the Vidal-Gadea group is so important. The observation that learned avoidance was maintained in the F2 generation provides independent confirmation of transgenerational inheritance that is consistent with reports from the Murphy group. It is of note that Akinosho, Alexander et al. used the standard avoidance assay that incorporates azide, and followed the protocol described by the Murphy lab, demonstrating that the data from the Moore and Kaletsky publications are reproducible, in contrast to what has been asserted by the Hunter group.
 
 Comments on revised version:
 
 I am happy with the responses to reviews.
 
 Review 1
3. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The manuscript "Independent validation of transgenerational inheritance of learned pathogen avoidance in C. elegans" by Akinosho and Vidal-Gadea offers evidence that learned avoidance of the pathogen PA14 can be inherited for at least two generations. In spite of initial preference for the pathogen when exposed in a 'training session', 24 hours of feeding on this pathogen evoked avoidance. The data are robust, replicated in 4 trials, and the authors note that diminished avoidance is inherited in generations F1 and F2.
 
 Strengths:
 
 These results contrast with those reported by Gainey et al., who only observed intergenerational inheritance for a single generation. Although the authors' study does not explain why Gainey et al. fail to reproduce the Murphy lab results, one possibility is that a difference in a media ingredient could be responsible.
 
 Comments on revised version:
 
 The responses to the reviewer comments appear reasonable for the most part.
 
 Review 2
4. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 This short paper aims to provide an independent validation of the transgenerational inheritance of learned behaviour (avoidance) that has been published by the Murphy lab. The robustness of the phenotype has been questioned by the Hunter lab. In this paper, the authors present one figure showing that transgenerational inheritance can be replicated in their hands. Overall, it helps to shed some light on a controversial topic.
 
 Strengths:
 
 The authors clearly outline their methods, particularly regarding the choice of assay, so that attempting to reproduce the results should be straightforward. It is nice to see these results repeated in an independent laboratory.
 
 Comments on revised version:
 
 I'm happy with the response to reviewers.
 
 Review 3
5. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Author response
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public Review):
 
 Confirmation of daf-7::GFP data and inheritance beyond F2
 
 Reviewer suggested confirming daf-7::GFP molecular marker data and testing inheritance beyond the F2 generation to further strengthen the findings.
 
 We agree these experiments would provide valuable mechanistic insights into the molecular basis of transgenerational inheritance. However, our study was specifically designed as a reproducibility study focusing on the central controversy regarding F2 inheritance (Gainey et al. vs. Murphy lab findings). The daf-7::GFP molecular marker experiments, while important for understanding mechanisms, represent a different research question requiring extensive additional resources and expertise beyond the scope of this validation study. Our primary goal was to provide independent confirmation of the disputed F2 inheritance using standardized behavioral assays. It is our hope that future work will pursue these important mechanistic validations.
 
 "Exhaustive attempts" language
 
 Reviewer disagreed with characterizing Gainey et al.'s efforts as "exhaustive attempts" since they modified the original protocol.
 
 We revised this statement in the Results and Discussion to more accurately reflect the experimental situation: "In contrast, Gainey et al. (2025), representing the Hunter group, reported that while parental and F1 avoidance behaviors were evident, transgenerational inheritance was not reliably observed beyond the F1 generation under their experimental conditions."
 
 Importance of sodium azide
 
 Reviewer suggested including more discussion about the recent findings on the importance of sodium azide in the assay, referencing the Murphy group's response paper.
 
 We have prominently highlighted the critical role of sodium azide in our Introduction with strengthened language that emphasizes its importance for resolving the scientific controversy: "Critically, Kaletsky et al. (2025) demonstrated that omission of sodium azide during scoring can completely abolish detection of inherited avoidance, revealing that this key methodological difference may explain the conflicting results between laboratories. The use of sodium azide to immobilize worms at the moment of initial bacterial choice appears essential for capturing the inherited behavioral response. These findings highlight how seemingly minor methodological variations can dramatically impact detection of transgenerational inheritance and underscore the need for independent replication using standardized protocols."
 
 Protocol fidelity statement
 
 Reviewer requested a more direct statement clarifying that we followed the Murphy group protocol, noting that we made some modifications.
 
 We followed the core Murphy lab protocol with two evidence-based optimizations that preserve the essential experimental elements: 1) We used 400 mM sodium azide instead of 1 M based on preliminary data showing the higher concentration caused premature paralysis before worms could make behavioral choices, and 2) We used liquid NGM buffer instead of M9 to maintain chemical consistency with the solid NGM plates used for worm culture, minimizing potential osmotic stress. These modifications improved experimental reliability while maintaining the critical components: sodium azide immobilization, bacterial lawn density standardization (OD600 = 1.0), and synchronized scoring conditions that are essential for detecting inherited avoidance.
 
 Overstated dilution claim
 
 Reviewer noted that the statement about "gradual decrease" in avoidance strength was overstated and didn't reflect the actual data presented in the manuscript.
 
 We removed this statement.
 
 Environmental variables phrasing
 
 Reviewer found the sentence about environmental variables unclear, noting that Gainey et al. didn't actually acknowledge variability but saw it as indicating error or stochastic processes.
 
 We refined this statement for greater precision and clarity: "This underscores the assay's sensitivity to environmental variables, such as synchronization method and bacterial lawn density. This highlights the importance of consistency across experimental setups and support the view that context-dependent variation may underlie previously reported discrepancies."
 
 Reviewer #2 (Public Review):
 
 Reagent sourcing
 
 Reviewer suggested listing the sources of media ingredients with company names and catalog numbers, as this might be important for reproducibility.
 
 To ensure complete reproducibility, we created a comprehensive Table S3 listing all reagents, suppliers, and catalog numbers used in our experiments. This detailed information enables exact replication of our experimental conditions and addresses potential variability that might arise from different reagent sources between laboratories.
 
 Reviewer #3 (Public Review):
 
 Raw data transparency
 
 Reviewer noted that while a spreadsheet with choice assay results was provided, the individual raw data from assays was not included, which would be helpful for assessing sample sizes.
 
 We now provide complete experimental transparency through Table S2, which contains individual choice indices from all 138 assays conducted across four independent trials. This comprehensive dataset allows full assessment of our experimental outcomes, statistical robustness, and reproducibility while enabling other researchers to perform independent statistical analyses.
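 As an illustration of what a per-assay choice index summarizes, the sketch below computes one from worm counts. The formula is an assumption here: it follows the definition commonly used for this kind of two-spot choice assay (control count minus pathogen count, divided by the total scored), and is not quoted from the manuscript or from Table S2. The function name and the example counts are hypothetical.

```python
def choice_index(n_control: int, n_pathogen: int) -> float:
    """Choice index for one assay (assumed definition, not taken from the manuscript).

    Positive values mean more worms were immobilized on the control spot
    (avoidance of the pathogen); negative values mean preference for the pathogen.
    """
    total = n_control + n_pathogen
    if total == 0:
        raise ValueError("no worms were scored on either bacterial spot")
    return (n_control - n_pathogen) / total


if __name__ == "__main__":
    # Hypothetical counts for a single assay: 40 worms on control, 20 on pathogen.
    print(choice_index(40, 20))
```

 Under this assumed convention, the index is bounded between -1 and 1, so assays with very different worm numbers remain comparable.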
 
 F1/F2 assay disparity
 
 Reviewer questioned whether the higher number of F2 assays compared to F1 represented truly independent assays, asking if multiple F2 assays were performed from offspring of one F1 plate (which would not represent independent assays).
 
 We clarified this important statistical consideration in Methods (Transgenerational Testing): "Each behavioral assay was conducted using animals from a biologically independent growth plate. While F2 plates were derived from pooled embryos from multiple F1 parents, each assay represents an independent biological replicate with no reuse of animals across assays. F2 assays (n=45) exceeded F1 assays (n=20) due to PA14-induced fecundity reduction in trained worms, limiting the number of viable F1 progeny. The higher number of F2 assays reflects the greater reproductive success of healthy F1 animals and provides additional statistical power for population-level behavioral comparisons." We also enhanced our Controls section to clarify that "Our experimental design employed population-level comparisons across generations using unpaired statistical analyses, with no attempt to track individual lineages across generations."
 
 Methodological variations overstatement
 
 Reviewer felt the Introduction overstated the findings by suggesting the authors "address potential methodological variations," when they only used one assay setup throughout.
 
 We have corrected the Introduction to accurately reflect our study design and scope: "Here, we adapted the protocol established by the Murphy group, maintaining the critical use of sodium azide to paralyze worms at the time of choice, to test whether parental exposure to PA14 elicits consistent avoidance in subsequent generations. Our study specifically focuses on the transmission of learned avoidance through the F2 generation, beyond the intergenerational (F1) effect, because this is where divergence between published studies begins."
 
 Reviewer #1 (Recommendations for the authors):
 
 Worm numbers
 
 Reviewer noted that information about the number of worms used should be included in the training and choice assay methods section rather than separated.
 
 We clarified worm numbers and sample sizes in the Methods (Controls and Additional Considerations): "Each individual assay averaged 62 ± 43 animals (range: 15-150 worms per assay), with a total of 138 assays conducted across four independent experimental trials. The variation in worm numbers per assay reflects natural variation in worm recovery and immobilization efficiency during choice assays. We conducted an average of 8.5 assays per condition during each of the four replicates."
 
 Figure 1 legend and consistency
 
 Reviewer identified several issues: inconsistent terminology ("treated" vs "trained"), incorrect statistical test naming, missing p-value annotations, and need for consistency between figure and legend. We have systematically addressed all figure consistency and statistical annotation issues:
 
 Replaced inconsistent "treated" terminology with "trained" throughout
 
 Corrected the statistical test description to accurately reflect our analysis: "Kruskal-Wallis oneway ANOVA followed by Dunn's post hoc" which properly corresponds to the statistical tests detailed in Table S1
 
 Added explicit p-value annotations in the figure legend: "*p<0.05, **p<0.01 means and SEM shown (see Table S1 for statistics and Table S2 for raw data)"
 
 Ensured consistent terminology between figure and legend
 
 NGM vs. M9 buffer
 
 Reviewer questioned whether we used NGM buffer or M9 buffer for washing steps, noting that NGM isn't usually referred to as "buffer."
 
 We have prominently featured and thoroughly clarified our rationale for using liquid NGM buffer in the Methods (Synchronization of Worms section). The explanation now appears upfront in the methods: "We used liquid NGM buffer instead of M9 buffer (as specified in the original Murphy protocol) to maintain chemical consistency with the solid NGM culture plates. This modification minimizes potential osmotic stress since liquid NGM matches the pH (6.0) and ionic composition of the growth medium, whereas M9 buffer has a different pH (7.0) and ionic profile." We provide detailed chemical differences and explain that this modification maintains consistency with culture conditions while preserving essential experimental procedures.
 
 Grammar/typos
 
 Reviewer noted that the manuscript needed thorough proofreading to address grammatical errors and typographical mistakes.
 
 We have conducted comprehensive proofreading and editing throughout the manuscript to resolve grammatical and typographical errors. Specific improvements include: clarified sentence structure in the Introduction and Results sections, corrected technical terminology consistency, improved figure legend clarity, and enhanced overall readability while maintaining scientific precision.
 
 Sodium azide concentration
 
 Reviewer noted that our sodium azide concentration differed from the Moore paper and requested comment on this difference.
 
 We have included explicit justification for our sodium azide concentration choice in the Methods (Training and Choice Assay): "We used 400 mM sodium azide rather than the 1 M concentration reported by Moore et al. (2019) because preliminary trials showed that higher concentrations caused premature paralysis before worms could reach either bacterial spot, potentially biasing choice measurements. The 400 mM concentration provided sufficient immobilization while preserving the behavioral choice window."
 
 Reviewer #2 (Recommendations for the authors):
 
 Comparative reagent analysis
 
 Reviewer suggested creating a supplemental table comparing reagent sources between our study, Gainey et al., and Murphy et al., proposing that media ingredient differences might explain the discrepancies.
 
 While direct reagent comparison between laboratories was beyond the scope of this validation study, we recognize this as an important consideration for understanding experimental variability. Our comprehensive reagent sourcing information (Table S3) provides the foundation for future comparative studies. We encourage collaborative efforts to systematically compare reagent sources across laboratories, as media component differences could contribute to the experimental variability observed between research groups. Such analyses would be valuable for establishing standardized protocols across the field.
 
 Conclusion
 
 We hope that these revisions satisfactorily address the reviewers’ concerns. We believe these improvements significantly strengthened the manuscript's contribution to resolving this important scientific controversy.
 
 We thank the reviewers again for their invaluable insights and constructive feedback, which have substantially improved the quality and impact of our work.
 
 AuthorResponse
Tags

Summary

Review 1

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.04.03.647070v2
www.biorxiv.org www.biorxiv.org

Adult-neurogenesis allows for representational stability and flexibility in early olfactory system

4
1. Public_Reviews 13 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This paper presents a valuable theory and analysis of the role of neurogenesis and inhibitory plasticity in the drift of neural representations in the olfactory system. For one of the findings, regarding the impact of neurogenesis on the drift, the evidence remains incomplete. The reason lies in the differences in variability/drift of the mitral/tufted cell responses observed in the model compared to experimental observations, where these responses remain stable over extended time scales.
 
 Summary
2. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The authors build a network model of the olfactory bulb and the piriform cortex and use it to run simulations and test their hypotheses. Given the model's settings, the authors observe drift across days in the responses to the same odors of both the mitral/tufted cells, as well as of piriform cortex neurons. When representing the M/T and PCx responses within a lower-dimensional space, the apparent drift is more prominent in the PCx, while the M/T responses appear in comparison more stable. The authors further note that introducing spike-time dependent plasticity (STDP) at bulb synapses involving abGCs slows down the drift in the PCx representations, and further link this to the observation that repeated exposure to the same odorant slows down drift in the piriform cortex.
 
 The model is clearly explained and relies on several assumptions and observations:
 
 (1) Random projections of MTC from the olfactory bulb to the piriform cortex, random intra-piriform connectivity, and random piriform to bulb connectivity.
 
 (2) Higher dimensionality of piriform cortex representations compared to M/T responses, which enables superior decoding of odor identity in the piriform cortex.
 
 (3) Spike time-dependent plasticity (STDP) at synapses involving the abGCs.
 
 The authors address an open topical problem, and the model is elegant in its simplicity. I have, however, several major concerns with the hypotheses underlying the model and with its biological plausibility.
 
 Concerns:
 
 (1) In their model, the authors propose that MTC remain stable at the population level, despite changes in individual MTC responses.
 
 The authors cite several experimental studies to support their claims that individual MTC responses to the same odors change (some increase, some decrease) across days. Interpreting the results of these studies must, however, take into account the variability of M/T responses across odor presentation repeats within the same session vs. across sessions. In the Shani-Narkiss et al., Frontiers in Neural Circuits, 2023 study referenced, a large fraction of the variability across days in M/T responses is also observed across repeats to the same odorant in the same session (Shani-Narkiss et al., Figure 4), while the authors have M/T responses in the same session that are highly reproducible. This is an important point to consider and address, since it constrains how much of the variability in M/T responses can be attributed to adult neurogenesis in the olfactory bulb versus to other networks' inhibitory mechanisms, which do not rely on neurogenesis. In the authors' model, the variability in M/T responses observed across days emerges as a result of adult-born neurogenesis, which does not need to be the main source of variability observed in imaging experiments (Shani-Narkiss et al., Figure 4).
 
 Another study (Kato et al., Neuron, 2012, Figure 4) reported that mitral cell responses to odors experienced repeatedly across 7 days tend to sparsen and decrease in amplitude systematically, while mitral cell responses to the same odor on day 1 vs. day 7 when the odor is not presented repeatedly in between seem less affected (although the authors also reported a decrease in the CI for this condition). As such, Kato et al. mostly report decreases in mitral cell odor responses with repeated odor exposure at both the individual and population level, and not so much increases and decreases in the individual mitral cell responses, and stability at the population level.
 
 (2) In Figure 1, a set of GCs is killed off, and new GCs are integrated in the network as abGC. Following the elimination of 10% of GCs in the network, new cells are added and randomly assigned synaptic weights between these abGCs and MTC, GCs, SACs, and top-down projections from PCx. This is done for 11 days, during which time all GCs have gone through adult neurogenesis.
 
 Is the authors' assumption here that across the 11 days, all GCs are being replaced? This seems to depart from the known biology of the olfactory bulb granule cells, i.e., GCs survive for a large fraction of the animal's life.
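 Whether every GC is replaced within 11 days depends on how the 10% are chosen each day, which is not specified in this review. The sketch below works through two assumed sampling schemes; both function names and both schemes are illustrative assumptions, not a description of the authors' implementation.

```python
def fraction_never_replaced(daily_fraction: float, days: int) -> float:
    """Expected fraction of original cells still present after `days`,
    assuming each day's replaced cells are drawn at random from ALL cells
    (so a cell can be replaced more than once)."""
    return (1.0 - daily_fraction) ** days


def days_to_full_turnover(n_cells: int, n_replaced_per_day: int) -> int:
    """Days needed to replace every original cell, assuming each day's
    replaced cells are drawn only from cells NOT yet replaced."""
    return -(-n_cells // n_replaced_per_day)  # ceiling division on integers


if __name__ == "__main__":
    # Scheme A: random draw from all cells, 10% per day, 11 days.
    print(fraction_never_replaced(0.10, 11))
    # Scheme B: draw only from not-yet-replaced cells, 10 of 100 per day.
    print(days_to_full_turnover(100, 10))
```

 Under the first assumption roughly a third of the original cells would survive the 11 days, whereas under the second the whole population turns over in 10 days, so the two readings lead to quite different statements about complete replacement.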
 
 (3) The authors' model relies on several key assumptions: random projections of MTC from the olfactory bulb to the piriform cortex, random intra-piriform connectivity, and random piriform to bulb connectivity. These assumptions are not necessarily accurate, as recent work revealed structure in the projections from the olfactory bulb to the piriform cortex and structure within the piriform cortex connectivity itself (Fink et al., bioRxiv, 2025; Chae et al., Cell, 2022; Zeppilli et al., eLife, 2021).
 
 How do the results of the model relating adult neurogenesis in the bulb to drift in the piriform cortex representations change when considering an alternative scenario in which the olfactory bulb to piriform and intra-piriform connectivity is not fully distributed and indistinguishable from random, but rather is structured?
 
 (4) I didn't understand the logic of the low-dimensional space analysis for M/T cells and piriform cortex neurons (Figures 2 & 3). In the authors' model, the full-ensemble M/T responses are reorganized over time, presumably due to the adult-born neurogenesis. Analyzing a lower-dimensional projection of the ensemble trajectories reveals a lower degree of re-organization. This is the same for the piriform cortex, but relatively, the piriform ensembles displayed in a low-dimensional embedding appear to drift more compared to the M/T ensembles.
 
 This analysis triggers a few questions: which representation is relevant for the brain function - the high or the low-dimensional projection? What fraction of response variance is included in the low-dimensional space analysis? How did the authors decide the low-dimensional cut-off? Why does STDP cause more drift in piriform cortex ensembles vs. M/T ensembles? Is this because of the assumed higher dimensionality of the piriform cortex representations compared to the mitral cells?
 
 (5) Could the authors comment on whether STDP at abGC synapses and its impact on decreasing drift represent a new insight, and also put it into context? Several studies (e.g., Lledo, Murthy, Komiyama groups) reported that abGCs integrate into the network in an activity-dependent manner, and not randomly, and as such stabilize the active neuronal responses, which is consistent with the authors' report.
 
 Relatedly, I couldn't find in the manuscript which synapses involving abGCs the authors focus on, or what the relative contribution is of the various plastic synapses shown in the cartoon in Figure 4 A1 (circles and triangles).
 
 (6) The study would be strengthened, in my opinion, by including specific testable predictions that the authors' models make, which can be further food for thought for experimentalists. How does suppression of adult-born neurogenesis in the OB impact the stability of mitral cell odor responses? How about piriform cortex ensembles?
 
 Review 1
3. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The authors address a critical problem in olfactory coding. It has long been known that adult neurogenesis, specifically in the form of adult-born granule cells that embed into the existing inhibitory networks on the olfactory bulb, can potentially alter the responses of Mitral/Tufted neurons that project activity to the Piriform Cortex and to other areas of the brain. Fundamentally, it would seem that these granule cells could alter the stability of neural codes in the OB over time. The authors develop a spiking network model to explore how stability can be achieved both in the OB over time and in the PC, which receives inputs. The model recapitulates published activity recordings of M/T cells and shows how activity in different M/T cells from the same glomerulus shifts over time in ways that, in spite of the shift, preserve population/glomerular level codes. However, these different M/T cells fan out onto different pyramidal cells of the PC, which gives rise to instability at that level. STDP, then, is necessary to maintain stability at the PC level as long as odor environments remain constant. These results may also apply to a similar neurogenesis-based change in the Dentate Gyrus, which generates instability in CA1/3 regions of the hippocampus.
 
 Strengths:
 
 A robust network model that untangles important, seemingly contradictory mechanisms that underlie olfactory coding.
 
 Weaknesses:
 
 The work is a significant contribution to understanding olfactory coding. But the manuscript would benefit from a brief discussion of why neurogenesis occurs in the first place - e.g., injury, ongoing needs for plasticity, and adapting to turnover of ORNs. There is literature on this topic. It seems counterintuitive to have a process in the MOB (and for that matter in the DG) that potentially disrupts the ability to generate stable codes both in the MOB and PC, and in particular a disruption that requires two different mechanisms - multiple M/T cells per glomerulus in the MOB and STDP in the PC - to counteract.
 
 Given that neurogenesis has an important function, and a mechanism is in place to compensate for it in the MOB, why would it then be disrupted in fan-out projections to the PC? The answer may lie in the need for fan-out projections so that pyramidal neurons in the PC can combinatorially represent many different inputs from the MOB. So something like STDP would be needed to maintain stability in the face of the need for this coding strategy.
 
 This kind of discussion, or something like it, would help readers understand why these mechanisms occur in the first place. It is interesting that PC stability requires that odor environments be stable, and that this stability drives PC representational stability. This result suggests experimental work to test this hypothesis. As such, it is a novel outcome of the research.
 
 Review 2
4. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary
 
 The authors set out to explore the potential relationship between adult neurogenesis of inhibitory granule cells in the olfactory bulb and cumulative changes over days in odor-evoked spiking activity (representational drift) in the olfactory stream. They developed a richly detailed spiking neuronal network model based on Izhikevich (2003), allowing them to capture the diversity of spiking behaviors of multiple neuron types within the olfactory system. This model recapitulates the circuit organization of both the main olfactory bulb (MOB) and the piriform cortex (PCx), including connections between the two (both feedforward and corticofugal). Adult neurogenesis was captured by shuffling the weights of the model's granule cells, preserving the distribution of synaptic weights. Shuffling of granule cell connectivity resulted in cumulative changes in stimulus-evoked spiking of the model's M/T cells. Individual M/T cell tuning changed with time, and ensemble correlations dropped sharply over the temporal interval examined (long enough that almost all granule cells in the model had shuffled their weights). Interestingly, these changes in responsiveness did not disrupt low-dimensional stability of olfactory representations: when projected into a low-dimensional subspace, population vector correlations in this subspace remained elevated across the temporal interval examined. Importantly, in the model's downstream piriform layer, this was not the case. There, shuffled GC connectivity in the bulb resulted in a complete shift in piriform odor coding, including for low-dimensional projections. This is in contrast to what the model exhibited in the M/T input layer. Interestingly, these changes in PCx extended to the geometrical structure of the odor representations themselves. Finally, the authors examined the effect of experience on representational drift. Using an STDP rule, they allowed the inputs to and outputs from adult-born granule cells to change during repeated presentations of the same odor. This stabilized stimulus-evoked activity in the model's piriform layer.
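 To make the weight-shuffling idea concrete, here is a minimal sketch of one way to model turnover while preserving the distribution of synaptic weights. It assumes that each "replaced" granule cell keeps its own set of weight values but has them reassigned across its targets by a random permutation. That scheme, the function name, and the matrix layout (one row per granule cell) are assumptions for illustration and are not taken from the manuscript.

```python
import random


def shuffle_granule_cell_weights(weights, fraction, rng):
    """Return (new_weights, replaced_indices).

    `weights` is a list of rows, one row of synaptic weights per granule cell.
    A random `fraction` of cells is chosen, and each chosen cell's weights are
    permuted across its targets. Because values are only rearranged, the
    overall weight distribution is unchanged (assumed turnover scheme).
    """
    n_cells = len(weights)
    n_replace = round(fraction * n_cells)
    replaced = rng.sample(range(n_cells), n_replace)
    new_weights = [row[:] for row in weights]  # copy so the input is untouched
    for i in replaced:
        rng.shuffle(new_weights[i])
    return new_weights, replaced


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical network: 20 granule cells, 5 targets each.
    w = [[rng.random() for _ in range(5)] for _ in range(20)]
    new_w, replaced = shuffle_granule_cell_weights(w, 0.10, rng)
    print("cells replaced:", sorted(replaced))
    # The multiset of weights is identical before and after shuffling.
    print(sorted(x for r in w for x in r) == sorted(x for r in new_w for x in r))
```

 Applying the function repeatedly, once per simulated day, gives cumulative rewiring of the kind described above while the weight histogram stays fixed.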
 
 Strengths
 
 This paper suggests a link between adult neurogenesis in the olfactory bulb and representational drift in the piriform cortex. Using an elegant spiking network that faithfully recapitulates the basic physiological properties of the olfactory stream, the authors tackle a question of longstanding interest in a creative and interesting manner. As a purely theoretical study of drift, this paper presents important insights: synaptic turnover of recurrent inhibitory input can destabilize stimulus-evoked activity, but only to a degree, as representations in the bulb (the model's recurrent input layer) retain their basic geometrical form. However, this destabilized input results in profound drift in the model's second (piriform) layer, where both the tuning of individual neurons and the layer's overall functional geometry are restructured. This is a useful and important idea in the drift field, and to my knowledge, it is novel. The bulb is not the only setting where inhibitory synapses exhibit turnover (whether through neurogenesis or synaptic dynamics), and so this exploration of the consequences of such plasticity on drift is valuable. The authors also elegantly explore a potential mechanism to stabilize representations through experience, using an STDP rule specific to the inhibitory neurons in the input layer. This has an interesting parallel with other recent theoretical work on drift in the piriform (Morales et al., 2025 PNAS), in which STDP in the piriform layer was also shown to stabilize stimulus representations there. It is fascinating to see that this same rule also stabilizes piriform representations when implemented in the bulb's granule cells.
 
 The authors also provide a thoughtful discussion regarding the differential roles of mitral and tufted cells in drift in piriform and AON and the potential roles of neurogenesis in archicortex.
 
 In general, this paper puts an important and much-needed spotlight on the role of neurogenesis and inhibitory plasticity in drift. In this light, it is a valuable and exciting contribution to the drift conversation.
 
 Weaknesses
 
 I have one major, general concern that I think must be addressed to permit proper interpretation of the results.
 
 I worry that the authors' model may confuse thinking on drift in the olfactory system, because of differences in the behavior of their model from known features of the olfactory bulb. In their model, the tuning of individual bulbar neurons drifts over time. This is inconsistent with the experimental literature on the stability of odor-evoked activity in the olfactory bulb.
 
 In a foundational paper, Bhalla & Bower (1997) recorded from mitral and tufted cells in the olfactory bulb of freely moving rats and measured the odor tuning of well-isolated single units across a five-day interval. They found that the tuning of a single cell was quite variable within a day, across trials, but that this variability did not increase with time. Indeed, their measure of response similarity was equivalent within and across days. In what now reads as a prescient anticipation of the drift phenomenon, Bhalla and Bower concluded: "it is clear, at least over five days, that the cell is bounded in how it can respond. If this were not the case, we would expect a continual increase in relative response variability over multiple days (the equivalent of response drift). Instead, the degree of variability in the responses of single cells is stable over the length of time we have recorded." Thus, even at the level of single cells, this early paper argues that the bulb is stable.
 
 This basic result has since been replicated by several groups. Kato et al. (2012) used chronic two-photon calcium imaging of mitral cells in awake, head-fixed mice and likewise found that, while odor responses could be modulated by recent experience (odor exposure leading to transient adaptation), the underlying tuning of individual cells remained stable. While experience altered mitral cell odor responses, those responses recovered to their original form at the level of the single neuron, maintaining tuning over extended periods (two months). More recently, the Mizrahi lab (Shani-Narkiss et al., 2023) extended chronic imaging to six months, reporting that single-cell odor tuning curves remained highly similar over this period. These studies reinforce Bhalla and Bower's original conclusion: despite trial-to-trial variability, olfactory bulb neurons maintain stable odor tuning across extended timescales, with plasticity emerging primarily in response to experience. (The Yamada et al., 2017 paper, which the authors here cite, is not an appropriate comparison. In Yamada, mice were exposed daily to odor. Therefore, the changes observed in Yamada are a function of odor experience, not of time alone. Yamada does not include data in which the tuning of bulb neurons is measured in the absence of intervening experience.)
 
 Therefore, a model that relies on instability in the tuning of bulbar neurons risks giving the incorrect impression that the bulb drifts over time. This difference should be explicitly addressed by the authors to avoid any potential confusion. Perhaps the best course of action would be to fit their model to Mizrahi's data, should this data be available, and see if, when constrained by empirical observation, the model still produces drift in piriform. If so, this would dramatically strengthen the paper. If this is not feasible, then I suggest being very explicit about this difference between the behavior of the model and what has been shown empirically. I appreciate that in the data there is modest drift (e.g., Shani-Narkiss' Figure 8C), but the changes reported there really are modest compared to what is exhibited by the model. A compromise would be to simply apply these metrics to the model and match the model's similarity to the Shani-Narkiss data. Then the authors could ask what effect this has on drift in piriform.
 
 The risk here is that people will conclude from this paper that drift in piriform may simply be inherited from instability in the bulb. This view is inconsistent with what has been documented empirically, and so great care is warranted to avoid conveying that impression to the community.
 
 Major comments (all related to the above point)
 
 (1) Lines 146-168: The authors find in their model that "individual M/T cells changed their responses to the same odor across days due to adult-neurogenesis, with some cells decreasing the firing rate responses (Fig.2A1 top) while other cells increased the magnitude of their responses (Fig. 2A2 bottom, Fig. S2)". They also report a significant decrease in the "full ensemble correlation" in their model over time. They claim that these changes in individual cell tuning are "similar to what has been observed by others using calcium imaging of M/T cell activity (Kato et al., 2012 and Yamada et al., 2017)" and that the decrease in full ensemble correlation is "consistent with experimental observations (Yamada et al., 2017)." However, the conditions of the Kato and Yamada experiments that demonstrate response change are not comparable here, as odors were presented daily to the animals in these experiments. Therefore, the changes in odor tuning found in the Kato and Yamada papers (Kato Figure 4D; Yamada Figure 3E) are a function of accumulated experience with odor. This distinction is crucial because experience-induced changes reflect an underlying learning process, whereas changes that simply accumulate over time are more consistent with drift. The conditions of their model are more similar to those employed in other experiments described in Kato et al. 2012 (Figure 6C) as well as Shani-Narkiss et al. (2023), in which bulb tuning is measured not as a function of intervening experience, but rather as a function of time (Kato's "recovery" experiment). What is found in Kato is that even across two months, the tuning of individual mitral cells is stable. What alters tuning is experience with odor, the core finding of both the Kato et al., 2012 paper and also Yamada et al., 2017. It is crucial that this is clarified in the text.
 
 (2) The authors show that in a reduced-space correlation metric, the correlation of low-dimensional trajectories "remained high across all days"..."consistent with a recent experimental study" (Shani-Narkiss et al., 2023). It is true that in the Shani-Narkiss paper, a consistent low-dimensional response is found across days (t-SNE analysis in Shani-Narkiss Figure 7B). However, the key difference between the Shani-Narkiss data and the results reported here is that Shani-Narkiss also observed relative stability in the native space (Shani-Narkiss Figure 8). They conclude that they "find a relatively stable response of single neurons to odors in either awake or anesthetized states and a relatively stable representation of odors by the MC population as a whole (Figures 6-8; Bhalla and Bower, 1997)." This should be better clarified in the text.
 
 (3) In the discussion, the authors state that "In the MOB, individual M/T cells exhibited variable odor responses akin to gain control, altering their firing rate magnitudes over time. This is consistent with earlier experimental studies using calcium-imaging." (L314-6). Again, I disagree that these data are consistent with what has been published thus far. Changes in gain would have resulted in increased variability across days in the Bhalla data. Moreover, changes in gain would be captured by Kato's change index ("To quantify the changes in mitral cell responses, we calculated the change index (CI) for each responsive mitral cell-odor pair on each trial (trial X) of a given day as (response on trial X - the initial response on day 1)/(response on trial X + the initial response on day 1). Thus, CI ranges from −1 to 1, where a value of −1 represents a complete loss of response, 1 represents the emergence of a new response, and 0 represents no change." Kato et al.). This index will capture changes in gain. However, as shown in Figure 4D (red traces), Figure 6C (Recovery and Odor set B during odor set A experience and vice versa), the change index is either zero or near zero. If the authors wish to claim that their model is consistent with these data, they should also compute Kato's change index for M/T odor-cell pairs in their model and show that it also remains at 0 over time, absent experience.
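 As a minimal sketch of the check requested in comment (3), the change index quoted above from Kato et al. could be computed for each model cell-odor pair along the following lines. The function names, the example gain factors, and the assumption of non-negative response amplitudes are illustrative choices for this sketch and are not taken from either paper.

```python
def change_index(response_trial_x: float, initial_response_day1: float) -> float:
    """Change index as quoted from Kato et al.: (R_x - R_1) / (R_x + R_1).

    Assumes non-negative response amplitudes (an assumption of this sketch),
    in which case the result lies in [-1, 1].
    """
    denominator = response_trial_x + initial_response_day1
    if denominator == 0:
        raise ValueError("change index is undefined when both responses are zero")
    return (response_trial_x - initial_response_day1) / denominator


def change_index_for_gain(gain: float) -> float:
    """Change index produced by a pure gain change, R_x = gain * R_1 with R_1 > 0."""
    initial_response = 1.0  # arbitrary positive reference; it cancels out
    return change_index(gain * initial_response, initial_response)


if __name__ == "__main__":
    # Hypothetical gain factors, only to show how gain maps onto the index.
    for gain in (0.5, 1.0, 2.0):
        print(f"gain={gain}: CI={change_index_for_gain(gain):.3f}")
```

 Under these assumptions a pure gain change by a factor g gives CI = (g - 1)/(g + 1), so any sustained change in gain moves the index away from zero, which is why the index is a suitable test of the gain-control claim.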
 
 Review 3
URL

biorxiv.org/content/10.1101/2024.07.02.601573v3
www.biorxiv.org www.biorxiv.org

Distinct cortical encoding of acoustic and electrical cochlear stimulation

4
1. Public_Reviews 13 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This valuable study compares auditory cortex responses to sounds and cochlear implant stimulation measured with surface electrode grids in rats. Beyond the reduced frequency resolution of cochlear implants observed previously, this study suggests key discrepancies between neuronal representations of cochlear stimulations and natural sounds. However, the evidence for this potentially interesting result is incomplete because there is a lack of evidence for the effectiveness of the comparison method. This study is of interest to researchers in the auditory neuroscience field and clinicians implementing treatments with cochlear implants.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  This manuscript addresses an important question: whether cortical population codes for cochlear-implant (CI) stimulation resemble those for natural acoustic input or constitute a qualitatively different representation. The authors record intracranial EEG (µECoG) responses to pure tones in normal-hearing rats and to single-channel CI pulses in bilaterally deafened, acutely implanted rats, analysing the data with ERP/high-gamma measures, tensor component analysis (TCA), and information-theoretic decoding. Across several readouts, the acoustic condition supports better single-trial stimulus classification than the CI condition. However, stronger decoding does not, on its own, establish that the acoustic responses instantiate a "richer" cortical code, and the evidence for orderly spatial organisation is not compelling for CI, and is also less evident than expected for normal-hearing, given prior knowledge. The overall narrative is interesting, but at present, the conclusions outpace the data because of statistical, methodological, and presentation issues.
  
  Strengths:
  
  The study poses a timely, clinically relevant question with clear implications for CI strategy. The analytical toolkit is appropriate: µECoG captures mesoscale patterns; TCA offers a transparent separation of spatial and temporal structure; and mutual-information decoding provides an interpretable measure of single-trial discriminability. Within-subject recordings in a subset of animals, in principle, help isolate modality effects from inter-animal variability. Where analyses are most direct, the acoustic condition yields higher single-trial decoding accuracy, which is a meaningful and clearly presented result.
  
  Weaknesses:
  
  Several limitations constrain how far the conclusions can be taken. Parts of the statistical treatment do not match the data structure: some comparisons mix paired and unpaired animals but are analysed as fully paired, raising concerns about misestimated uncertainty. Methodological reporting is incomplete in places; essential parameters for both acoustic and electrical stimulation, as well as objective verification of implantation and deafening, are not described with sufficient detail to support confident interpretation or replication. Figure-level clarity also undermines the message. In Figure 2, non-significant slopes for CI, repeated identification of a single "best channel," mismatched axes, and unclear distinctions between example and averaged panels make the assertion of spatial organisation unconvincing; importantly, the normal-hearing panels also do not display tonotopy as clearly as expected, which weakens the key contrast the paper seeks to establish. Finally, the decoding claims would be strengthened by simple internal controls, such as within-modality train/test splits and decoding on raw ERP/high-gamma features to demonstrate that poor cross-modal transfer reflects genuine differences in the underlying responses rather than limitations of the modelling pipeline.
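  The within-modality control suggested above could be sketched as a leave-one-out test on raw features from a single modality. The nearest-centroid classifier below is a deliberately simple stand-in chosen for this sketch, not the authors' PCA-LDA or TCA-LDA pipeline, and the feature vectors and labels in the example are hypothetical.

```python
from statistics import fmean


def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest (squared Euclidean) to `sample`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [fmean(column) for column in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(features, labels):
    """Within-modality decoding accuracy: train on all trials but one, test on the held-out trial."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


if __name__ == "__main__":
    # Hypothetical single-trial features (e.g. ERP amplitude on two channels).
    trials = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
    stimuli = ["A", "A", "A", "B", "B", "B"]
    print(leave_one_out_accuracy(trials, stimuli))
```

  If within-modality accuracy on held-out trials is high while cross-modal transfer is poor, the poor transfer is less likely to be a limitation of the pipeline itself.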
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This article reports measurements of iEEG signals on the rat auditory cortex during cochlear implant or sound stimulation in separate groups of rats. The observations indicate some spatial organization of the responses to cochlear implant stimulation, but one that is very different from the organization of responses to sound.
  
  Strengths:
  
  The study includes interesting analyses of the sound and cochlear implant representation structure based on decoders.
  
  Weaknesses:
  
  The observation that responses to cochlear implant stimulation are spatially organized is not new (e.g., Adenis et al. 2024).
  
  The claim that spatial and temporal dimensions contribute information about the sound is also not new; there is a large literature on this topic. Moreover, the results shown here are extremely weak. They show similar levels of information in the spatial and temporal dimensions, and no synergy between the two dimensions. This is, however, likely the consequence of high measurement noise leading to poor accuracy in the information estimates, as the authors state.
  
  The main claim of the study - the mismatch between cochlear implant and sound representation - is not supported. The responses to each modality are measured in different animals. The authors do not show that they actually can compare representations across animals (e.g., for the same sounds). Without this positive control, there is no reason to think that it is possible to decode from one animal with a decoder trained on another, and the negative result shown by the authors is therefore not surprising.
  
  Review 2
4. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  Using micro-electrocorticography (µECoG), Hight and colleagues studied how the auditory cortex as an ensemble responds to cochlear implant stimulation compared with classic pure tones. Taking advantage of a doubly implanted rat model (micro-ECoG and cochlear implant), they tracked and analyzed changes in the temporal and spatial aspects of the cortical evoked responses in both normal-hearing and cochlear-implanted animals. After establishing that single-trial responses were sufficient to encode the stimuli's properties, the authors explored several decoder architectures to study whether the cortex encodes each stimulus modality in a similar or different manner. They conclude that a) intracranial EEG evoked responses can be accurately recorded and did not differ between normal-hearing and cochlear-implanted rats; b) although coarsely spatially organized, CI-evoked responses had higher trial-by-trial variability than pure-tone responses; c) stimulus identity is independently represented by temporal and spatial aspects of cortical representations and can be accurately decoded by various means from single trials; and d) a decoder trained on pure tones cannot accurately decode CI stimulus identity.
  
  Strengths:
  
  The model combining micro-ECoG and cochlear implantation, and the methodology to extract both the Event Related Potentials (ERPs) and High-Gammas (HGs), are very well designed and appropriately analyzed. Likewise, the PCA-LDA and TCA-LDA are powerful tools that take full advantage of the information provided by the cortical ensembles.
  
  The overall structure of the paper, with a measured and exhaustive progression through each step in the evolution of the decoder, is commendable and easy to follow. The exploration of single-trial encoding and stimulus identity through temporal and spatial domains provides new avenues to characterize the cortical responses to CI stimulation and their central representation. The fact that single trials suffice to decode stimulus identity regardless of modality is noteworthy and of great interest. Although the authors confirm that iEEG remains difficult to translate to the clinic, the insights provided by the study confirm the potential benefit of using central decoders in clinical settings.
  
  Weaknesses:
  
  The conclusion of the paper, especially the concept of distinct cortical encoding for each modality, is unfortunately only partially supported by the results, as the authors did not adequately consider fundamental limitations of CI-related stimulation.
  
  First, the reviewer assumed that the authors stimulated in a Monopolar mode, which, albeit being clinically relevant, notoriously generates a high current spread in rodent models. Second, comparing the averaged BF maps for iEEG (Figure 2A, C), BFs ranged from 4 to 16kHz with a predominance of 4kHz BFs. The lack of BFs at higher frequencies hints at a potential location mismatch between the frequency range sampled at the level of the cortex (low to medium frequencies) and the frequency range covered by the CI inserted mostly in the first turn-and-a-half of the cochlea (high to medium frequencies). Looking at Figure 2F (and to some extent 2A), most of the CI electrodes elicited responses around the 4kHz regions, and averaged maps show a predominance of CI-3-4 across the cortex (Figure 2C, H) from areas with 4kHz BF to areas with 16kHz BF. It is doubtful that CI-3-4 are located near the 4kHz region based on Müller's work (1991) on the frequency representation in the rat cochlea.
  
  Taken together with the flat Pearson correlations, the decoder examples showing a strong ability to identify CI-4 and CI-3, and Figure 8D, E showing a strong prediction of 4kHz and 8kHz for all the CI electrodes when using a pure-tone-trained decoder, it is possible that current spread ended up indiscriminately stimulating higher turns of the cochlea, or even the modiolus, in a non-specific manner. This would greatly reduce (or smear) the place-coding/frequency resolution of each electrode, which in turn could explain the coarse topographic (or coarsely tonotopic, according to the manuscript) organization of the cortical responses. Thus, the conclusion that there are distinct encodings for each modality is biased, as it might not account for monopolar smearing. To that end, and since this is the study's main message and title, the study would have benefited from a subgroup of animals stimulated in bipolar mode (or with any focused strategy, since these reduce current spread) to compare the spatial organization of iEEG responses and the performance of the different decoders, rule out current spread, and strengthen the conclusion.
  
  Nevertheless, the reviewer wants to reiterate that the study proposed by Hight et al. is well constructed, relevant to the field, and that the overall proposal of improving patient performances and helping their adaptation in the first months of CI use by studying central responses should be pursued as it might help establish new guidelines or create new clinical tools.
  
  Review 3
URL

biorxiv.org/content/10.1101/2025.08.01.668170v1
www.biorxiv.org www.biorxiv.org

Frequency and Laminar Profile of Feature Specific Visual Activity Revealed by Interleaved EEG-fMRI

4
1. Public_Reviews 13 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This important study uses simultaneous EEG and fMRI recordings to shed light on the relationship between alpha and gamma oscillations and specific cortical layers. The sophisticated methodology provides solid evidence for correlations between oscillatory power and the strength and contents of fMRI signals in different cortical layers, though some caveats remain. This paper will be of interest to neuroscientists studying the role and mechanisms of alpha and gamma oscillations.
 
 Summary
2. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 In this manuscript, Clausner and colleagues use simultaneous EEG and fMRI recordings to clarify how visual brain rhythms emerge across layers of early visual cortex. They report that gamma activity correlates positively with feature-specific fMRI signals in superficial and deep layers. By contrast, alpha activity generally correlated negatively with fMRI signals, with two higher frequencies within the alpha reflecting feature-specific fMRI signals. This feature-specific alpha code indicates an active role of alpha oscillations in visual feature coding, providing compelling evidence that the functions of alpha oscillations go beyond cortical idling or feature-unspecific suppression.
 
 The study is very interesting and timely. Methodologically, it is state-of-the-art. The findings on a more active role of alpha activity that goes beyond the classical idling or suppression accounts are in line with recent findings and theories. In sum, this paper makes a very nice contribution. I still have a few comments that I outline below, regarding the data visualization, some methodological aspects, and a couple of theoretical points.
 
 (1) The authors put a lot of effort into the figure design. For instance, I really like Figure 1, which conveys a lot of information in a nice way. Figures 3 and 4, however, seem overengineered, and it takes a lot of time to distill the contents from them. The fact that they have a supplementary figure explaining the composition of these figures already indicates that the authors realized this is not particularly intuitive. First of all, the ordering of the conditions is not really intuitive. Second, the indication of significance through saturation does not really work; I have a hard time discerning the more and less saturated colors. And finally, the white dots do not really help either. I don't fully understand why they are placed where they are placed (e.g., in Figure 3). My suggestion would be to get rid of one of the factors (I think the voxel selection threshold could go: the authors could run with one of the stricter ones, and the rest could go into the supplement?) and then turn this into a few line plots. That would be so much easier to digest.
 
 (2) The division between high- and low-frequency alpha in the feature-specific signal correspondence is very interesting. I am wondering whether there is an opposite effect in the feature-unspecific signal correspondence. Would the high-frequency alpha show less of a feature-unspecific correlation with the BOLD?
 
 (3) In the discussion (line 330 onwards), the authors mention that low-frequency alpha is predominantly related to superficial layers, referencing Figure 4A. I have a hard time appreciating this pattern there. Can the authors provide some more information on where to look?
 
 (4) How did the authors deal with the signal-to-noise ratio (SNR) across layers, where the presence of larger draining veins typically increases BOLD (and thereby SNR) in superficial layers? This may explain the pattern of feature-unspecific effects in the alpha (Figure 3). Can the authors perform some type of SNR estimate (e.g., split-half reliability of voxel activations or similar) across layers to check whether SNR plays a role in this general pattern?
 
 (5) The GLM used for modelling the fMRI data included lots of regressors, and the scanning was intermittent. How much data was available in the end for sensibly estimating the baseline? This was not really clear to me from the methods (or I might have missed it). This seems relevant here, as the sign of the beta estimates plays a major role in interpreting the results here.
 
 (6) Some recent research suggests that gamma activity, much in contrast to the prevailing view of the mechanism for feedforward information propagation, relates to the feedback process (e.g., Vinck et al., 2025, TiCS). This view kind of fits with the localization of gamma to the deep layer here?
 
 (7) Another recent review (Stecher et al., 2025, TiNS) discusses feature-specific codes in visual alpha rhythms quite a bit, and it might be worth discussing how your results align with the results reported there.
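 For comment (4), one possible form of the suggested split-half reliability estimate is sketched below. It assumes that, for each layer, single-trial (or single-run) activation estimates are available as one list of voxel values per trial; the odd/even split, the function names, and the layer labels are assumptions of this sketch rather than anything specified in the manuscript.

```python
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mean_x, mean_y = fmean(x), fmean(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    ss_x = sum(v * v for v in dx)
    ss_y = sum(v * v for v in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant pattern")
    return sum(a * b for a, b in zip(dx, dy)) / (ss_x * ss_y) ** 0.5


def split_half_reliability(trials):
    """Correlate the mean voxel pattern of even-indexed trials with that of odd-indexed trials."""
    if len(trials) < 2:
        raise ValueError("need at least two trials to form two halves")
    half_a, half_b = trials[0::2], trials[1::2]
    pattern_a = [fmean(column) for column in zip(*half_a)]
    pattern_b = [fmean(column) for column in zip(*half_b)]
    return pearson_r(pattern_a, pattern_b)


def reliability_by_layer(layer_trials):
    """Map each layer label to its split-half reliability."""
    return {layer: split_half_reliability(trials) for layer, trials in layer_trials.items()}
```

 Comparing the resulting values across superficial, middle, and deep layers would indicate whether reliability differences track the layer pattern of the feature-unspecific effects.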
 
 Review 1
3. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 The authors address a long-standing controversy regarding the functional role of neural oscillations in cortical computations and layer-specific signalling. Several studies have implicated gamma oscillations in bottom-up processing, while lower-frequency oscillations have been associated with top-down signalling. Therefore, the question the authors investigate is both timely and theoretically relevant, contributing to our understanding of feedforward and feedback communication in the brain. This paper presents a novel and complicated data acquisition technique, the application of simultaneous EEG and fMRI, to benefit from both temporal and spatial resolution. A sophisticated data analysis method was executed in order to understand the underlying neural activity during a visual oddball task. Figures are well-designed and appropriately represent the results, which seem to support the overall conclusions. However, some of the claims (particularly those regarding the contribution of gamma oscillations) feel somewhat overstated: the results do offer some significant evidence, but most of the effects look more like suggestive trends. Nonetheless, the paper is well-written, addresses a relevant and timely research question, introduces a novel and elegant analysis approach, and presents interesting findings. Further investigation will be important to strengthen and expand upon these insights.
 
 One of the main strengths of the paper lies in the use of a well-established and straightforward experimental paradigm (the visual oddball task). As a result, the behavioural effects reported were largely expected and reassuring to see replicated. The acquisition technique used is very novel, and while this may introduce challenges for data analysis, the authors appear to have addressed these appropriately.
 
 Later findings are very interesting, and mainly in line with our current understanding of feedback and feedforward signalling. However, an explanation of the layer weight calculation is missing from the Results. While the calculation is described in the Methods, it would help to briefly explain in the Results how these weights are calculated, so that the reader can better follow what is being interpreted.
 
 Line 104 states there is one virtual channel per hemisphere for low and high frequencies. It may be helpful to include the number of channels (n=4) in the results section, as specified in the methods. Also, this raises the question of whether a single virtual channel (i.e., voxel) provides sufficient information for reproducibility.
 
 One area that would benefit from further clarification is the interpretation of gamma oscillations. The evidence for gamma involvement in the observed effects appears somewhat limited. For example, no significant gamma-related clusters were found for the feature-unspecific BOLD signal (Figure 2). Significant effects emerged only when the analysis was restricted to positively responding voxels, and even then, only for the contrast between EEG-coherent and EEG-incoherent conditions in the feature-specific BOLD response. It remains unclear how to interpret this selective emergence of gamma-related effects. Given previous literature linking gamma to feedforward processing, one might expect more robust involvement in broader, feature-unspecific contrasts. The current discussion presents the gamma-related findings with some confidence, and the manuscript would benefit from a more nuanced reflection on why these effects may not have appeared more broadly. The explanation provided in line 230, that restricting the analysis to positively responding voxels may have increased the SNR, is reasonable, but it may not fully account for the absence of gamma effects in V1's feature-unspecific response. Including the actual beta values from Figure 4 in the legend or main text would also help readers better assess the strength and specificity of the reported effects.
 
 To relate the behavioural findings to the underlying neural activity, could the authors test on a trial-by-trial basis how behavioural performance relates to changes in the BOLD signal and oscillatory activity? Line 305 states that "Since behavioural performance in the present study was consistently high at 94% on average and participants were instructed to respond quickly to potential oddball stimuli, a higher alpha frequency might reflect a more successful stimulus encoding and hence faster and more accurate behavioural performance." Such an analysis might also help to relate the findings to the functional difference between lower and upper alpha.
 
 In Figure 4, the EEG alpha specificity plot shows relatively large error bars, and there is visible overlap between the lower and upper alpha in both congruent and incongruent conditions. While upper alpha shows a positive slope across conditions and lower alpha remains flat, the interaction appears to be driven by the change from congruent to incongruent in upper alpha. It is worth clarifying whether the simple effects (e.g., lower vs upper within each condition) were tested, given the visual similarity at the incongruent condition. Overall, the significant interaction (p < 0.001, FDR-corrected) is consistent with diverging trends, but a breakdown of simple effects would help interpret the result more clearly. Was there a significant difference between lower and upper alpha in congruent or incongruent conditions?
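 As an illustration of the simple-effects breakdown asked for here, a paired comparison of lower versus upper alpha within one condition could take the following form. This is a minimal sketch assuming one specificity value per participant for each alpha sub-band; the paired t statistic is an assumption of this sketch and not necessarily the test used in the manuscript, and the example values are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def paired_t_statistic(upper_alpha, lower_alpha):
    """Return (t, degrees of freedom) for paired per-participant values.

    t = mean(d) / (sd(d) / sqrt(n)), where d is the within-participant difference.
    """
    if len(upper_alpha) != len(lower_alpha):
        raise ValueError("paired samples must have the same length")
    differences = [u - l for u, l in zip(upper_alpha, lower_alpha)]
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two participants")
    sd = stdev(differences)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    return fmean(differences) / (sd / sqrt(n)), n - 1


if __name__ == "__main__":
    # Hypothetical specificity values for three participants in one condition.
    t_value, dof = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
    print(t_value, dof)
```

 Running this once for the congruent and once for the incongruent condition, with an appropriate correction for the two tests, would show whether the sub-bands differ within each condition or only in their slopes.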
 
 Overall, this study provides a valuable contribution to the literature on oscillatory dynamics and laminar fMRI, though some interpretations would benefit from further clarification or qualification.
 
 Review 2
4. Public_Reviews 13 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 Clausner et al. investigate the relationship between cortical oscillations in the alpha and gamma bands and the feature-specific and feature-unspecific BOLD signals across cortical layers. Using a well-designed stimulus and GLM, they show a method by which different BOLD signals can be differentiated and investigated alongside multiple cortical oscillatory frequencies. In addition to the previously reported positive relationship between gamma and BOLD signals in superficial layers, they show a relationship between gamma and feature-specific BOLD in the deeper layers. Alpha-band power is shown to have a negative relationship with the negative BOLD response for both feature-specific and feature-unspecific contrasts. When separated into lower (8-10Hz) and upper (11-13Hz) alpha oscillations, they show that higher frequency alpha showed a significantly stronger negative relationship with congruency, and can therefore be interpreted as more feature-specific than lower frequency alpha.
 
 Strengths:
 
 The use of interleaved EEG-fMRI has provided a rich dataset that can be used to evaluate the relationship of cortical layer BOLD signals with multiple EEG frequencies. The EEG data were of sufficient quality to see the modulation of both alpha-band and gamma-band oscillations in the group mean VE-channel TFS. The good EEG data quality is backed up with a highly technical analysis pipeline that ultimately enables the interpretation of the cortical layer relationship of the BOLD signal with a range of frequencies in the alpha and gamma bands. The stimulus design allowed for the generation of multiple contrasts for the BOLD signal and the alpha/gamma oscillations in the GLM analysis. Feature-specific and unspecific BOLD contrasts are used with congruently or incongruently selected EEG power regressors to delineate between local and global alpha modulations. A transparent approach is used for the selection of voxels contributing to the final layer profiles, for which statistical analysis is comprehensive but uses an alternative statistical test, which I have not seen in previous layer-fMRI literature.
 
 A significant negative relationship between alpha-band power and the BOLD signal was seen in congruently (EEGco) selected voxels (predominantly in superficial layers) and in feature-contrast (EEGco-inco) selected voxels (superficial and deep layers). When separated into lower (8-10Hz) and upper (11-13Hz) alpha oscillations, they show that higher frequency alpha showed a significantly stronger negative relationship with congruency than lower frequency alpha. This is interpreted as a frequency dissociation in the alpha-BOLD relationship, with upper frequency alpha being feature-specific and lower frequency alpha corresponding to general modulation. These results are a valuable addition to the current literature and improve our current understanding of the role of cortical alpha oscillations.
 
 There is not much work in the literature on the relationship between alpha power and the negative BOLD response (NBR), so the data provided here are particularly valuable. The negative relationship between the NBR and alpha power shown here suggests that there is a reduction in alpha power, linked to locally reduced BOLD activity, which is in line with the previously hypothesized inhibitory nature of alpha.
 
 Weaknesses:
 
 It is not entirely clear how the draining vein effect seen in GE-BOLD layer-fMRI data has been accounted for in the analysis. For the contrast of congruent-incongruent, it is assumed that the underlying draining effect will be the same for both conditions, and so should be cancelled out. However, for the other contrasts, it is unclear how the final layer profiles aren't confounded by the bias in BOLD signal towards the superficial layers. Many of the profiles in Figure 3 and Figure 4A show an increased negative correlation between alpha power and the BOLD signal towards the superficial layers.
 
 When investigating whether low alpha (8-10 Hz) and high alpha (11-13 Hz) are two different sources of alpha, it would be beneficial to show whether this effect is only seen at the group level or can also be seen in individual subjects. Inter-subject variability in peak alpha frequency could result in some subjects having a single low alpha peak and others a single high alpha peak, rather than two peaks from different sources.
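 One way to make the single-subject check concrete is to count spectral peaks within the alpha band for each subject. The sketch below assumes each subject's power spectrum is available as parallel lists of frequencies and power values; the simple local-maximum rule, the band limits passed as a default argument, and the example spectra are assumptions of this sketch, and a real analysis would likely first remove the aperiodic (1/f) component.

```python
def alpha_peaks(freqs, power, band=(8.0, 13.0)):
    """Return the frequencies of local power maxima that fall inside `band`."""
    peaks = []
    for i in range(1, len(freqs) - 1):
        in_band = band[0] <= freqs[i] <= band[1]
        is_local_max = power[i] > power[i - 1] and power[i] > power[i + 1]
        if in_band and is_local_max:
            peaks.append(freqs[i])
    return peaks


def classify_subject(freqs, power):
    """Label a subject by how many alpha peaks the spectrum shows."""
    n_peaks = len(alpha_peaks(freqs, power))
    if n_peaks == 0:
        return "no clear peak"
    return "single peak" if n_peaks == 1 else "multiple peaks"


if __name__ == "__main__":
    # Hypothetical spectra sampled at 1 Hz resolution.
    freqs = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]
    print(classify_subject(freqs, [1.0, 2.0, 5.0, 2.0, 3.0, 6.0, 2.0, 1.0]))
    print(classify_subject(freqs, [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0]))
```

 If most subjects show a single peak whose frequency varies between subjects, the group-level dissociation could reflect inter-subject variability rather than two sources within each subject.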
 
 The figure layout used to present the main findings throughout is an innovative way to present so much information, but it is difficult to decipher the main findings described in the text. The readability would be improved if the example (Appendix 0 - Figure 1) in the supplementary material is included as a second panel inside Figure 3, or, if this is not possible, the example (Appendix 0 - Figure 1) should be clearly referred to in the figure caption.
 
 Review 3

URL

biorxiv.org/content/10.1101/2024.07.31.605816v2
www.biorxiv.org www.biorxiv.org

Allocentric and egocentric cues constitute an internal reference frame for real-world visual search

4
1. Public_Reviews 13 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important study shows that visual search for upright and rotated objects is affected by rotating participants in a VR and gravitational reference frame. However, the evidence supporting this conclusion is incomplete, given the authors' use of normalized response time and the assumption that object recognition across rotations requires mental rotation.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The current study sought to understand which reference frames humans use when doing visual search in naturalistic conditions. To this end, they had participants do a visual search task in a VR environment while manipulating factors such as object orientation, body orientation, gravitational cues, and visual context (where the ground is). They generally found that all cues contributed to participants' performance, but visual context and gravitational cues impacted performance the most, suggesting that participants represent space in an allocentric reference frame during visual search.
  
  Strengths:
  
  The study is valuable in that it sheds light on which cues participants use during visual search. Moreover, I appreciate the use of VR and precise psychophysical predictions (e.g., slope vs. intercept) to dissociate between possible reference frames.
  
  Weaknesses:
  
  It's not clear what the implications of the study are beyond visual search. Moreover, I have some concerns about the interpretation of Experiment 1, which relies on an incorrect interpretation of mental rotation. Thus, most of the conclusions rely on Experiment 2, which has a small sample size (n = 10). Finally, the statistical analyses could be strengthened with measures of effect size and non-parametric statistics.
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This paper addresses an interesting issue: how is the search for a visual target affected by its orientation (and the viewer's) relative to other items in the scene and gravity? The paper describes a series of visual search tasks, using recognizable targets (e.g., a cat) positioned within a natural scene. Reaction times and accuracy at determining whether the target was present or absent, trial-to-trial, were measured as the target's orientation, that of the context, and of the viewer themselves (via rotation in a flight simulator) were manipulated. The paper concludes that search is substantially affected by these manipulations, primarily by the reference frame of gravity, then visual context, followed by the egocentric reference frame.
  
  Strengths:
  
  This work is on an interesting topic, and benefits from using natural stimuli in VR / flight simulator to change participants' POV and body position.
  
  Weaknesses:
  
  There are several areas of weakness that I feel should be addressed.
  
  (1) The literature review/introduction seems to be lacking in some areas. The authors, when contemplating the behavioral consequences of searching for a 'rotated' target, immediately frame the problem as one of rotation, per se (i.e., contrasting only rotation-based explanations; "what rotates and in which 'reference frame[s]' in order to allow for successful search?"). For a reader not already committed to this framing, many natural questions arise that are worth addressing.
  
  1a) Why do we need to appeal to rotation at all as opposed to, say, familiarity? A rotated cat is less familiar than a typically oriented one. This is a long-standing literature (e.g., Wang, Cavanagh, and Green (1994)), of course, with a lot to unpack.
  
  1b) What are the triggers for the 'corrective' rotation that presumably brings reference frames back into alignment? What if the rotation had not been so obvious (i.e., for a target that may not have a typical orientation, like a hand, or a ball, or a learned, nonsense object) or the background had not had such clear orientation (like a cluttered non-naturalistic background, or a naturalistic backdrop viewed from an unfamiliar POV (e.g., from above), or a naturalistic background in which not all of the elements were rotated)? What, ultimately, is rotated? The entire visual field? Does that mean that searching for multiple targets at different angles of rotation would interfere with one another?
  
  1c) Relatedly, what is the process by which the visual system comes to know the 'correct' rotation? (Or, alternatively, is 'triggered to realize' that there is a rotation in play?) Is this something that needs to be learned? Is it only learned developmentally, through exposure to gravity? Could it be learned in the context of an experiment that starts with unfamiliar stimuli?
  
  1d) Why the appeal to natural images? I appreciate any time a study can be moved from potentially too stripped-down laboratory conditions to more naturalistic ones, but is this necessary in the present case? Would the pattern of results have been different if these were typical laboratory 'visual search' displays of disconnected object arrays?
  
  1e) How should we reconcile rotation-based theories of 'rotated-object' search with visual search results from zero gravity environments (e.g., for a review, see Leone (1998))?
  
  1f) How should we reconcile the current manipulations with other viewpoint-perspective manipulations (e.g., Zhang & Pan (2022))?
  
  (2) The presentation/interpretation of results would benefit from more elaboration and justification.
  
  2a) All of the current interpretations rely on just the RT data. First, the RT results should also be presented in natural units (i.e., seconds/ms), not normalized. In addition, results should be shown as violin plots or something similar that captures the distribution - a lot of important information is lost when just presenting one 'average' dot across participants. More fundamentally, I think we need to have a better accounting for performance (percent correct or d') to help contextualize the RT results. We should at least be offered some visualization (Heitz, 2014) of the speed-accuracy trade-off (SAT) for each of the conditions. Following this, the authors should more critically evaluate how any substantial SAT trends could affect the interpretation of results.
  
  2b) Unless I am missing something, the interpretation of the pattern of results (both qualitatively and quantitatively in their 'relative weight' analysis) relies on how they draw their contrasts. For instance, the authors contrast the two 'gravitational' conditions (target 0 deg versus target 90 deg) as if this were a change in a single variable/factor. But there are other ways to understand these manipulations that would affect contrasts. For instance, if one considers whether the target was 'consistent' (i.e., typically oriented) with respect to the context, egocentric, and gravitational frames, then the 'gravitational 0 deg' condition is consistent with context, egocentric view, but inconsistent with gravity. And, the 'gravitational 90 deg' condition, then, is inconsistent with context, egocentric view, but consistent with gravity. Seen this way, this is not a change in one variable, but three. The same is true of the baseline 0 deg versus baseline 90 deg condition, where again we have a change in all three target-consistency variables. The 'one variable' manipulations then would be: 1) baseline 0 versus visual context 0 (i.e., a change only in the context variable); 2) baseline 0 versus egocentric 0 (a change only in the egocentric variable); and 3) baseline 0 versus gravitational 0 (a change only in the gravitational variable). Other contrasts (e.g., gravitational 90 versus context 90) would showcase a change in two variables (in this case, a change in both context and gravity). My larger point is, again, unless I am really missing something, that the choice of how to contrast the manipulations will affect the 'pattern' of results and thereby the interpretation. If the authors agree, this needs to be acknowledged, plausible alternative schemes discussed, and the ultimate choice of scheme defended as the most valid.
  
  2c) Even with this 'relative weight' interpretation, there are still some patterns of results that seem hard to account for. Primarily, the egocentric condition seems hard to account for under any scheme, and the authors need to spend more time discussing/reconciling those results.
  
  2d) Some results are just deeply counterintuitive, and so the reader will crave further discussion. Most saliently for me, based on the results of Experiment 2 (specifically, the fact that gravitational 90 had better performance than gravitational 0), designers of cockpits should have all gauges/displays rotate counter to the airplane so that they are always consistent with gravity, not the pilot. Is this indeed a fair implication of the results?
  
  2e) I really craved some 'control conditions' here to help frame the current results. In keeping with the rhetorical questions posed above in 1a/b/c/d, if/when the authors engage with revisions to this paper, I would encourage the inclusion of at least some new empirical results. For me, the most critical would be to repeat some core conditions, but with a symmetric target (e.g., a ball), since that would seem to be the only way (given the current design) to tease out nuisance confounding factors such as, say, the general effect of performing search while sideways. Put another way, the authors would have to assume here that search (non-normalized RTs and search performance) for a ball-target in the baseline condition would be identical to that in the gravitational condition.
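
As a rough illustration of the performance measure requested in 2a, the sketch below computes d' and the response criterion for a present/absent search task from hit and false-alarm rates. It is a minimal sketch under the standard equal-variance signal detection model, not the authors' analysis; the function names and the example rates are hypothetical.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the "z-transform" of a proportion).
_z = NormalDist().inv_cdf


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity under the equal-variance Gaussian model: z(H) - z(FA)."""
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate: float, false_alarm_rate: float) -> float:
    """Response bias c = -(z(H) + z(FA)) / 2; zero means no bias."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical per-condition rates, for illustration only (not data from the study).
example_rates = {
    "baseline_0deg": (0.90, 0.10),
    "baseline_90deg": (0.80, 0.15),
}

for condition, (hits, false_alarms) in example_rates.items():
    print(condition, round(d_prime(hits, false_alarms), 3),
          round(criterion(hits, false_alarms), 3))
```

Rates of exactly 0 or 1 have no finite z-score, so in practice they would need a correction before being passed to these functions. Plotting d' against mean RT per condition is one simple way to inspect a possible speed-accuracy trade-off.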
  
  Review 2
4. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  The study tested how people search for objects in natural scenes using virtual reality. Participants had to find targets among other objects, shown upright or tilted. The main results showed that upright objects were found faster and more accurately. When the scene or body was rotated, performance changed, showing that people use cues from the environment and gravity to guide search.
  
  The manuscript is clearly written and well designed, but there are some aspects related to methods and analyses that would benefit from stronger support.
  
  First, the sample size is not justified with a power analysis, nor is it explained how it was determined. This is an important point to ensure robustness and replicability.
  
  Second, the reaction time data were processed using different procedures, such as the use of the median to exclude outliers and an ad hoc cut-off of 50 ms. These choices are not sufficiently supported by a theoretical rationale, and could appear as post-hoc decisions.
  
  Third, the mixed-model analyses are overall well-conducted; however, the specification of the random structure deserves further consideration. The authors included random intercepts for participants and object categories, which is appropriate. However, they did not include random slopes (e.g., for orientation or set size), meaning that variability in these effects across participants was not modelled. This simplification can make the models more stable, but it departs from the maximal random structure recommended by Barr et al. (2013). The authors do not explicitly justify this choice, and a reviewer may question why participant-specific variability in orientation effects, for example, was not allowed. Given the modest sample sizes (20 in Experiment 1 and 10 in Experiment 2), convergence problems with more complex models are likely. Nonetheless, ignoring random slopes can, in principle, inflate Type I error rates, so this issue should at least be acknowledged and discussed.
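
To make the third point concrete, the two random-effects structures can be written out as follows. This is a schematic sketch only: the symbols (participant i, object category k, trial t, a single orientation predictor) are assumptions introduced here for illustration and are not the authors' exact model specification.

```latex
% Random intercepts only, for participants (u_{0i}) and object categories (w_{0k})
\mathrm{RT}_{ikt} = \beta_0 + \beta_1\,\mathrm{Orient}_{ikt} + u_{0i} + w_{0k} + \varepsilon_{ikt}

% Adding a by-participant random slope for orientation (u_{1i})
\mathrm{RT}_{ikt} = \beta_0 + (\beta_1 + u_{1i})\,\mathrm{Orient}_{ikt} + u_{0i} + w_{0k} + \varepsilon_{ikt},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim \mathcal{N}(\mathbf{0}, \Sigma_u)
```

In the first form, every participant is assumed to share the same orientation effect, so any between-participant variability in that effect is absorbed into the residual; the second form estimates that variability explicitly, at the cost of additional variance and covariance parameters.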
  
  Review 3

URL

biorxiv.org/content/10.1101/2025.04.14.648618v2
www.biorxiv.org www.biorxiv.org

Absence of Systematic Effects of Internalizing Psychopathology on Learning Under Uncertainty

3
1. Public_Reviews 13 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This study provides important results with regard to the ongoing debate of the relationship between internalizing psychopathology and learning under uncertainty. The methods and analyses are solid, and the results are backed by a large sample size, yet the study could still benefit from a more detailed discussion about the difference in experimental design and analysis compared to previous studies. If these concerns are addressed, this study would be of interest to researchers in clinical and computational psychiatry for the behavioral markers of psychopathological symptoms.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  The authors conducted a series of experiments using two established decision-making tasks to clarify the relationship between internalizing psychopathology (anxiety and depression) and adaptive learning in uncertain and volatile environments. While prior literature has reported links between internalizing symptoms - particularly trait anxiety - and maladaptive increases in learning rates or impaired adjustment of learning rates, findings have been inconsistent. To address this, the authors designed a comprehensive set of eight experiments that systematically varied task conditions. They also employed a bifactor analysis approach to more precisely capture the variance associated with internalizing symptoms across anxiety and depression. Across these experiments, they found no consistent relationship between internalizing symptoms and learning rates or task performance, concluding that this purported hallmark feature may be more subtle than previously assumed.
  
  Strengths:
  
  (1) A major strength of the paper lies in its impressive collection of eight experiments, which systematically manipulated task conditions such as outcome type, variability, volatility, and training. These were conducted both online and in laboratory settings. Given that trial conditions can drive or obscure observed effects, this careful, systematic approach enables a robust assessment of behavior. The consistency of findings across online and lab samples further strengthens the conclusions.
  
  (2) The analyses are impressively thorough, combining model-agnostic measures, extensive computational modeling (e.g., Bayesian, Rescorla-Wagner, Volatile Kalman Filter), and assessments of reliability. This rigor contributes meaningfully to broader methodological discussions in computational psychiatry, particularly concerning measurement reliability.
  
  (3) The study also employed two well-established, validated computational tasks: a game-based predictive inference task and a binary probabilistic reversal learning task. This choice ensures comparability with prior work and provides a valuable cross-paradigm perspective for examining learning processes.
  
  (4) I also appreciate the open availability of the analysis code that will contribute substantially to the field using similar tasks.
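
For readers less familiar with the modeling mentioned in point (2), the learning rate at issue can be illustrated with the basic Rescorla-Wagner update, in which the estimate moves toward each outcome by a fraction of the prediction error. This is a minimal sketch of the textbook rule, not the authors' fitted model; the function name and the example outcomes are hypothetical.

```python
def rescorla_wagner(outcomes, learning_rate, initial_estimate=0.0):
    """Return the estimate after each trial under a fixed learning rate.

    Update rule: estimate <- estimate + learning_rate * (outcome - estimate)
    """
    estimate = initial_estimate
    trajectory = []
    for outcome in outcomes:
        prediction_error = outcome - estimate
        estimate += learning_rate * prediction_error
        trajectory.append(estimate)
    return trajectory


# Hypothetical binary outcomes, for illustration only.
outcomes = [1, 1, 0, 1, 1]
print(rescorla_wagner(outcomes, learning_rate=0.5))
print(rescorla_wagner(outcomes, learning_rate=0.1))
```

A higher learning rate tracks recent outcomes more closely, which helps in volatile environments but makes the estimate noisier in stable ones; this is why learning-rate adjustment is the quantity of interest in these tasks.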
  
  Weaknesses:
  
  (1) While the overall sample size (N = 820 across eight experiments) is commendable, the number of participants per experiment is relatively modest, especially in light of the inherent variability in online testing and the typically small effect sizes in correlations with mental health traits (e.g., r = 0.1-0.2). The authors briefly acknowledge that any true effects are likely small; however, the rationale behind the sample sizes selected for each experiment is unclear. This is especially important given that previous studies using the predictive inference task (e.g., Seow & Gillan, 2020, N > 400; Loosen et al., 2024, N > 200) have reported non-significant associations between trait anxiety symptoms and learning rates.
  
  (2) The motivation for focusing on the predictive inference task is also somewhat puzzling, given that no cited study has reported associations between trait anxiety and parameters of this task. While this is mitigated by the inclusion of a probabilistic reversal learning task, which has a stronger track record in detecting such effects, the study misses an opportunity to examine whether individual differences in learning-related measures correlate across the two tasks, which could clarify whether they tap into shared constructs.
  
  (3) The parameterization of the tasks, particularly the use of high standard deviations (SDs) of 20 and 30 for outcome distributions and hazard rates of 0.1 and 0.16, warrants further justification. Are these hazard rates sufficiently distinct? Might the wide SDs reduce sensitivity to volatility changes? Prior studies of the circle version of this predictive inference task (e.g., Vaghi et al., 2019; Seow & Gillan, 2020; Marzuki et al., 2022; Loosen et al., 2024; Hoven et al., 2024) typically used SDs around 12. Indeed, the Supplementary Materials suggest that variability manipulations did not seem to substantially affect learning rates (Figure S5)-calling into question whether the task manipulations achieved their intended cognitive effects.
  
  (4) Relatedly, while the predictive inference task showed good reliability, the reversal learning task exhibited only "poor-to-moderate" reliability in its learning-rate estimates. Given that previous findings linking anxiety to learning rates have often relied on this task, these reliability issues raise concerns about the robustness and generalizability of conclusions drawn from it.
  
  (5) As the authors note, the study relies on a subclinical sample. This limits the generalizability of the findings to individuals with diagnosed disorders. A growing body of research suggests that relationships between cognition and symptomatology can differ meaningfully between general population samples and clinical groups. For example, Hoven et al. (2024) found differing results in the predictive inference task when comparing OCD patients, healthy controls, and high- vs. low-symptom subgroups.
  
  (6) Finally, the operationalization of internalizing symptoms in this study appears to focus on anxiety and depression. However, obsessive-compulsive disorder is also generally considered an internalizing disorder, which points to a gap in the literature the paper cites, particularly as numerous studies using the predictive inference task have examined OCD/compulsivity (e.g., Vaghi et al., 2019; Seow & Gillan, 2020; Marzuki et al., 2022; Loosen et al., 2024; Hoven et al., 2024), rather than trait anxiety per se.
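
To give a sense of scale for the sample-size concern in point (1), the sketch below approximates the number of participants needed to detect a correlation of a given size, using the Fisher z-transformation. It is a minimal sketch under assumed conventions (two-sided alpha of 0.05, 80% power, a single uncorrected test) and is not a reconstruction of any power analysis by the authors; the function name is hypothetical.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size to detect a Pearson correlation of size r.

    Uses the Fisher z approximation:
        n = ((z_{1 - alpha/2} + z_{power}) / atanh(r))^2 + 3
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # two-sided significance threshold
    z_power = z(power)          # quantile matching the desired power
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


for effect_size in (0.1, 0.2):
    print(effect_size, n_for_correlation(effect_size))
```

Under these assumptions, an effect of r = 0.1 requires several hundred participants in a single experiment, while r = 0.2 requires roughly a quarter as many, which is the context for asking how the per-experiment sample sizes were chosen.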
  
  Overall:
  
  Despite the named limitations, the authors have done very impressive work in rigorously examining the relationship between anxiety/internalizing symptoms and learning rates in commonly used decision-making tasks under uncertainty. Their conclusion is well supported by the consistency of their null findings across diverse task conditions, though its generalizability may be limited by some features of the task design and its sample. This study provides strong evidence that will guide future research, whether by shifting the focus to dysfunctions with larger effect sizes or by extending investigations to clinical populations.
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  In this work, the authors recruited a large sample of participants to complete two well-established paradigms: the predictive inference task and the volatile reversal learning task. With this dataset, they not only replicated several classical findings on uncertainty-based learning from previous research but also demonstrated that individual differences in learning behavior are not systematically associated with internalizing psychopathology. These results provide valuable large-scale evidence for this line of research.
  
  Strengths:
  
  (1) Use of two different tasks.
  
  (2) Recruitment of a large sample of participants.
  
  (3) Inclusion of multiple experiments with different conditions, demonstrating strong scientific rigor.
  
  Weaknesses:
  
  Below are questions rather than 'weaknesses':
  
  (1) This study uses a large human sample, which is a clear strength. However, was the study preregistered? It would also be useful to report a power analysis to justify the sample size.
  
  (2) Previous studies have tested two core hypotheses: (a) that internalizing psychopathology is associated with overall higher learning rates, and (b) that it is associated with learning rate adaptation. In the first experiment, the findings seem to disconfirm only the first hypothesis. I found it unclear how, in the predator task, participants were expected to adjust their learning rate to adapt to volatility. Could the authors clarify this point?
  
  (3) According to the Supplementary Information, Model 13 showed the best fit, yet the authors selected Model 12 due to the larger parameter variance in Model 13. What would the results of Model 13 look like? Furthermore, do Models 12 and 13 correspond to the optimal models identified by Gagne et al. (2020)? Please clarify.
  
  (4) In the Discussion, the authors addressed both task reliability and parameter reliability. However, the term reliability seems to be used differently in these two contexts. For example, good parameter recovery indicates strong reliability in one sense, but can we then directly equate this with parameter reliability? It would be helpful to define more precisely what is meant by reliability in each case.
  
  (5) The Discussion also raises the possibility that limited reliability may represent a broader challenge facing the interdisciplinary field of computational psychiatry. What, in the authors' view, are the key future directions for the field to mitigate this issue?
  
  Review 2

URL

biorxiv.org/content/10.1101/2025.05.12.653409v1
www.biorxiv.org www.biorxiv.org

MerQuaCo: a computational tool for quality control in image-based spatial transcriptomics

4
1. Public_Reviews 13 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This valuable study describes MerQuaCo, a computational and automatic quality control tool for spatial transcriptomics datasets. The authors have collected a remarkable number of tissues to construct the main algorithm. The compelling strength of the evidence is demonstrated through a combination of empirical observations, automated computational approaches, and validation against existing software packages. MerQuaCo will interest researchers who routinely perform spatial transcriptomic imaging (especially MERSCOPE), as it provides an imperfection detector and quality control measures for reliable and reproducible downstream analysis.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The authors present MerQuaCo, a computational tool that fills a critical gap in the field of spatial transcriptomics: the absence of standardized quality control (QC) tools for image-based datasets. Spatial transcriptomics is an emerging field where datasets are often imperfect, and current practices lack systematic methods to quantify and address these imperfections. MerQuaCo offers an objective and reproducible framework to evaluate issues like data loss, transcript detection variability, and efficiency differences across imaging planes.
  
  Strengths:
  
  (1) The study draws on an impressive dataset comprising 641 mouse brain sections collected on the Vizgen MERSCOPE platform over two years. This scale ensures that the documented imperfections are not isolated or anecdotal but represent systemic challenges in spatial transcriptomics. The variability observed across this large dataset underscores the importance of using sufficiently large sample sizes when benchmarking different image-based spatial technologies. Smaller datasets risk producing misleading results by over-representing unusually successful or unsuccessful experiments. This comprehensive dataset not only highlights systemic challenges in spatial transcriptomics but also provides a robust foundation for evaluating MerQuaCo's metrics. The study sets a valuable precedent for future quality assessment and benchmarking efforts as the field continues to evolve.
  
  (2) MerQuaCo introduces thoughtful metrics and filters that address a wide range of quality control needs. These include pixel classification, transcript density, and detection efficiency across both x-y axes (periodicity) and z-planes (p6/p0 ratio). The tool also effectively quantifies data loss due to dropped images, providing tangible metrics for researchers to evaluate and standardize their data. Additionally, the authors' decision to include examples of imperfections detectable by visual inspection but not flagged by MerQuaCo reflects a transparent and balanced assessment of the tool's current capabilities.
  
  Weaknesses:
  
  (1) The study focuses on cell-type label changes as the main downstream impact of imperfections. Broadening the scope to explore expression response changes of downstream analyses would offer a more complete picture of the biological consequences of these imperfections and enhance the utility of the tool.
  
  (2) While the manuscript identifies and quantifies imperfections effectively, it does not propose post-imaging data processing solutions to correct these issues, aside from the exclusion of problematic sections or transcript species. While this is understandable given the study is aimed at the highest quality atlas effort, many researchers don't need that level of quality to compare groups. It would be important to include discussion points as to how those cut-offs should be decided for a specific study.
  
  (3) Although the authors demonstrate the applicability of MerQuaCo on a large MERFISH dataset and a limited number of sections from other platforms, it would be helpful to describe the limits of its generalizability.
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors present MerQuaCo, a computational tool for quality control in image-based spatial transcriptomics, especially MERSCOPE. They assessed MerQuaCo on 641 slides produced at their institute in terms of the ratio of imperfection, transcript density, and variations of quality by different planes (x-axis).
  
  Strengths:
  
  This looks to be valuable work that can serve as a good guideline for quality control in future spatial transcriptomics studies. A well-controlled spatial transcriptomics dataset is also important for downstream analysis.
  
  Weaknesses:
  
  The results section needs to be more structured.
  
  Review 2
4. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  MerQuaCo is an open-source computational tool developed for quality control in image-based spatial transcriptomics data, with a primary focus on data generated by the Vizgen MERSCOPE platform. The authors analyzed a substantial dataset of 641 fresh-frozen adult mouse brain sections to identify and quantify common imperfections, aiming to replace manual quality assessment with an automated, objective approach, providing standardized data integrity measures for spatial transcriptomics experiments.
  
  Strengths:
  
  The manuscript's strengths lie in its timely utility, rigorous empirical validation, and practical contributions to methodology and biological discovery in spatial transcriptomics.
  
  Weaknesses:
  
  While MerQuaCo demonstrates utility in large datasets and cross-platform potential, its generalizability and validation require expansion, particularly for non-MERSCOPE platforms and real-world biological impact.
  
  Review 3

URL

biorxiv.org/content/10.1101/2024.12.04.626766v1
www.biorxiv.org www.biorxiv.org

MerQuaCo: a computational tool for quality control in image-based spatial transcriptomics

4
1. Public_Reviews 13 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This study provides a valuable contribution to spatial transcriptomics by introducing MerQuaCo, a computational tool for standardizing quality control in image-based spatial transcriptomics datasets. The tool addresses the lack of consensus in the field and provides robust metrics to identify and quantify common imperfections in datasets. The work is supported by an impressive dataset and compelling analyses, and will be of significant interest to researchers focused on data reproducibility and downstream analysis reliability in spatial transcriptomics.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  The authors present MerQuaCo, a computational tool that fills a critical gap in the field of spatial transcriptomics: the absence of standardized quality control (QC) tools for image-based datasets. Spatial transcriptomics is an emerging field where datasets are often imperfect, and current practices lack systematic methods to quantify and address these imperfections. MerQuaCo offers an objective and reproducible framework to evaluate issues like data loss, transcript detection variability, and efficiency differences across imaging planes.
  
  Strengths
  
  (1) The study draws on an impressive dataset comprising 641 mouse brain sections collected on the Vizgen MERSCOPE platform over two years. This scale ensures that the documented imperfections are not isolated or anecdotal but represent systemic challenges in spatial transcriptomics. The variability observed across this large dataset underscores the importance of using sufficiently large sample sizes when benchmarking different image-based spatial technologies. Smaller datasets risk producing misleading results by over-representing unusually successful or unsuccessful experiments. This comprehensive dataset not only highlights systemic challenges in spatial transcriptomics but also provides a robust foundation for evaluating MerQuaCo's metrics. The study sets a valuable precedent for future quality assessment and benchmarking efforts as the field continues to evolve.
  
  (2) MerQuaCo introduces thoughtful metrics and filters that address a wide range of quality control needs. These include pixel classification, transcript density, and detection efficiency across both x-y axes (periodicity) and z-planes (p6/p0 ratio). The tool also effectively quantifies data loss due to dropped images, providing tangible metrics for researchers to evaluate and standardize their data. Additionally, the authors' decision to include examples of imperfections detectable by visual inspection but not flagged by MerQuaCo reflects a transparent and balanced assessment of the tool's current capabilities.
  
  Comments on revisions:
  
  All previous concerns have been fully addressed. The revised manuscript presents a robust, well-documented, and user-friendly tool for quality control in image-based spatial transcriptomics, a rapidly advancing area where objective assessment tools are urgently needed.
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  MerQuaCo is an open-source computational tool developed for quality control in image-based spatial transcriptomics data, with a primary focus on data generated by the Vizgen MERSCOPE platform. The authors analyzed a substantial dataset of 641 fresh-frozen adult mouse brain sections to identify and quantify common imperfections, aiming to replace manual quality assessment with an automated, objective approach, providing standardized data integrity measures for spatial transcriptomics experiments.
  
  Strengths:
  
  The manuscript's strengths lie in its timely utility, rigorous empirical validation, and practical contributions to methodology and biological discovery in spatial transcriptomics.
  
  Weaknesses:
  
  While MerQuaCo demonstrates utility in large datasets and cross-platform potential, its generalizability and validation are currently limited by the availability of sufficient datasets from non-MERSCOPE platforms and non-brain tissues. The evaluation of data imperfections' impact on downstream analyses beyond cell typing (e.g., differential expression, spatial statistics, and cell-cell interactions) is also constrained by space and scope. However, these represent valuable directions for future work as more datasets become available.
  
  Review 2
4. Public_Reviews 13 Oct 2025
  
  in eLife
  
  Author response:
  
  The following is the authors’ response to the original reviews.
  
  Reviewer #1 (Public review):
  
  The authors present MerQuaCo, a computational tool that fills a critical gap in the field of spatial transcriptomics: the absence of standardized quality control (QC) tools for image-based datasets. Spatial transcriptomics is an emerging field where datasets are often imperfect, and current practices lack systematic methods to quantify and address these imperfections. MerQuaCo offers an objective and reproducible framework to evaluate issues like data loss, transcript detection variability, and efficiency differences across imaging planes.
  
  Strengths:
  
  (1) The study draws on an impressive dataset comprising 641 mouse brain sections collected on the Vizgen MERSCOPE platform over two years. This scale ensures that the documented imperfections are not isolated or anecdotal but represent systemic challenges in spatial transcriptomics. The variability observed across this large dataset underscores the importance of using sufficiently large sample sizes when benchmarking different image-based spatial technologies. Smaller datasets risk producing misleading results by over-representing unusually successful or unsuccessful experiments. This comprehensive dataset not only highlights systemic challenges in spatial transcriptomics but also provides a robust foundation for evaluating MerQuaCo's metrics. The study sets a valuable precedent for future quality assessment and benchmarking efforts as the field continues to evolve.
  
  (2) MerQuaCo introduces thoughtful metrics and filters that address a wide range of quality control needs. These include pixel classification, transcript density, and detection efficiency across both x-y axes (periodicity) and z-planes (p6/p0 ratio). The tool also effectively quantifies data loss due to dropped images, providing tangible metrics for researchers to evaluate and standardize their data. Additionally, the authors' decision to include examples of imperfections detectable by visual inspection but not flagged by MerQuaCo reflects a transparent and balanced assessment of the tool's current capabilities.
  
  Weaknesses:
  
  (1) The study focuses on cell-type label changes as the main downstream impact of imperfections. Broadening the scope to explore expression response changes of downstream analyses would offer a more complete picture of the biological consequences of these imperfections and enhance the utility of the tool.
  
  Here, we focused on the consequences of imperfections on cell-type labels, one common use for spatial transcriptomics datasets. Spatial datasets are used for so many other purposes that there are almost endless ways in which imperfections could impact downstream analyses. It is difficult to see how we might broaden the scope to include more downstream effects, while providing enough analysis to derive meaningful conclusions, all within the scope of a single paper. Existing studies bring some insight into the impact of imperfections and we expect future studies will extend our understanding of consequences in other biological contexts.
  
  (2) While the manuscript identifies and quantifies imperfections effectively, it does not propose post-imaging data processing solutions to correct these issues, aside from the exclusion of problematic sections or transcript species. While this is understandable given the study is aimed at the highest quality atlas effort, many researchers don't need that level of quality to compare groups. It would be important to include discussion points as to how those cut-offs should be decided for a specific study.
  
  Studies differ greatly in their aims and, as a result, the impact of imperfections in the underlying data will differ also, preventing us from offering meaningful guidance on how cut-offs might best be identified. Rather, our aim with MerQuaCo was to provide researchers with tools to generate information on their spatial datasets, to facilitate downstream decisions on data inclusion and cut-offs.
  
  (3) Although the authors demonstrate the applicability of MerQuaCo on a large MERFISH dataset and on a limited number of sections from other platforms, it would be helpful to describe the limits of its generalizability.
  
  In figure 9, we addressed the limitations and generalizability of MerQuaCo as best we could with the available datasets. Gaining deep insight into the limitations and generalizability of MerQuaCo would require application to multiple large datasets and, to the best of our knowledge, these datasets are not available.
  
  Reviewer #2 (Public review):
  
  The authors present MerQuaCo, a computational tool for quality control in image-based spatial transcriptomics, especially MERSCOPE. They assessed MerQuaCo on 641 slides produced at their institute in terms of the ratio of imperfection, transcript density, and variations of quality by different planes (x-axis).
  
  Strengths:
  
  This looks to be a valuable work that can serve as a good guideline for quality control in future spatial transcriptomics. A well-controlled spatial transcriptomics dataset is also important for downstream analysis.
  
  Weaknesses:
  
  The results section needs to be more structured.
  
  We have split the ‘Transcript density’ subsection of the results into 3 new subsections.
  
  Reviewer #3 (Public review):
  
  MerQuaCo is an open-source computational tool developed for quality control in image-based spatial transcriptomics data, with a primary focus on data generated by the Vizgen MERSCOPE platform. The authors analyzed a substantial dataset of 641 fresh-frozen adult mouse brain sections to identify and quantify common imperfections, aiming to replace manual quality assessment with an automated, objective approach, providing standardized data integrity measures for spatial transcriptomics experiments.
  
  Strengths:
  
  The manuscript's strengths lie in its timely utility, rigorous empirical validation, and practical contributions to methodology and biological discovery in spatial transcriptomics.
  
  Weaknesses:
  
  While MerQuaCo demonstrates utility in large datasets and cross-platform potential, its generalizability and validation require expansion, particularly for non-MERSCOPE platforms and real-world biological impact.
  
  We agree that there is value in expanding our analyses to non-MERSCOPE platforms, to tissues other than brain, and to analyses other than cell typing. The limiting factor in all these directions is the availability of large enough datasets to probe the limits of MerQuaCo. We look forward to a future in which more datasets are available and it’s possible to extend our analyses.
  
  Reviewer #1 (Recommendation for the Author):
  
  (1) To better capture the downstream impacts of imperfections, consider extending the analysis to additional metrics, such as specificity variation across cell types, gene coexpression, or spatial gene patterning. This would deepen insights into how these imperfections shape biological interpretations and further demonstrate the versatility of MerQuaCo.
  
  These are compelling ideas, but we are unable to study so many possible downstream impacts in sufficient depth in a single study. Insights into these topics will likely come from future studies.
  
  (2) In the Figure 7 legend, panel label (D) is repeated, so panels E-F are mislabelled.
  
  We have corrected this error.
  
  (3) Ensure that the image quality is high for the figures.
  
  We will upload Illustrator files, ensuring that images are at full resolution.
  
  Reviewer #2 (Recommendation for the Author):
  
  (1) A result subsection "Transcript density" looks too long. Please provide a subsection heading for each figure.
  
  We have split this section into 3 with new subheadings.
  
  (2) The result subsection title "Transcript density" sounds ambiguous. Please provide a detailed title describing what information this subsection contains.
  
  We have renamed this section ‘Differences in transcript density between MERSCOPE experiments’.
  
  Minor:
  
  (1) There is no explanation of the black and grey bars in Figure 2A.
  
  We have added information to the figure legend, identifying the datasets underlying the grey and black bars.
  
  (2) In the abstract, the phrase "High-dimension" should be "High-dimensional".
  
  We have changed ‘high-dimension’ to ‘high-dimensional’.
  
  (3) In the abstract, "Spatial results" is an unclear expression. What does it stand for?
  
  We have replaced the term ‘spatial results’ with ‘the outputs of spatial transcriptomics platforms’.
  
  Reviewer #3 (Recommendation for the Author):
  
  (1) While the tool claims broad applicability, validation is heavily centered on MERSCOPE data, with limited testing on other platforms. The authors should expand validation to include more diverse platforms and add a small analysis of non-brain tissue. If broader validation isn't feasible, modify the title and abstract to reflect the focus on the mouse brain explicitly.
  
  We agree that expansion to other platforms is desirable, but to the best of our knowledge sufficient datasets from other platforms are not available. In the abstract, we state that ‘… we describe imperfections in a dataset of 641 fresh-frozen adult mouse brain sections collected using the Vizgen MERSCOPE.’
  
  (2) The impact of data imperfections on downstream analysis needs a more comprehensive evaluation. The authors should expand beyond cluster label changes to include a) differential expression analysis with simulated imperfections, b) impact on spatial statistics and pattern detection, and c) effects on cell-cell interactions.
  
  Each of these ideas could support a substantial study. We are unable to do them justice in the limited space available as an addition to the current study.
  
  (3) The pixel classification workflow and validation process need more detailed documentation.
  
  The methods and results together describe the workflow and validation in depth. We are unclear what details are missing.
  
  (4) The manuscript lacks comparison to existing QC pipelines such as Squidpy and Giotto. The authors should benchmark MerQuaCo against them and provide integration options with popular spatial analysis tools with clear documentation.
  
  To the best of our knowledge, Squidpy and Giotto lack QC benchmarks, certainly of the parameters characterized by MerQuaCo. Direct comparison isn’t possible.
  
  AuthorResponse

Tags

Summary

Review 1

AuthorResponse

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.12.04.626766v2
arxiv.org

Microbiomes Through The Looking Glass

4
1. Public_Reviews 13 Oct 2025
  
  in eLife (unscoped)
  
  eLife Assessment
  
  This important study shows how the relative importance of inter-species interactions in microbiomes can be inferred from empirical species abundance data. The methods based on statistical physics of disordered systems are compelling and rigorous, and allow for distinguishing healthy and non-healthy human gut microbiomes via differences in their inter-species interaction patterns. This work should be of broad interest to researchers in microbial ecology and theoretical biophysics.
  
  Summary
2. Public_Reviews 13 Oct 2025
  
  in eLife (unscoped)
  
  Reviewer #1 (Public review):
  
  Summary:
  
  In this manuscript, the authors develop a novel method to infer ecologically informative parameters across healthy and diseased states of the gut microbiota, although the method is generalizable to other species abundance datasets. The authors leverage techniques from the theoretical physics of disordered systems to infer different parameters (the mean and standard deviation of the strength of bacterial interspecies interactions, a bacterial immigration rate, and the strength of demographic noise) that describe the statistics of microbiota samples from two groups: one of healthy subjects and another of subjects with chronic inflammation syndromes. To do this, the authors simulate communities with a modified version of the Generalized Lotka-Volterra model and randomly generated interactions, and then use a moment-matching algorithm to find sets of parameters that better reproduce the species abundance data. They find that these parameters differ between the healthy and diseased microbiota groups. The results suggest, for example, that bacterial interaction strengths, relative to noise and immigration, dominate microbiota dynamics more strongly in diseased states than in healthy states.
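  
  To make the simulation step described above concrete, here is a minimal sketch of a generic disordered Lotka-Volterra model with immigration and demographic noise. It is not the authors' modified model: the equation form, the Euler-Maruyama integration, the scaling of the interaction statistics with the number of species, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def simulate_glv(S=50, mu=0.5, sigma=0.3, lam=1e-3, T=1e-3,
                 dt=0.01, steps=5000, seed=0):
    """Euler-Maruyama integration of an assumed disordered gLV model:

        dx_i = [x_i (1 - x_i - sum_j alpha_ij x_j) + lam] dt + sqrt(2 T x_i) dW_i

    alpha_ij are Gaussian with mean mu/S and standard deviation sigma/sqrt(S)
    (a common convention in the disordered-systems literature, assumed here).
    """
    rng = np.random.default_rng(seed)
    alpha = mu / S + sigma / np.sqrt(S) * rng.standard_normal((S, S))
    np.fill_diagonal(alpha, 0.0)           # self-interaction handled by the -x_i term
    x = rng.uniform(0.1, 1.0, size=S)      # arbitrary positive initial abundances
    for _ in range(steps):
        drift = x * (1.0 - x - alpha @ x) + lam
        noise = np.sqrt(2.0 * T * x * dt) * rng.standard_normal(S)
        x = np.maximum(x + drift * dt + noise, 0.0)  # keep abundances non-negative
    return x

abundances = simulate_glv()
relative = abundances / abundances.sum()   # compositional (relative) abundances
moments = (relative.mean(), relative.var())
```

  In a moment-matching scheme, summary statistics such as `moments` would be compared with their empirical counterparts and the parameters adjusted until they agree; that fitting loop is not shown here.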
  
  We think that this manuscript makes an important contribution that will be of interest in the areas of statistical physics, (microbiota) ecology and (biological) data science. The evidence for their results is solid and the work improves the state of the art in terms of methods.
  
  Strengths:
  
  Using a fairly generic ecological model, the method can identify the change in the relative importance of different ecological forces (distribution of interspecies interactions, demographic noise and immigration) in different sample groups. The authors focus on the case of the human gut microbiota, showing that the data is consistent with a higher influence of species interactions (relative to demographic noise and immigration) in a disease microbiota state than in healthy ones.
  
  The method is novel and original, and it improves on the state-of-the-art methodology for the inference of ecologically relevant parameters. The analysis provides solid evidence for the conclusions.
  
  Weaknesses:
  
  As a proof of concept for a new inference method, this text maintains a technical focus, which may require some familiarity with statistical physics. Nevertheless, the authors' clear introduction of key mathematical terms and their interpretations, along with a clear discussion of the ecological implications, make the results accessible and easy to follow.
  
  Review 1
3. Public_Reviews 13 Oct 2025
  
  in eLife (unscoped)
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This valuable work aims to infer, from microbiome data, microbial species interaction patterns associated with healthy and unhealthy human gut microbiomes. Using solid techniques from statistical physics, the authors propose that healthy and unhealthy microbiome interaction patterns substantially differ. Unhealthy microbiomes are closer to instability and single-strain dominance; whereas healthy microbiomes showcase near-neutral dynamics, mostly driven by demographic noise and immigration.
  
  Strengths:
  
  This is a well-written article, relatively easy to follow and transparent despite the high degree of technicality of the underlying theory. The authors provide a powerful inference procedure, which bypasses the issue of having only compositional data. This work shows that embracing the complexity of microbial systems can be used to our advantage, instead of being an insurmountable obstacle. This is a powerful counterpoint to the classic reductionist view that pushes researchers to study much simpler systems, and only hope to one day scale up their findings.
  
  Weaknesses:
  
  As acknowledged by the authors themselves, this is only a proof of concept. Further research is needed to better understand the dynamical nature of gut microbiomes. The authors do, however, point at ways in which species abundance distributions could be better reproduced by dynamical models. They also suggest that their work could explain prior empirical findings invoking the "Anna Karenina principle", where healthy microbiomes resemble one another, but disease states tend to all differ.
  
  Review 2
4. Public_Reviews 13 Oct 2025
  
  in eLife (unscoped)
  
  Reviewer #3 (Public review):
  
  Summary:
  
  I found the manuscript to be well-written. I have a few questions regarding the model, though the bulk of my comments are requests to provide definitions and additional clarity. There are concepts and approaches used in this manuscript that are clear boons for understanding the ecology of microbiomes but are rarely considered by researchers approaching the manuscript from a traditional biology background. The authors have clearly considered this in their writing of S1 and S2, so addressing these comments should be straightforward. The methods section is particularly informative and well-written, with sufficient explanations of each step of the derivation that should be informative to researchers in the microbial life sciences that are not well-versed with physics-inspired approaches to ecology dynamics.
  
  Strengths:
  
  The modeling efforts of this study primarily rely on a disordered form of the generalized Lotka-Volterra (gLV) model. This model can be appropriate for investigating certain systems and the authors are clear about when and how more mechanistic models (i.e., consumer-resource) can lead to gLV. Phenomenological models such as this have been found to be highly useful for investigating the ecology of microbiomes, so this modeling choice seems justified, and the limitations are laid out.
  
  Weaknesses:
  
  The authors use metagenomic data of diseased and healthy patients that was first processed in Pasqualini et al. (2024). The use of metagenomic data leads me into a question regarding the role of sampling effort (i.e., read counts) in shaping model parameters such as $h$. This parameter is equal to the average of 1/# species across samples because the data are compositional in nature. My understanding is that $h$ was calculated using total abundances (i.e., read counts). The number of observed species is strongly influenced by sampling effort and the authors addressed this point in their revised manuscript.
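  
  The sampling-effort concern can be illustrated with a small sketch that computes $h$ as the across-sample mean of 1/(number of observed species) and shows how subsampling reads can only lower the observed richness, and therefore can only raise $h$. The count table, the helper names, and the subsampling depth are hypothetical and are not taken from the manuscript or its data.

```python
import numpy as np

def estimate_h(counts):
    """Mean over samples of 1/S, where S is the number of species with nonzero counts.

    counts: 2-D integer array, rows = samples, columns = species (read counts).
    """
    counts = np.asarray(counts)
    richness = (counts > 0).sum(axis=1)
    if np.any(richness == 0):
        raise ValueError("every sample needs at least one observed species")
    return float(np.mean(1.0 / richness))

def subsample(counts, depth, rng):
    """Draw `depth` reads per sample without replacement (hypothetical rarefaction helper)."""
    counts = np.asarray(counts)
    out = np.zeros_like(counts)
    for i, row in enumerate(counts):
        out[i] = rng.multivariate_hypergeometric(row, depth)
    return out

# Toy count table: 3 samples, 6 species, 1000 reads each, with a few rare species.
toy_counts = np.array([
    [500, 300, 150, 40, 8, 2],
    [900, 50, 30, 15, 4, 1],
    [400, 400, 100, 90, 9, 1],
])
rng = np.random.default_rng(0)
rarefied = subsample(toy_counts, depth=50, rng=rng)
h_full = estimate_h(toy_counts)   # all 6 species observed in every sample
h_rare = estimate_h(rarefied)     # rare species may drop out at low depth
```

  Because rare species are the first to be lost at low read depth, `h_rare` is never smaller than `h_full` in this toy setting.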
  
  However, the role of sampling effort can depend on the type of data and my instinct about the role that sampling effort plays in species detection is primarily based on 16S data. The dependency between these two variables may be less severe for the authors' metagenomic pipeline. This potential discrepancy raises a broader issue regarding the investigation of microbial macroecological patterns and the inference of ecological parameters. Often microbial macroecology researchers rely on 16S rRNA amplicon data because that type of data is abundant and comparatively low-cost. Some in microbiology and bioinformatics are increasingly pushing researchers to choose metagenomics over 16S. Sometimes this choice is valid (discovery of new MAGs, investigation of allele frequency changes within species, etc.), sometimes it is driven by the false equivalence "more data = better". The outcome though is that we have a body of more-or-less established microbial macroecological patterns which rest on 16S data and are now slowly incorporating results from metagenomics. To my knowledge there has not been a systematic evaluation of the macroecological patterns that do and do not vary by one's choice of 16S vs. metagenomics. Several of the authors of this manuscript have previously compared the MAD shape for 16S and metagenomic datasets in Pasqualini et al. (2024), but moving forward a more comprehensive study seems necessary. These points were addressed by the authors in their revised manuscript.
  
  Final review: The authors addressed all comments and I have no additional comments.
  
  References
  
  Pasqualini, Jacopo, et al. "Emergent ecological patterns and modelling of gut microbiomes in health and in disease." PLOS Computational Biology 20.9 (2024): e1012482.
  
  Review 3

Tags

Review 2

Review 1

Review 3

Summary

Annotators

Public_Reviews

URL

arxiv.org/abs/2406.07465v2
www.biorxiv.org

Ribosomal RNA synthesis by RNA polymerase I is subject to premature termination of transcription

5
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This manuscript characterizes a mutated clone of RNA polymerase I in yeast, referred to as SuperPol, to understand the mechanisms of RNA polymerase I elongation and termination. The authors present convincing evidence that demonstrates the existence of premature termination in Pol I transcription. Overall, the characterization of this RNA pol I offers important insights into the regulation of ribosomal RNA transcription and its potential application in cancer pharmacology.
  
  [Editors' note: this paper was reviewed by Review Commons.]
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The study characterises an RNA polymerase (Pol) I mutant (RPA135-F301S) named SuperPol. This mutant was previously shown to increase yeast ribosomal RNA (rRNA) production by Transcription Run-On (TRO). In this work, the authors confirm that this mutation increases rRNA transcription using a slight variation of the TRO method, the Transcriptional Monitoring Assay (TMA), which also allows the analysis of partially degraded RNA molecules. The authors show a reduction of abortive rRNA transcription in cells expressing the SuperPol mutant and a modest occupancy decrease at the 5' region of the rRNA genes compared to WT Pol I. These results suggest that the SuperPol mutant displays a lower frequency of premature termination. Using in vitro assays, the authors found that the mutation induces an enhanced elongation speed and a lower cleavage activity on mismatched nucleotides at the 3' end of the RNA. Finally, the SuperPol mutant was found to be less sensitive to BMH-21, a DNA intercalating agent that blocks Pol I transcription and triggers the degradation of the Pol I subunit, Rpa190. Compared to WT Pol I, short BMH-21 treatment has little effect on SuperPol transcription activity, and consequently, the SuperPol mutation decreases cell sensitivity to BMH-21.
  
  Significance:
  
  The work further characterises a single amino acid mutation of one of the largest yeast Pol I subunits (RPA135-F301S). While this mutation was previously shown to increase rRNA synthesis, the current work expands the SuperPol mutant characterisation, providing details of how RPA135-F301S modifies the enzymatic properties of yeast Pol I. In addition, their findings suggest that yeast Pol I transcription can be subjected to premature termination in vivo. The molecular basis and potential regulatory functions of this phenomenon could be explored in additional studies.
  
  Our understanding of rRNA transcription is limited, and the findings of this work may be interesting to the transcription community. Moreover, targeting Pol I activity is an open strategy for cancer treatment. Thus, the resistance of SuperPol mutant to BMH-21 might also be of interest to a broader community, although these findings are yet to be confirmed in human Pol I and with more specific Pol I inhibitors in future.
  
  Comments on revision:
  
  The authors' response addressed all the points I raised adequately.
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This article presents a study on a mutant form of RNA polymerase I (RNAPI) in yeast, referred to as SuperPol, which demonstrates increased rRNA production compared to the wild-type enzyme. While rRNA production levels are elevated in the mutant, RNAPI occupancy as detected by CRAC is reduced at the 5' end of rDNA transcription units. The authors interpret these findings by proposing that the wild-type RNAPI pauses in the external transcribed spacer (ETS), leading to premature transcription termination (PTT) and degradation of truncated rRNAs by the RNA exosome (Rrp6). They further show that SuperPol's enhanced activity is linked to a lower frequency of PTT events, likely due to altered elongation dynamics and reduced RNA cleavage activity, as supported by both in vivo and in vitro data.
  
  The study also examines the impact of BMH-21, a drug known to inhibit Pol I elongation, and shows that SuperPol is less sensitive to this drug, as demonstrated through genetic, biochemical, and in vivo approaches. The authors show that BMH-21 treatment induces premature termination in wild-type Pol I, but only to a lesser extent in SuperPol. They suggest that BMH-21 promotes termination by targeting paused Pol I complexes and propose that PTT is an important regulatory mechanism for rRNA production in yeast.
  
  The data presented are of high quality and support the notion that 1) premature transcription termination occurs at the 5' end of rDNA transcription units; 2) SuperPol has an increased elongation rate with reduced premature termination; and 3) BMH-21 promotes both pausing and termination. The authors employ several complementary methods, including in vitro transcription assays. These results are significant and of interest for a broad audience.
  
  Adding experiments in different growth conditions to support the claim of regulation by PTT (as the authors propose) will also be an important addition. The revisions further support the claim, in particular the notion that the increased elongation rate of SuperPol occurs at the expense of fidelity.
  
  Significance:
  
  These results are significant and of interest for a basic research audience.
  
  Review 2
4. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  In the manuscript "Ribosomal RNA synthesis by RNA polymerase I is regulated by premature termination of transcription", Azouzi and co-authors investigate the regulatory mechanisms of ribosomal RNA (rRNA) transcription by RNA Polymerase I (RNAPI) in the budding yeast S. cerevisiae. They follow up on exploring the molecular basis of a mutant allele of the second-largest subunit of RNAPI, RPA135-F301S, also dubbed SuperPol, that they had previously reported (Darrière et al, 2019), and which was shown to rescue Rpa49-linked growth defects, possibly by increasing rRNA production.
  
  Through a combination of genomic and in vitro approaches, the authors test the hypothesis that RNAPI activity could be subjected to a premature transcription termination (PTT) mechanism, akin to what is observed for RNA Polymerase II (RNAPII). The authors demonstrate that SuperPol's increased processivity "desensitizes" RNAPI to abortive transcription cycles at the expense of decreased fidelity. In agreement, SuperPol is shown to be resistant to BMH-21, a drug previously shown to impair RNAPI elongation.
  
  Overall, this work expands the mechanistic understanding of the early dynamics of RNAPI transcription. The presented results are of interest for researchers studying transcription regulation, particularly those interested in RNAPI's transcription mechanisms and fidelity.
  
  Strengths:
  
  Overall, the experiments are performed with rigor and include the appropriate controls and statistical analyses. Conclusions are drawn from appropriate experiments. Both the figures and the text present the data clearly. The Materials and Methods section is detailed enough.
  
  Weaknesses:
  
  The biological significance of this phenomenon remains unaddressed and thus unclear. The lack of experiments to test a specific regulatory function (such as a UTP-A loading checkpoint or other mechanisms) limits these termination events to possibly abortive actions of unclear significance.
  
  Comments on revised version:
  
  I appreciated the additional experiments and the other changes made by the authors in the revised version.
  
  Review 3
5. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Author response:
  
  The following is the authors’ response to the original reviews.
  
  General Statements:
  
  In our manuscript, we demonstrate for the first time that RNA Polymerase I (Pol I) can prematurely release nascent transcripts at the 5' end of ribosomal DNA transcription units in vivo. This achievement was made possible by comparing wild-type Pol I with a mutant form of Pol I, hereafter called SuperPol, previously isolated in our lab (Darrière et al., 2019). By combining in vivo analysis of rRNA synthesis (using pulse-labelling of nascent transcript and cross-linking of nascent transcript - CRAC) with in vitro analysis, we could show that SuperPol reduced premature transcript release due to altered elongation dynamics and reduced RNA cleavage activity. Such premature release could reflect regulatory mechanisms controlling rRNA synthesis. Importantly, this increased processivity of SuperPol is correlated with resistance to BMH-21, a novel anticancer drug inhibiting Pol I, showing the relevance of targeting Pol I during transcriptional pauses to kill cancer cells. This work offers critical insights into Pol I dynamics, rRNA transcription regulation, and implications for cancer therapeutics.
  
  We sincerely thank the three reviewers for their insightful comments and recognition of the strengths and weaknesses of our study. Their acknowledgment of our rigorous methodology, the relevance of our findings on rRNA transcription regulation, and the significant enzymatic properties of the SuperPol mutant is highly appreciated. We are particularly grateful for their appreciation of the potential scientific impact of this work. Additionally, we value the reviewers’ suggestion that this article could address a broad scientific community, including transcription biology and cancer therapy research. These encouraging remarks motivate us to refine and expand upon our findings further.
  
  All three reviewers acknowledged the increased processivity of SuperPol compared to its wild-type counterpart. However, two out of three question our claim that premature termination of transcription can regulate ribosomal RNA transcription. This conclusion is based on the SuperPol mutant increasing rRNA production. Proving that modulation of early transcription termination is used to regulate rRNA production under physiological conditions is beyond the scope of this study. Therefore, we propose to change the title of this manuscript to focus on what we have unambiguously demonstrated:
  
  “Ribosomal RNA synthesis by RNA polymerase I is subjected to premature termination of transcription”.
  
  Reviewer 1's main criticism centers on the use of the CRAC technique in our study. While we address this point in detail below, we would like to emphasize that, although we agree with the reviewer’s comments regarding its application to Pol II studies, by limiting contamination with mature rRNA, CRAC remains the only suitable method for studying Pol I elongation over entire transcription units. All other methods are massively contaminated with fragments of mature RNA, which prevents any quantitative analysis of read distribution within rDNA. This perspective is widely accepted within the Pol I research community, as CRAC provides a robust approach to capturing transcriptional dynamics specific to Pol I activity.
  
  We hope that these findings will resonate with the readership of your journal and contribute significantly to advancing discussions in transcription biology and related fields.
  
  Description of the planned revisions:
  
  Despite numerous text modifications (see below), we agree that one major point of discussion is the consequence of increased processivity in the SuperPol mutant on the “quality” of produced rRNA. Reviewer 3 suggested comparisons with other processive alleles, such as the rpb1-E1103G mutant of the RNAPII subunit (Malagon et al., 2006). This comparison has already been addressed by the Schneider lab (Viktorovskaya OV, Cell Rep., 2013 - PMID: 23994471), which explored Pol II (rpb1-E1103G) and Pol I (rpa190-E1224G). The rpa190-E1224G mutant revealed enhanced pausing in vitro, highlighting key differences between Pol I and Pol II catalytic rate-limiting steps (see David Schneider's review on this topic for further details).
  
  Reviewers 2 and 3 suggested that a decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. Pol I mutants with decreased rRNA cleavage have been characterized previously and showed an increased error rate. We have already started to address this point. Preliminary results from in vitro experiments suggest that SuperPol mutants exhibit an elevated error rate during transcription. However, these findings remain preliminary and require further experimental validation to confirm their reproducibility and robustness. We propose to consolidate these data and incorporate them into the manuscript to address this question comprehensively. This could provide valuable insights into the mechanistic differences between SuperPol and the wild-type enzyme. SuperPol is the first Pol I mutant described with increased processivity in vitro and in vivo, and we agree that this might come at the cost of decreased fidelity.
  
  Regulatory aspect of the process:
  
  To address the reviewer’s remarks, we propose to test our model by performing experiments that would evaluate PTT levels in Pol I mutants or under different growth conditions. These experiments would provide crucial data to support our model, which suggests that PTT is a regulatory element of Pol I transcription. By demonstrating how PTT varies with environmental factors, we aim to strengthen the hypothesis that premature termination plays an important role in regulating Pol I activity.
  
  We propose revising the title and conclusions of the manuscript. The updated version will better reflect the study's focus and temper claims regarding the regulatory aspects of termination events, while maintaining the value of our proposed model.
  
  Description of the revisions that have already been incorporated in the transferred manuscript:
  
  Some very important modifications have now been incorporated:
  
  Statistical Analyses and CRAC Replicates:
  
  Unlike reviewers 2 and 3, reviewer 1 suggests that we did not analyze the results statistically. In fact, the CRAC analyses were conducted in biological triplicate, ensuring robustness and reproducibility. The statistical analyses are presented in Figure 2C, which highlights significant findings supporting the conclusion that the WT Pol I and SuperPol distribution profiles are different. The CRAC replicates exhibit a high correlation, and we confirmed a significant effect in each region of interest (5’ETS, 18S.2, 25S.1 and 3’ ETS, Figure 1), confirming consistency across experiments. Finally, we took care not to overinterpret the results, maintaining a rigorous and cautious approach in our analysis to ensure accurate conclusions.
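  As an illustration of what a replicate correlation means in practice, the following minimal sketch computes a Pearson correlation between two binned read-count profiles. It is not the authors' analysis pipeline: the function name `pearson` and the read counts are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 bins)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances (unnormalized; the normalization cancels out)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical read counts per rDNA bin for two replicates (illustration only)
replicate_1 = [120, 80, 40, 30, 25]
replicate_2 = [110, 85, 45, 28, 27]
r = pearson(replicate_1, replicate_2)
```

  A value of `r` close to 1 indicates that the two replicates share the same overall profile; it does not by itself establish that a difference between strains is significant, which requires a separate test.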
  
  CRAC vs. Net-seq:
  
  Reviewer 1 asks us to comment on the differences between CRAC and Net-seq. The two methods complement each other but serve different purposes depending on the biological question and the context of the transcription analysis. Net-seq was originally designed for Pol II analysis. It captures nascent RNAs but does not eliminate mature ribosomal RNAs (rRNAs), leading to high levels of contamination. While this is manageable for Pol II analysis (in silico elimination of reads corresponding to rRNAs), it poses a significant problem for Pol I due to the dominance of rRNAs (60% of total RNAs in yeast), which share sequences with nascent Pol I transcripts. As a result, large Net-seq peaks are observed at mature rRNA extremities (Clarke 2018, Jacobs 2022). This limits the interpretation of the results to the short-lived pre-rRNA species. In contrast, CRAC has been specifically adapted by the laboratory of David Tollervey to map Pol I distribution while minimizing contamination from mature rRNAs (the CRAC protocol used exclusively recovers RNAs with 3′ hydroxyl groups that represent endogenous 3′ ends of nascent transcripts, thus removing RNAs with a 3′ phosphate, which is found in mature rRNAs). This makes CRAC more suitable for studying Pol I transcription, including polymerase pausing and distribution along the rDNA, and provides a quantitative dataset for the entire rDNA gene.
  
  CRAC vs. Other Methods:
  
  Reviewer 1 suggests using GRO-seq or TT-seq, but the experiments in Figure 2 aim to assess the distribution profile of Pol I along the rDNA, which requires a method optimized for this specific purpose. While GRO-seq and TT-seq are excellent for measuring RNA synthesis and co-transcriptional processing, they rely on Sarkosyl treatment to permeabilize cellular and nuclear membranes. Sarkosyl is known to artificially induce polymerase pausing and to inhibit RNase activities that are involved in the process. CRAC avoids these artifacts because it is a direct and fully in vivo approach. In a CRAC experiment, cells are grown exponentially in rich media and arrested via rapid cross-linking, providing precise and artifact-free data on Pol I activity and pausing.
  
  Pol I ChIP Signal Comparison:
  
  The ChIP experiments previously published in Darrière et al. lack the statistical depth and resolution offered by our CRAC analyses. The detailed results obtained through CRAC would have been impossible to detect using classical ChIP. The current study provides a more refined and precise understanding of Pol I distribution and dynamics, highlighting the advantages of CRAC over traditional methods in addressing these complex transcriptional processes.
  
  BMH-21 Effects:
  
  As highlighted by Reviewer 1, the effects of BMH-21 observed in our study differ slightly from those reported in earlier work (Ref Schneider 2022), likely due to variations in experimental conditions, such as methodologies (CRAC vs. Net-seq), as discussed earlier. We also identified variations in the response to BMH-21 treatment associated with differences in cell growth phases and/or cell density. These factors likely contribute to the observed discrepancies, offering a potential explanation for the variations between our findings and those reported in previous studies. In our approach, we prioritized reproducibility by carefully controlling BMH-21 experimental conditions to mitigate these factors. These variables can significantly influence results, potentially leading to subtle discrepancies. Nevertheless, the overall conclusions regarding BMH-21's effects on WT Pol I are largely consistent across studies, with differences primarily observed at the nucleotide resolution. This is a strength of our CRAC-based analysis, which provides precise insights into Pol I activity.
  
  We will address these nuances in the revised manuscript to clarify how such differences may impact results and provide context for interpreting our findings in light of previous studies.
  
  Minor points:
  
  Reviewer #1:
  
  In general, the writing style is not clear, and there are some word mistakes or poor descriptions of the results, for example:
  
  On page 14: "SuperPol accumulation is decreased (compared to Pol I)".
  
  On page 16: "Compared to WT Pol I, the cumulative distribution of SuperPol is indeed shifted on the right of the graph."
  
  We clarified and improved the overall writing style in response to the reviewer's comment.
  
  There are also issues with the literature, for example: Turowski et al, 2020a and Turowski et al, 2020b are the same article (preprint and peer-reviewed). Is there any reason to include both references? Please, double-check the references.
  
  This was corrected in this version of the manuscript.
  
  In the manuscript, 5S rRNA is mentioned as an internal control for TMA normalisation. Why are Figure 1C data normalised to 18S rRNA instead of 5S rRNA?
  
  Data are effectively normalized relative to the 5S rRNA, but the value for the 18S rRNA is arbitrarily set to 100%.
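  The two-step normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual quantification script: the function name `normalize_to_reference`, the probe names and all signal values are hypothetical.

```python
def normalize_to_reference(signals, control="5S", reference="18S"):
    """Normalize each signal to the internal control, then rescale so that
    the reference probe equals 100 (percent)."""
    # Step 1: divide every probe signal by the internal control (5S rRNA)
    ratios = {
        name: value / signals[control]
        for name, value in signals.items()
        if name != control
    }
    # Step 2: express all ratios relative to the reference probe, set to 100
    scale = 100.0 / ratios[reference]
    return {name: ratio * scale for name, ratio in ratios.items()}


# Hypothetical raw signals in arbitrary units (illustration only)
raw = {"5S": 50.0, "18S": 200.0, "25S": 150.0, "5'ETS": 20.0}
normalized = normalize_to_reference(raw)
```

  With this scheme the 5S control still corrects for differences between samples, while the choice of 18S as the 100% reference only fixes the units in which the other probes are reported.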
  
  Figure 4 should be a supplementary figure, and Figure 7D doesn't have a y-axis labelling.
  
  The presence of all Pol I specific subunits (Rpa12, Rpa34 and Rpa49) is crucial for the enzymatic activity we performed. In the absence of these subunits (which can vary depending on the purification batch), Pol I pausing, cleavage and elongation are known to be affected. To strengthen our conclusion, we really wanted to show the subunit composition of the purified enzyme. This important control should be shown, but can indeed be shown in a supplementary figure if desired.
  
  The y-axis in Figure 7D is now correctly labelled.
  
  In Figure 7C, BMH-21 treatment causes the accumulation of ~140bp rRNA transcripts only in SuperPol-expressing cells that are Rrp6-sensitive (line 6 vs line 8), suggesting that BHM-21 treatment does affect SuperPol. Could the author comment on the interpretation of this result?
  
  The 140 nt product is a degradation fragment resulting from trimming, which explains its lower accumulation in the absence of Rrp6. BMH-21 significantly affects WT Pol I transcription but also has a mild effect on SuperPol transcription. As a result, the 140 nt product accumulates under these conditions.
  
  Reviewer #2:
  
  pp. 14-15: The authors note local differences in peak detection in the 5'-ETS among replicates, preventing a nucleotide-resolution analysis of pausing sites. Still, they report consistent global differences between wild-type and SuperPol CRAC signals in the 5'ETS (and other regions of the rDNA). These global differences are clear in the quantification shown in Figures 2B-C. A simpler statement might be less confusing, avoiding references to a "first and second set of replicates"
  
  As suggested by the reviewer, the statement has been simplified in this version of the manuscript.
  
  Figures 2A and 2C: Based on these data and quantification, it appears that SuperPol signals in the body and 3' end of the rDNA unit are higher than those in the wild type. This finding supports the conclusion that reduced pausing (and termination) in the 5'ETS leads to an increased Pol I signal downstream. Since the average increase in the SuperPol signal is distributed over a larger region, this might also explain why even a relatively modest decrease in 5'ETS pausing results in higher rRNA production. This point merits discussion by the authors.
  
  We agree that this is a very important point for the discussion of our results. Transcription is a very dynamic process in which paused polymerases are easily detected using the CRAC assay. Elongating polymerases are distributed over a much larger gene body, and even a small amount of polymerase detected in the gene body can represent a very large amount of rRNA synthesis. This point is of paramount importance and, as suggested by the reviewer, is now discussed in detail.
  
  A decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. Have the authors observed any evidence supporting this possibility?
  
  The reviewer suggested that a decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. We thank Reviewer #2 for raising this issue; in our opinion, it is an important point that should be addressed in the manuscript. We have now included new data (panels 5G, 5H and 5I) in the manuscript showing that SuperPol exhibits an increased error rate in vitro compared to the WT enzyme. From these in vitro results, we concluded that SuperPol shows reduced nascent transcript cleavage, associated with more efficient transcript elongation, but to the detriment of transcriptional fidelity.
  
  pp. 15 and 22: Premature transcription termination as a regulator of gene expression is well-documented in yeast, with significant contributions from the Corden, Brow, Libri, and Tollervey labs. These studies should be referenced along with relevant bacterial and mammalian research.
  
  Following the reviewer's suggestion, we now reference these studies.
  
  p. 23: "SuperPol and Rpa190-KR have a synergistic effect on BMH-21 resistance." A citation should be added for this statement.
  
  This statement is based on unpublished data from our lab. KR and SuperPol are the only two known mutants resistant to BMH-21. We observed that resistance between both alleles is synergistic, with a much higher resistance to BMH-21 in the double mutant than in either single mutant (data not shown). Comparing their resistance mechanisms is a very important point, and we could provide these data upon request. This was added to the statement.
  
  p. 23: "The released of the premature transcript" - this phrase contains a typo
  
  This is now corrected.
  
  Reviewer #3:
  
  Figure 1B: it would be opportune to separate the technique's schematic representation from the actual data. Concerning the data, would the authors consider adding an experiment with rrp6D cells? Some RNAs could be degraded even in such short period of time, as even stated by the authors, so maybe an exosome depleted background could provide a more complete picture. Could also the authors explain why the increase is only observed at the level of 18S and 25S? To further prove the robustness of the Pol I TMA method could be good to add already characterized mutations or other drugs to show that the technique can readily detect also well-known and expected changes.
  
  The precise objective of this experiment is to avoid the use of the Rrp6 mutant. Under these conditions, we prevent the accumulation of transcripts that would result from a maturation defect. While it is possible to conduct the experiment with the Rrp6 mutant, it would be impossible to draw reliable conclusions due to this artificial accumulation of transcripts.
  
  Figure 1C: the NTS1 probe signal is missing (it is referenced in Figure 1A but not listed in the Methods section or the oligo table). If this probe was unused, please correct Figure 1A accordingly.
  
  We corrected Figure 1A.
  
  Figure 2A: the RNAPI occupancy map by CRAC is hard to interpret. The red color (SuperPol) is stacked on top of the blue line, and we are not able to observe the signal of the WT for most of the position along the rDNA unit. It would be preferable to use some kind of opacity that allows to visualize both curves. Moreover, the analysis of the behavior of the polymerase is always restricted to the 5'ETS region in the rest of the manuscript. We are thus not able to observe whether termination events also occur in other regions of the rDNA unit. A Northern blot analysis displaying higher sizes would provide a more complete picture.
  
  We addressed this point to make the figure more visually informative. In Northern Blot analysis, we use a TSS (Transcription Start Site) probe, which detects only transcripts containing the 5' extremity. Due to co-transcriptional processing, most of the rRNA undergoing transcription lacks its 5' extremity and is not detectable using this technique. We have the data, but it does not show any difference between Pol I and SuperPol. This information could be included in the supplementary data if asked.
  
  "Importantly, despite some local variations, we could reproducibly observe an increased occupancy of WT Pol I in 5'-ETS compared to SuperPol (Figure 1C)." should be Figure 2C.
  
  Thanks for pointing out this mistake. It has been corrected.
  
  Figure 3D: most of the difference in the cumulative proportion of CRAC reads is observed in the region ~750 to 3000. In line with my previous point, I think it would be worth exploring also termination events beyond the 5'-ETS region.
  
  We agree that such an analysis would have been interesting. However, with the exception of the pre-rRNA starting at the transcription start site (TSS) studied here, any cleaved rRNA at its 5' end could result from premature termination and/or abnormal processing events. Exploring the production of other abnormal rRNAs produced by premature termination is a project in itself, beyond this initial work aimed at demonstrating the existence of premature termination events in ribosomal RNA production.
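  For readers unfamiliar with the cumulative representation discussed in this exchange, the sketch below shows how a cumulative proportion of reads along a transcription unit can be computed, and why a profile with fewer reads near the 5' end yields a curve shifted to the right. It is a minimal illustration only: the function name `cumulative_proportion` and the read counts are hypothetical and do not come from the study.

```python
from itertools import accumulate


def cumulative_proportion(reads):
    """Running fraction of total reads, from the 5' end to the 3' end."""
    total = sum(reads)
    if total == 0:
        raise ValueError("profile contains no reads")
    return [running / total for running in accumulate(reads)]


# Hypothetical read counts in four consecutive bins (5' to 3'), illustration only
five_prime_enriched = [40, 30, 20, 10]   # many reads near the 5' end
three_prime_enriched = [10, 20, 30, 40]  # reads spread toward the 3' end

curve_a = cumulative_proportion(five_prime_enriched)
curve_b = cumulative_proportion(three_prime_enriched)
```

  Here `curve_b` stays at or below `curve_a` in every bin, which on a plot appears as a curve displaced toward the right: a given fraction of the reads is reached further downstream.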
  
  Figure 4: should probably be provided as supplementary material.
  
  As mentioned earlier (see comments above), the presence of all Pol I specific subunits (Rpa12, Rpa34 and Rpa49) is crucial for the enzymatic activity we performed. This important control should be shown, but can indeed be shown in a supplementary figure if desired.
  
  "While the growth of cells expressing SuperPol appeared unaffected, the fitness of WT cells was severely reduced under the same conditions." I think the growth of cells expressing SuperPol is slightly affected.
  
  We agree with this comment and we modified the text accordingly.
  
  Figure 7D: the legend of the y-axis is missing as well as the title of the plot.
  
  Legend of the y-axis and title of the plot are now present.
  
  The statements concerning BMH-21, SuperPol and Rpa190-KR in the Discussion section should be removed, or data should be provided.
  
  This was discussed previously. See comment above.
  
  Some references are missing from the Bibliography, for example Merkl et al., 2020; Pilsl et al., 2016a, 2016b.
  
  Bibliography is now fixed
  
  Description of analyses that authors prefer not to carry out:
  
  Does the SuperPol mutant produce more functional rRNAs?
  
  As Reviewer 1 requested, we agree that this point requires clarification. In cells expressing SuperPol, a higher steady-state level of (pre)-rRNAs is only observed in the absence of the degradation machinery, suggesting that overproduced rRNAs are rapidly eliminated. We know that (pre)-rRNAs are unable to accumulate in the absence of ribosomal proteins and/or Assembly Factors (AF). Consequently, overproducing rRNAs would not be sufficient to increase ribosome content. This specific point is being further addressed in our lab but is beyond the scope of this article.
  
  Is premature termination coupled with rRNA processing?
  
  We appreciate the reviewer’s insightful comments. The suggested experiments regarding the UTP-A complex's regulatory potential are valuable and ongoing in our lab, but they extend beyond the scope of this study and are not suitable for inclusion in the current manuscript.
  
  AuthorResponse

URL

biorxiv.org/content/10.1101/2023.11.27.568781v3
arxiv.org

Theory of active self-organization of dense nematic structures in the actin cytoskeleton

5
1. Public_Reviews 10 Oct 2025
 
 in eLife (unscoped)
 
 eLife Assessment
 
 In this study, the authors offer a theoretical explanation for the emergence of nematic bundles in the actin cortex, carrying implications for the assembly of actomyosin stress fibers. As such, the study is a valuable contribution to the field of actomyosin organisation in the actin cortex. The theoretical work is solid and provides a rigorous theoretical framework to study active self-organisation in actomyosin systems, including qualitative comparison with experimental observations.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #1 (Public review):
 
 Summary:
 
 In this article, Mirza et al developed a continuum active gel model of the actomyosin cytoskeleton that accounts for nematic order and density variations in actomyosin. Using this model, they identify the requirements for the formation of dense nematic structures. In particular, they show that self-organization into nematic bundles requires both flow-induced alignment and active tension anisotropy in the system. By varying model parameters that control active tension and nematic alignment, the authors show that their model reproduces a rich variety of actomyosin structures, including tactoids, fibres, asters as well as crystalline networks. Additionally, discrete simulations are employed to calculate the activity parameters in the continuum model, providing a microscopic perspective on the conditions driving the formation of fibrillar patterns.
 
 Strengths:
 
 The strength of the work lies in its delineation of the parameter ranges that generate distinct types of nematic organization within actomyosin networks. The authors pinpoint the physical mechanisms behind the formation of fibrillar patterns, which may offer valuable insights into stress fiber assembly. Another strength of the work is connecting activity parameters in the continuum theory with microscopic simulations.
 
 Weaknesses:
 
 This paper is a very difficult read for nonspecialists, especially if you are not well-versed in continuum hydrodynamic theories. Efforts should be made to connect various elements of theory with biological mechanisms, which is mostly lacking in this paper. The comparison with experiments is predominantly qualitative. It is unclear if the theory is suited for in vitro or in vivo actomyosin systems. The justification for various model assumptions, especially concerning their applicability to actomyosin networks, requires a more thorough examination. The classification of different structures demands further justification. For example, the rationale behind categorizing structures as sarcomeric remains unclear when nematic order is perpendicular to the axis of the bands. Sarcomeres traditionally exhibit a specific ordering of actin filaments with alternating polarity patterns. Similarly, the criteria for distinguishing between contractile and extensile structures need clarification, as one would expect extensile structures to be under tension contrary to the authors' claim. Additionally, it's unclear if the model's predictions for fiber dynamics align with observations in cells, as stress fibers exhibit a high degree of dynamism and tend to coalesce with neighboring fibers during their assembly phase. Finally, it seems that the microscopic model is unable to recapitulate the density patterns predicted by the continuum theory, raising questions about the suitability of the simulation model.
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The article by Waleed et al discusses the self-organization of actin cytoskeleton using the theory of active nematics. Linear stability analysis of the governing equations and computer simulations show that the system is unstable to density fluctuations and self-organized structures can emerge.
 
 Strengths:
 
 (i) Analytical calculations complemented with simulations (ii) Theory for cytoskeletal network
 
 Weaknesses:
 
 Not placed in the context of the literature on active nematics.
 
 Comments on revised version:
 
 The authors have satisfactorily responded to the comments
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #3 (Public review):
 
 The manuscript "Theory of active self-organization of dense nematic structures in the actin cytoskeleton" analyzes self-organized pattern formation within a two-dimensional nematic liquid crystal theory and uses microscopic simulations to test the plausibility of some of the conclusions drawn from that analysis. After performing an analytic linear stability analysis that indicates the possibility of patterning instabilities, the authors perform fully non-linear numerical simulations and identify the emergence of stripe-like patterning when anisotropic active stresses are present. Following a range of qualitative numerical observations on how parameter changes affect these patterns, the authors identify, besides isotropic and nematic stress, also active self-alignment as an important ingredient to form the observed patterns. Finally, microscopic simulations are used to test the plausibility of some of the most crucial assumptions underlying continuum simulations.
 
 The paper is well written, figures are mostly clear, and the theoretical analysis presented in both, main text and supplement, is rigorous. Mechano-chemical coupling has emerged in recent years as a crucial element of cell cortex and tissue organization and it is plausible to think that both, isotropic and anisotropic active stresses, are present within such effectively compressible structures. Even though not explicitly stated this way by the authors, I would argue that combining these two is one of the key ingredients that distinguishes this theoretical paper from similar ones.
 
 The diversity of patterning processes experimentally observed and theoretically described is nicely elaborated on in the introduction of the paper. The theory development and discussion of the continuum model itself is also well-embedded in a review of the relevant broad literature on active liquid crystals and active nematics, which includes plenty of previous results by the authors themselves. Interestingly, several of the patterns identified in the present work, such as 2D hexagonal and pulsatory patterns (Kumar et al, PRL, 2014), as well as contractile patches (Mietke et al, PRL 2019) have been observed previously in different, but related, active isotropic fluid models. In light of this crowded literature, the authors do a good job in delineating key results obtained in the present manuscript from existing work.
 
 The results of numerical simulations are well-presented. The discussion of numerical observations is comprehensive, but also at many times qualitative. Some of the observations resonate with recent discussions in the field, for example the observation of effectively extensile dynamics in a contractile system, which is interesting and reminiscent of ambiguities about extensile/contractile properties discussed in recent preprints (Nejad et al, Nat Comm 2024). It is convincingly concluded that, besides nematic stress on top of isotropic one, active self-alignment is a key ingredient to produce the observed patterns.
 
 The authors must be complimented for trying to gain further mechanistic insights into their conclusions using microscopic filament simulations that were diligently performed. It is rightfully stated that these simulations only provide plausibility tests about key assumptions underlying the hydrodynamic theory. Within this scope, I would say the authors are successful. At the same time, it leaves open questions that could have been discussed more carefully. For example, I wonder what can be said about the regime \kappa>0 microscopically, in which the continuum theory does also predict the formation of stripe patterns? How does the spatially inhomogeneous organization the continuum theory predicts fit in the presented, microscopic picture and vice versa? The authors clearly explain the scope and limitations of the microscopic model, which suggests that questions like these will be interesting directions of future investigations.
 
 Overall, the paper represents a valuable contribution to the field of active matter that should provide a fruitful basis to develop new hypotheses about the dynamic self-organisation and mechanics of dense filamentous bundles in biological systems.
 
 Review 3
5. Public_Reviews 09 Oct 2025
 
 in eLife (unscoped)
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 eLife assessment
 
 In this study, the authors offer a theoretical explanation for the emergence of nematic bundles in the actin cortex, carrying implications for the assembly of actomyosin stress fibers. As such, the study is a valuable contribution to the field of actomyosin organization in the actin cortex. While the theoretical work is solid, experimental evidence in support of the model assumptions remains incomplete. The presentation could be improved to enhance accessibility for readers without a strong background in hydrodynamic and nematic theories.
 
 To address the weaknesses identified in this assessment, we have expanded the motivation and description of the theoretical model, specifically emphasizing the experimental evidence supporting its rationale and assumptions. These changes in the revised manuscript are implemented in the first two paragraphs of Section “Theoretical model” and in a more detailed description and justification of the different mathematical terms that appear in that section. We have made an effort to map in our narrative different terms to mechanistic processes in the actomyosin network. Even if the nature of the manuscript is inevitably theoretical, we think that the revised manuscript will be more accessible to a broader spectrum of readers.
 
 Public Reviews:
 
 Reviewer #1 (Public Review):
 
 Summary:
 
 In this article, Mirza et al developed a continuum active gel model of the actomyosin cytoskeleton that accounts for nematic order and density variations in actomyosin. Using this model, they identify the requirements for the formation of dense nematic structures. In particular, they show that self-organization into nematic bundles requires both flow-induced alignment and active tension anisotropy in the system. By varying model parameters that control active tension and nematic alignment, the authors show that their model reproduces a rich variety of actomyosin structures, including tactoids, fibres, asters as well as crystalline networks. Additionally, discrete simulations are employed to calculate the activity parameters in the continuum model, providing a microscopic perspective on the conditions driving the formation of fibrillar patterns.
 
 Strengths:
 
 The strength of the work lies in its delineation of the parameter ranges that generate distinct types of nematic organization within actomyosin networks. The authors pinpoint the physical mechanisms behind the formation of fibrillar patterns, which may offer valuable insights into stress fiber assembly. Another strength of the work is connecting activity parameters in the continuum theory with microscopic simulations.
 
 We thank the referee for these comments.
 
 Weaknesses:
 
 (A) This paper is a very difficult read for nonspecialists, especially if you are not well-versed in continuum hydrodynamic theories. Efforts should be made to connect various elements of theory with biological mechanisms, which is mostly lacking in this paper. The comparison with experiments is predominantly qualitative.
 
 We understand the point of the referee. While it is unavoidable to present the continuum hydrodynamic theory behind our results, we have made an effort in the revised manuscript to (1) motivate the essential features required from a theoretical model of the actomyosin cytoskeleton capable of describing its nematic self-organization (first two paragraphs of Section “Theoretical model”), and to (2) explicitly explain the physical meaning of each of the mathematical terms in the theory, and when appropriate, relate them to molecular mechanisms in the cytoskeleton. We hope that the revised manuscript addresses the concern of the referee.
 
 Regarding the comparison with experiments, they are indeed qualitative because the main point of the paper is to establish a physical basis for the self-organization of dense nematic structures in actomyosin gels. Somewhat surprisingly, we argue that a compelling mechanism explaining the tendency of actomyosin gels to form patterns of dense nematic bundles has been lacking. As we review in the introduction, these patterns are qualitatively diverse across cell types and organisms in terms of geometry and dynamics, and for this reason, our goal is to show that the same material in different parameter regimes can exhibit such qualitative diversity. A quantitative comparison is difficult for several reasons. First, many of the parameters in our theory have not been measured and are expected to vary wildly between cell types. In fact, estimates in the literature often rely on comparison with hydrodynamic models such as ours. For this reason, we chose to delineate regimes leading to qualitatively different emerging architectures and dynamics. Second, the patterns of nematic bundles found across cell types depend on the interaction between (1) the intrinsic tendency of actomyosin gels to form such structures studied here and (2) other elements of the cellular context. For instance, polymerization and retrograde flow from the lamellipodium, the physical barrier of the nucleus, and the interaction with the focal adhesion machinery are essential to understand the emergence of stress fibers in adherent cells. Cell shape and curvature anisotropy control the orientation of actin bundles in parallel patterns in the wings and trachea of insects. Nuclear positions guide the actin bundles organizing the cellularization of Sphaeroforma arctica [11]. Here, we focus on establishing that actomyosin gels have an intrinsic ability to self-organize into dense nematic bundles, and leave how this property enables the morphogenesis of specific structures for future work. We have emphasized this point in the revised Conclusions section.
 
 (B) It is unclear if the theory is suited for in vitro or in vivo actomyosin systems. The justification for various model assumptions, especially concerning their applicability to actomyosin networks, requires a more thorough examination.
 
 We thank the referee for this comment. Our theory is applicable to actomyosin gels originating from living cells. To our knowledge, the ability of reconstituted actomyosin gels from purified proteins to sustain the kind of contractile dynamical steady-states observed in living cells is very limited. In the revised manuscript, we cite a very recent preprint presenting very exciting but partial results in this direction [49]. Instead, reconstituted in vitro systems encapsulating actomyosin cell extracts robustly recapitulate contractile steady-states. This point has been clarified in the first paragraph of Section “Theoretical model”.
 
 (C) The classification of different structures demands further justification. For example, the rationale behind categorizing structures as sarcomeric remains unclear when nematic order is perpendicular to the axis of the bands. Sarcomeres traditionally exhibit a specific ordering of actin filaments with alternating polarity patterns.
 
 We agree with the referee and in the revised manuscript we have avoided the term “sarcomeric” because it refers to very specific organizations in cells. What we previously called “sarcomeric patterns”, where bands of high density exhibit nematic order perpendicular to the axis of the bands, is not a structure observed to our knowledge in cells. It is introduced to delimit the relevant region in parameter space. In the revised manuscript, we refer to this pattern as “banded pattern with perpendicular nematic organization” or “banded pattern” in short.
 
 (D) Similarly, the criteria for distinguishing between contractile and extensile structures need clarification, as one would expect extensile structures to be under tension contrary to the authors' claim.
 
 We thank the referee for raising this point, which was not sufficiently clarified in the original manuscript. We first note that in incompressible active nematic models, active tension is deviatoric (traceless and anisotropic) because an isotropic component would simply get absorbed by the pressure field enforcing incompressibility. Being compressible, our model admits an active tension tensor with deviatoric and isotropic components. We always consider a contractile (positive) isotropic component of active tension, but the deviatoric component can be either contractile (𝜅 > 0) or extensile (𝜅 < 0), where we follow the common terminology according to which in contractile/extensile active nematics the active stress is proportional to q with a positive/negative proportionality constant [see e.g. https://doi.org/10.1038/s41467-018-05666-8]. Furthermore, as clarified in the revised manuscript, total active stresses accounting for the deviatoric and isotropic components are always contractile (positive) in all directions, as enforced by the condition |𝜅| < 1.
 
 For fibrillar patterns, we need 𝜅 < 0, and therefore active stresses are larger perpendicular to the nematic direction. This means that the anisotropic component of the active tension is extensile, although, accounting for the isotropic component, total active tension is contractile (see Fig. 1c). This is now clarified in the text following Eq. 7 and in Fig. 1.
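 The sign convention above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, a 2D active tension of the form ρλ(I + 𝜅q) with q = S(2n⊗n − I), so that q has eigenvalues ±S with 0 ≤ S ≤ 1; the exact normalization of Eq. 7 in the manuscript may differ. The function names and parameter values are hypothetical.

```python
import numpy as np

def active_tension(rho, lam, kappa, S, theta):
    """Illustrative 2D active tension rho*lam*(I + kappa*q).

    Assumed form (not taken from the manuscript): q = S*(2 n n^T - I),
    where n is the unit director at angle theta and S is the order parameter.
    """
    n = np.array([np.cos(theta), np.sin(theta)])
    q = S * (2.0 * np.outer(n, n) - np.eye(2))
    return rho * lam * (np.eye(2) + kappa * q)

def parallel_perpendicular(sigma, theta):
    """Return the tension components along and across the director."""
    n = np.array([np.cos(theta), np.sin(theta)])
    m = np.array([-np.sin(theta), np.cos(theta)])
    return n @ sigma @ n, m @ sigma @ m

# Extensile deviatoric part (kappa < 0) with |kappa| < 1.
sigma = active_tension(rho=1.0, lam=1.0, kappa=-0.5, S=0.8, theta=0.3)
s_par, s_perp = parallel_perpendicular(sigma, theta=0.3)
print(s_par, s_perp)  # both positive, with s_perp > s_par
```

 Under this assumed form the two principal tensions are ρλ(1 ± 𝜅S), so |𝜅| < 1 keeps both positive for any S ≤ 1, while 𝜅 < 0 makes the component perpendicular to the director the larger one.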
 
 However, following fibrillar pattern formation and as a result of the interplay between active and viscous stresses, the total stress can be larger along the emergent dense nematic structures (“contractile structures”) or perpendicular to them (“extensile structures”). To clarify this point, in the revised Fig. 4 and the text referring to it, we have expanded our explanation and plotted the difference between the total stress component parallel to the nematic direction (𝜎∥) and the component perpendicular to the nematic direction (𝜎⊥), with contractile structures satisfying 𝜎∥ − 𝜎⊥ > 0 and extensile structures satisfying 𝜎∥ − 𝜎⊥ < 0. See lines 280 to 303. This is consistent with the common notion of contractile/extensile systems in incompressible nematic systems [see e.g. https://doi.org/10.1038/s41467-018-05666-8].
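 The classification criterion stated above can be sketched in a few lines. This is only an illustration of the sign test on 𝜎∥ − 𝜎⊥ for a given 2D total stress tensor and director; the function names, the tolerance, and the sample stress values are hypothetical and are not taken from the authors' finite element code.

```python
import numpy as np

def stress_anisotropy(sigma, director):
    """Return sigma_parallel - sigma_perp for a 2D stress tensor."""
    n = np.asarray(director, dtype=float)
    n = n / np.linalg.norm(n)          # unit vector along the nematic direction
    m = np.array([-n[1], n[0]])        # unit vector perpendicular to it
    return float(n @ sigma @ n - m @ sigma @ m)

def classify_structure(sigma, director, tol=1e-12):
    """Label a structure by the sign of sigma_parallel - sigma_perp."""
    diff = stress_anisotropy(sigma, director)
    if diff > tol:
        return "contractile"
    if diff < -tol:
        return "extensile"
    return "isotropic"

# Hypothetical total stress, larger along x than along y.
sigma_total = np.array([[2.0, 0.0],
                        [0.0, 1.0]])
print(classify_structure(sigma_total, [1.0, 0.0]))  # director along x
print(classify_structure(sigma_total, [0.0, 1.0]))  # director along y
```

 The same stress tensor is labeled differently depending on the local director, which is why the classification refers to the emergent bundle rather than to the sign of 𝜅 alone.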
 
 (E) Additionally, it is unclear if the model's predictions for fiber dynamics align with observations in cells, as stress fibers exhibit a high degree of dynamism and tend to coalesce with neighboring fibers during their assembly phase.
 
 In the present work, we focus on the self-organization of a periodic patch of actomyosin gel. However, in adherent cells boundary conditions play an essential role, as discussed in our response to comment (A) by this referee. In ongoing work, we are using the present model to study the dynamics of assembly and reconfiguration of dense nematic structures in domains with boundary conditions mimicking those in adherent cells, possibly interacting with the adhesion machinery, and we find dynamical interactions such as those suggested by the referee. As an example, we show a video of a simulation where, at the edge of the circular domain, there is an actin influx modeling the lamellipodium, and in four small regions friction is higher, simulating focal adhesions. Under these boundary conditions, the model presented in the paper exhibits the kind of dynamical reorganizations alluded to by the referee.
 
 Author response video 1.
 
 We would like to note, however, that the prominent stress fibers in cells adhered to stiff substrates, so abundantly reported in the literature, are not the only instance of dense nematic actin bundles. In the present manuscript, we emphasize the relation of the predicted organizations with those found in different in vivo contexts not related to stress fibers, such as the aligned patterns of bundles in insects (trachea, scales in butterfly wings), in hydra, or in reproductive organs of C. elegans; the highly dynamical network of bundles observed in C. elegans early embryos; or the labyrinth patterns of micro-ridges in the apical surface of epidermal cells in fish.
 
 (F) Finally, it seems that the microscopic model is unable to recapitulate the density patterns predicted by the continuum theory, raising questions about the suitability of the simulation model.
 
 We thank the referee for raising this question, which needs further clarification. The goal of the microscopic model is not to reproduce the self-organized patterns predicted by the active gel theory. The microscopic model lacks essential ingredients, notably a realistic description of hydrodynamics and turnover. Our goal with the agent-based simulations is to extract the relation between nematic order and active stresses for a small homogeneous sample of the network. This small domain is meant to represent the homogeneous active gel prior to pattern formation, and it allows us to substantiate key assumptions of the continuum model leading to pattern formation, notably the dependence of isotropic and deviatoric components of the active stress on density and nematic order (Eq. 7) and the active generalized stress promoting ordering.
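 As an illustration of the kind of measurement described above, the following minimal sketch computes a 2D nematic order parameter and director from a set of filament orientations. It uses the standard definition based on the averages of cos 2θ and sin 2θ; it is not the authors' Cytosim customization, and the function name and sample angles are hypothetical.

```python
import numpy as np

def nematic_order_2d(angles):
    """Return (S, director_angle) for filament orientation angles in radians.

    Uses the 2D nematic tensor Q = <2 n n^T - I>, whose positive eigenvalue
    is S = sqrt(<cos 2θ>^2 + <sin 2θ>^2). Angles θ and θ + π are equivalent.
    """
    angles = np.asarray(angles, dtype=float)
    c2 = np.mean(np.cos(2.0 * angles))
    s2 = np.mean(np.sin(2.0 * angles))
    S = float(np.hypot(c2, s2))
    director_angle = float(0.5 * np.arctan2(s2, c2))
    return S, director_angle

# Perfectly aligned filaments (including head-tail flipped ones) give S = 1.
aligned = [0.4, 0.4 + np.pi, 0.4]
# Orientations spread uniformly over [0, π) give S close to 0.
isotropic = np.linspace(0.0, np.pi, 100, endpoint=False)

print(nematic_order_2d(aligned))
print(nematic_order_2d(isotropic)[0])
```

 In a homogeneous sample, a quantity like S can then be correlated with the measured stress components to probe how active tension depends on order.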
 
 We should mention that reproducing the range of out-of-equilibrium mesoscale architectures predicted by our active gel model with agent-based simulations seems at present not possible, or at least significantly beyond the state-of-the-art. To our knowledge, these models have not been able to reproduce the heterogeneous nonequilibrium contractile states involving sustained self-reinforcing flows underlying the pattern formation mechanism studied in our work. The scope of the discrete network simulations has been clarified in lines 340 to 349 in the revised manuscript.
 
 While agent-based cytoskeletal simulations are very attractive because they directly connect with molecular mechanisms, active gel continuum models are better suited to describe out-of-equilibrium emergent hydrodynamics at a mesoscale. We believe that these two complementary modeling frameworks are rather disconnected in the literature, and for this reason, we have attempted to substantiate some aspects of our continuum modeling with discrete simulations. We have emphasized the complementarity of the two approaches in the conclusions.
 
 Reviewer #1 (Recommendations For The Authors):
 
 Questions on the theory:
 
 Does rho describe the density of actin or myosin? The authors say that they are modeling actomyosin material as a whole, but the actin and myosin should be modeled separately. Along similar lines, does Q define the ordering of actin or myosin?
 
 Active gel models of the actomyosin cytoskeleton have been formulated with independent densities for actin and for myosin or using a single density field, implicitly assuming a fixed stoichiometry. Super-resolution imaging of the actomyosin cytoskeleton also suggests that in principle it makes sense to consider different nematic fields for actin and for myosin filaments. In the revised manuscript, we now explicitly mention that our density and nematic field are effective descriptions of the entire actomyosin gel (lines 82-84).
 
 A more detailed model would entail additional material parameters, not available experimentally, which may help reproduce specific experiments but that would make the systematic study of the different behaviors much more difficult. Our approach has been to keep the model minimal meeting the fundamental requirements outlined in the first paragraphs of Section “Theoretical model”.
 
 Should the active stress depend on material density? It seems strange (from Eq. 3) that active stress could be non-zero even where density is zero, since sigma_act does not depend on rho.
 
 Yes, active stress is assumed to be proportional to density. Eq. 3 in the original manuscript was misleading (it was multiplied by rho in Eq. 2). In the revised manuscript, we have explained with a bit more detail the theoretical model, clarifying this point.
 
 The authors should clearly explain their rationale for retaining certain types of nonlinear terms while ignoring others in theory. For instance, the nonlinearities in the equations of motion are sometimes quadratic in the fields, while there are also some cubic terms. Please remark up to what order in the fields the various interactions are modeled.
 
 We thank the referee for raising this point. The nonlinearities in the theory are easily explained on the basis of a small number of choices. We have added a new paragraph towards the end of Section “Theoretical model” (lines 145 to 152) providing a rationale for the origin and underlying assumptions leading to different nonlinearities.
 
 To connect with experiments and the biological context, please explain the biological origin of various terms in the model: (1) L-dependent terms in Eq. 2 and 4, (2) Flow-alignment of nematic order and experimental evidence in support of it, (3) density-dependent susceptibility terms in Eq. 4
 
 (1) Unfortunately, the L-dependent terms are very bulky, but are very standard in nematic theories. The best way to understand their physical significance is through the expression of the nematic free-energy, which is now given and explained in the revised manuscript (Eq. 3). The resulting complicated expression for the molecular field and the nematic stress (Eqs. 4 and 5) are mathematical consequences of the choice of nematic free energy. In the revised manuscript, we also attempt to provide a basis for these terms in the context of the actin cytoskeleton. (2) To our knowledge, the best reference supporting this term from experiments is Reymann et al, eLife (2016). In the revised manuscript, we have provided a physical interpretation. (3) We have expanded the motivation and plausible microscopic justification of this term.
 
 There are different 'activity' terms in the model. Their biophysical origin is not made clear. For example, the authors should make clear if these activities arise from filament or motor activity. Relatedly, the authors should provide a comprehensive discussion of the signs of the different active parameters and their physical interpretations.
 
 In an active gel model, activity parameters are phenomenological and how they map to molecular mechanisms is not precisely known, although conventionally contractile active tension is ascribed to the mechanical transduction of chemical power by myosin motors. The fact is that, besides myosin activity, there are many nonequilibrium processes in the actomyosin cytoskeleton that may lead to active stresses including (de)polymerization of filaments or (un)binding of crosslinkers. In the revised manuscript, we have added sentences illustrating how different terms may result from microscopic mechanisms, but providing a precise mapping between our model and nonequilibrium dynamics of proteins is beyond the scope of our work, although our discrete network simulations address this issue to a certain degree.
 
 Following the suggestion of the referee, our description of the theory now discusses much more extensively the signs of activity parameters and their physical interpretations, e.g. the text following Eq. 7.
 
 Throughout the paper, various activity terms are varied independently of each other. Is that a reasonable assumption given that activities should depend on ATP and are thus not independent of one another?
 
 We agree that, ultimately, all active processes depend on the conversion of chemical energy into mechanical energy. However, recent work has highlighted how active tension also depends on the microscopic architecture of the network controlled by multiple regulators of the actomyosin cytoskeleton (e.g. Chugh et al, Nat Cell Biol, 2017). It is reasonable to expect that, for a given rate of ATP consumption, chemical power will be converted into mechanical power in different ways depending on the micro-architecture of the cytoskeleton, e.g. the stoichiometry of filaments, crosslinkers, myosins, or the length distribution of filaments (very long filaments crosslinked by myosins may be difficult to reorient but may contract efficiently).
 
 We have added a paragraph in Section “Theoretical model” with a discussion, lines 153 to 156.
 
 Sarcomeres are muscle fibers that exhibit alternating polarity pattern. Such patterning is not evident in what the authors call 'sarcomeres' in Fig. 2. I believe the authors should revise their terminology and not loosely interpret existing classifications in the field.
 
 We thank the referee for raising this point. We have changed the terminology.
 
 Fig 2a: Is the cartoon for filament alignment incorrect for kappa>0?
 
 The cartoon is correct. In the revised manuscript we have explained more clearly the physical meaning of kappa in the text following Eq. 7. In the caption of Fig. 1 and of Fig. 2a, we have also clarified that when the absolute value of kappa is <1, then active tension is positive in all directions.
 
 Within the section "Requirements for fibrillar and banded patterns", it will be useful to show the figures for varying the different active parameters in the main figures.
 
 We have followed the referee’s suggestion and moved Supp. Fig. 1 of the original manuscript to the main figures.
 
 How do the authors decide if bundles are contractile or extensile? Why are contractile bundles under tension while extensile bundles are under compression? I would expect the opposite.
 
 We agree that this point deserves a more detailed explanation. In the revised manuscript and in the new Figure 4, we further develop this point. The fibrillar pattern forms when kappa<0. We further assume that -1<kappa<0, so that active tension is positive in all directions. In this regime, the deviatoric (anisotropic) part of active tension is extensile. However, following pattern formation and because of the interplay between active and viscous stresses, the total stress in the emerging bundles may become extensile or contractile, depending on whether the largest component of stress is perpendicular or along the bundle axis. This is now presented in the updated figure, with new panels presenting maps of the total tension. The text discussing this point has been rewritten and we hope that the new version is much clearer (lines 280 to 303).
 
 A contractile bundle tends to shorten, but it cannot do so because of boundary conditions or the interaction with other bundles. As a result, it is in tension. Conversely, an extensile bundle tries to elongate, but being constrained, it becomes compressed. As an analogy, consider the cortex of a suspended cell. The cortex is contractile, but it cannot contract because of volume regulation in the cell, which is typically pressurized. As a result, tension in the cortex is positive, as shown by Laplace’s law [10.1016/j.tcb.2020.03.005]. We have tried to clarify this point in the revised manuscript.
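 The cortex analogy can be made quantitative with Laplace's law for a thin spherical shell, ΔP = 2T/R. The sketch below simply rearranges this relation; the function name is hypothetical and the numerical values are placeholders chosen for illustration, not measurements from the manuscript or the cited reference.

```python
def cortical_tension(delta_p, radius):
    """Surface tension T (N/m) of a spherical shell from Laplace's law.

    delta_p: excess internal pressure in Pa (positive when pressurized)
    radius:  shell radius in m
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    return delta_p * radius / 2.0

# Placeholder values: 100 Pa excess pressure, 10 micrometer radius.
tension = cortical_tension(delta_p=100.0, radius=10e-6)
print(tension)  # positive whenever the cell is pressurized
```

 The sign is the point of the analogy: a positive excess pressure implies a positive (tensile) cortical tension, even though the cortex itself is a contractile material.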
 
 Can the authors reproduce alternating density patterns using the cytosim simulations? This is an important step in establishing the correspondence between the continuum theory and the agent-based model.
 
 We have addressed this point in our response to public comment (F) of this referee.
 
 The authors do not provide code or data.
 
 The finite element code, with an input file required to run a representative simulation in the paper, is now made available, see Ref. [74].
 
 The customizations of Cytosim needed to account for nematic order in our discrete network simulations are available, see Ref. [98].
 
 Reviewer #2 (Public Review):
 
 Summary:
 
 The article by Waleed et al discusses the self organization of actin cytoskeleton using the theory of active nematics. Linear stability analysis of the governing equations and computer simulations show that the system is unstable to density fluctuations and self organized structures can emerge. While the context is interesting, I am not sure whether the physics is new. Hence I have reservations about recommending this article.
 
 We thank the referee for these comments. In the revised manuscript, we have highlighted the novelty, particularly in the last paragraph of the introduction, the first two paragraphs of Section “Theoretical model”, and in the conclusions. Despite a very large literature on theoretical models of stress fibers, actin rings, and active nematics, we argue that the active self-organization of dense nematic structures from an isotropic and low-density gel has not been compellingly explained so far. Many models assume from the outset the presence of actin bundles, or explain their formation using localized activity gradients. The literature of active nematics has extensively studied symmetry breaking and the self-organization. However, most of the works assume initial orientational order. Only a few works study the emergence of nematic order from a uniform isotropic state, but consider dry systems lacking hydrodynamic interactions or incompressible and density-independent systems [37,38]. Yet, pattern formation in actomyosin gels is characterized by large density variations, and by highly compressible flows, which coordinate in a mechanism relying on an advective instability and self-reinforcing flows.
 
 Our theoretical model is not particularly novel, and as we mention in the manuscript, it can be particularized to different models used in the literature. However, we argue that it has the right minimal features to capture nematic self-organization in actomyosin gels. To our knowledge, no previous study explains the emergence of dense and nematic structures from a low-density isotropic gel as a result of activity and involving the advective instability typical of symmetry-breaking and patterning in the actomyosin cytoskeleton. These are important qualitative features of our results that resonate with a large experimental record, and as such, we believe that our work provides a new and compelling mechanism relying on self-organization to explain the prominence and diversity of patterns involving dense nematic bundles in the actomyosin cytoskeleton across species.
 
 Strengths:
 
 (i) Analytical calculations complemented with simulations (ii) Theory for cytoskeletal network
 
 Weaknesses:
 
 Not placed in the context or literature on active nematics.
 
 We agree with the referee that this was a weakness of the original manuscript. In the revised manuscript, within reasonable space constraints given the size and dynamism of the field of active nematics, we have placed our work in the context of this field (end of introduction and first two paragraphs of Section “Theoretical model”). The published version of our companion manuscript [45] also contributes to providing a clear context to our theoretical model within the field.
 
 Reviewer #2 (Recommendations For The Authors):
 
 The article by Waleed et al discusses the self organization of actin cytoskeleton using the theory of active nematics. Linear stability analysis of the governing equations and computer simulations show that the system is unstable to density fluctuations and self organized structures can emerge. While the context is interesting, I am not sure whether the physics is new. Hence I have reservations about recommending this article. I explain my questions and comments below.
 
 We have responded to this comment above.
 
 (i) Active nematics including density variations have been dealt with quite extensively in the literature. For example, the works of Sriram Ramaswamy have dealt with this system including linear stability analysis, simulations etc. In what way is the present work different from the system that they have considered?
 
 (ii) Active flows leading to self organization have been a topic of discussion in many works. For example: (i) Annual Review of Fluid Mechanics, Vol. 43:637-659, 2010, https://doi.org/10.1146/annurev-fluid-121108-145434 (ii) S Santhosh, MR Nejad, A Doostmohammadi, JM Yeomans, SP Thampi, Journal of Statistical Physics 180, 699-709 (iii) M. G. Giordano, F. Bonelli, L. N. Carenza, G. Gonnella and G. Negro, Europhysics Letters, Volume 133, Number 5. In what way is this work different from any of these?
 
 (iii) I am confused about the models used in the paper. There is significant literature from Prof. Mike Cates group, Prof. Julia Yeomans group, Prof. Marchetti's group who all use similar governing equations. In the present paper, I find it hard to understand whether the model used is similar to the existing ones in literature or are there significant differences. It should be clarified.
 
 Response to (i), (ii) and (iii).
 
 We completely agree with this referee (and also the previous referee), that the contextualization of our work in the field of active nematics was very insufficient. In the revised manuscript, the last paragraph of the introduction and the first two paragraphs of Section “Theoretical model” now address this point. In short, previous active nematic models predicting patterns with density variations have been either for dry active matter (disregarding hydrodynamic interactions), or for suspensions of active particles moving in an incompressible flow. None of these previous works predict nematic pattern formation as a result of activity relying on the advective instability and self-reinforcing compressible flows, leading to high density and high order bundles surrounded by an isotropic low density phase. Yet, these are fundamental features observed in actomyosin gels. Many works deal with symmetry-breaking of a system with pre-existing order, but very few address how order emerges actively from an isotropic state. We thank the referee for pointing at the paper by Santhosh et al, which nicely makes this argument and is now cited. Our mechanism is fundamentally different from that in Santhosh et al, whose model is incompressible and ignores density variations.
 
 We hope that the revised manuscript addresses this important concern.
 
 (iv) Below Eqn 6, it starts by saying that the “...origin..is clear...” It's not. I don't understand the physical origin of the instability, and this should be clarified, maybe with some illustrations.
 
 We apologize for this unfortunate sentence, which we have rewritten in the revised manuscript (lines 181 to 185).
 
 Reviewer #3 (Public Review):
 
 The manuscript "Theory of active self-organization of dense nematic structures in the actin cytoskeleton" analyzes self-organized pattern formation within a two-dimensional nematic liquid crystal theory and uses microscopic simulations to test the plausibility of some of the conclusions drawn from that analysis. After performing an analytic linear stability analysis that indicates the possibility of patterning instabilities, the authors perform fully non-linear numerical simulations and identify the emergence of stripe-like patterning when anisotropic active stresses are present. Following a range of qualitative numerical observations on how parameter changes affect these patterns, the authors identify, besides isotropic and nematic stress, also active self-alignment as an important ingredient to form the observed patterns. Finally, microscopic simulations are used to test the plausibility of some of the conclusions drawn from continuum simulations.
 
 The paper is well written, figures are mostly clear and the theoretical analysis presented in both main text and supplement is rigorous. Mechano-chemical coupling has emerged in recent years as a crucial element of cell cortex and tissue organization and it is plausible to think that both isotropic and anisotropic active stresses are present within such effectively compressible structures. Even though not yet stated this way by the authors, I would argue that combining these two is one of the key ingredients that distinguishes this theoretical paper from similar ones. The diversity of patterning processes experimentally observed is nicely elaborated on in the introduction of the paper, though other closely related previous work could also have been included in these references (see below for examples).
 
 We thank the referee for these comments and for the suggestion to emphasize the interplay of isotropic and anisotropic active tension, which is possible only in a compressible gel, as mentioned in the revised manuscript. We have emphasized this point in different places in the revised manuscript. We also thank the referee for the suggestions to better connect with existing literature.
 
 To introduce the continuum model, the authors exclusively cite their own, unpublished pre-print, even though the final equations take the same form as previously derived and used by other groups working in the field of active hydrodynamics (a certainly incomplete list: Marenduzzo et al (PRL, 2007), Salbreux et al (PRL, 2009, cited elsewhere in the paper), Jülicher et al (Rep Prog Phys, 2018), Giomi (PRX, 2015),...). To make better contact with the broad active liquid crystal community and to delineate the present work more compellingly from existing results, it would be helpful to include a more comprehensive discussion of the background of the existing theoretical understanding on active nematics. In fact, I found it often agrees nicely with the observations made in the present work, an opportunity to consolidate the results that is sometimes currently missed out on. For example, it is known that self-organised active isotropic fluids form in 2D hexagonal and pulsatory patterns (Kumar et al, PRL, 2014), as well as contractile patches (Mietke et al, PRL 2019), just as shown and discussed in Fig. 2. It is also known that extensile nematics, \kappa<0 here, draw in material laterally of the nematic axis and expel it along the nematic axis (the other way around for \kappa>0, see e.g. Doostmohammadi et al, Nat Comm, 2018 "Active Nematics" for a review that makes this point), consistent with all relative nematic director/flow orientations shown in Figs. 2 and 3 of the present work.
 
 We thank the referee for these suggestions. Indeed, in the original submission we had outsourced much of the justification of the model and the relevant literature to a related pre-print, but this is not reasonable. The companion publication has now been accepted in the New Journal of Physics, with significant changes to better connect the work to the field of active nematics. A preprint reflecting those changes is available in Ref. [64], but we hope to reference the published paper that will come out soon.
 
 In the revised manuscript, we have significantly rewritten the Section “Theoretical model” to frame the continuum model in the context of the field of active nematics. While our model and results have commonalities with previous work, there are also important differences. We have highlighted the novelty of the present work along with the relation with previous studies and theoretical models in the last paragraph of the introduction and the first two paragraphs of Section “Theoretical model”. Furthermore, as suggested by the referee, we have made an effort to connect our results with previous work by Kumar, Mietke, Doostmohammadi and others.
 
 Regarding the last point alluded to by the referee (“extensile nematics, \kappa<0 here, draw in material laterally of the nematic axis and expel it along the nematic axis”), the picture raised by the referee would be nuanced for our compressible system as compared to the incompressible systems discussed in that reference. As we have elaborated in our response to point (D) of Referee #1, our systems are overall contractile (with positive active tension in all directions), but the deviatoric component of the active tension can be either extensile or contractile. In our “extensile” models (left in Fig. 2c), material is drawn in laterally to the nematic axis but it is not expelled along this axis. Instead, it is “expelled” by turnover. In the revised manuscript, we have added a comment about this.
 
 The results of numerical simulations are well-presented. Large parts of the discussion of numerical observations - specifically around Fig. 3 - are qualitative and it is not clear why the analysis is restricted to \kappa<0. Some of the observations resonate with recent discussions in the field, for example the observation of effectively extensile dynamics in a contractile system is interesting and reminiscent of ambiguities about extensile/contractile properties discussed in recent preprints (https://arxiv.org/abs/2309.04224). It is convincingly concluded that, besides nematic stress on top of isotropic one, active self-alignment is a key ingredient to produce the observed patterns.
 
 We thank the referee for these comments. We are reluctant to extend the detailed analysis of emergent architectures and dynamics to the case \kappa > 0 as it leads to architectures not observed, to our knowledge, in actin networks. In the revised manuscript, we have expanded and clarified the characterization of emergent contractile/extensile networks by reporting the relative magnitude of stress along and perpendicular to the nematic direction. Our revised manuscript clearly shows that even though all of our simulations describe locally contractile systems with extensile anisotropic active tension, the emergent meso-structures can be either extensile or contractile, with the extensile ones exhibiting the usual bend-type instability (a secondary instability in our system) described classically for extensile active nematic systems. We have rewritten the text discussing this (lines 280 to 303), where we have placed these results in the context of recent work reporting the nontrivial relation between the contractility/extensibility of the local units vs the nematic pattern.
 
 I compliment the authors for trying to gain further mechanistic insights into this conclusion with microscopic filament simulations that are diligently performed. It is rightfully stated that these simulations only provide plausibility tests and, within this scope, I would say the authors are successful. At the same time, it leaves open questions that could have been discussed more carefully. For example, I wonder what can be said about the regime \kappa>0 (which is dropped ad-hoc from Fig. 3 onward) microscopically, in which the continuum theory does also predict the formation of stripe patterns - besides the short comment at the very end? How does the spatial inhomogeneous organization the continuum theory predicts fit in the presented, microscopic picture and vice versa?
 
 We thank the referee for this compliment. We think that the point raised by the referee is very interesting. It is reasonable to expect that the sign of \kappa may not be a constant but rather depend on S and \rho. Indeed, for a sparse network with low order, the progressive bundling by crosslinkers acting on nearby filaments is likely to produce a large active stress perpendicular to the nematic direction, whereas in a dense and highly ordered region, myosin motors are more likely to effectively contract along the nematic direction, and there is little room for additional lateral contraction by further bundling. As discussed in our response to referee #1, we believe that studying the formation of patterns using the discrete network simulations is far beyond the scope of our work. We discuss in lines 332 to 341, as well as in the last paragraph of the conclusions, the scope and limitations of our discrete network simulations.
 
 Overall, the paper represents a valuable contribution to the field of active matter and, if strengthened further, might provide a fruitful basis to develop new hypotheses about the dynamic self-organisation of dense filamentous bundles in biological systems.
 
 Reviewer #3 (Recommendations For The Authors):
 
 The statement "the porous actin cytoskeleton is not a nematic liquid-crystal because it can adopt extended isotropic/low-order phases" is difficult to understand and should be clarified, as the next paragraph starts formulating a nematic active liquid crystal theory. Do the authors mean a crystal that "Tends to be in a disordered phase?", according to its equilibrium properties? It would still be a "nematic liquid crystal", only its ground state is not a nematic phase.
 
 We agree with the referee, and we hope that changes in the introduction and in Section “Theoretical model” address this comment.
 
 I could not find what Frank energy is precisely used, that would be helpful information.
 
 In the revised manuscript, we have provided the expression for the nematic free energy in Eq. 3.
 
 The significance of the green/purple arrows in the Fig. 2a sketch is unclear. Green arrows also appear in b and c; do they represent the same quantity? From the simulation images, it is overall very difficult to see how the flows are oriented near the high-density regions (i.e. whether they point towards or away from the strip).
 
 We thank the referee for bringing this up. The color coding of the sketches was confusing. The modified figures (Fig. 1(c) and Fig. 2(a)) now present a clearer and unified representation of anisotropic tension. The green arrows in Fig. 2(c) represent the out-of-equilibrium flows in the steady state. We agree that the zoom is insufficient to resolve the flow structure. For this reason, in the revised Fig. 2, we have added panels showing the flow at higher resolution.
 
 It is currently unclear how the linear stability results - beyond identification of the parameter \delta - inform any of the remaining manuscript. Quantitative comparisons of the various length scales seen in simulated patterns (e.g. Fig. 2b, 3c etc) with linear predictions and known characteristic length scales would be instructive mechanistically, would make the overall presentation more compelling, and would probe the limitations of the linear results.
 
 In the revised manuscript, we have provided further information so that the readers can appreciate the predictions and limitations of the linear stability results. We have added a sentence and a figure to show that, in addition to the critical activity, the linear theory provides a good prediction of the wavelength of the pattern. See lines 199 to 201.
 
 It is not clear what is meant by "[bundle-formation] requires that active tension perpendicular to nematic orientation is larger than along this direction", and therefore also not why that would be "counter-intuitive". If interpreted naively, I would say that a large tension brings in more filaments into the bundle, so that may well be an obviously helpful feature for bundle formation and maintenance. In any case, it would be helpful if clarity is improved throughout when arguments about "directions of tensions" are made.
 
 We have significantly rewritten the first paragraphs of section “Microscopic origin…” to clarify this point (lines 330 to 339). This rewritten text, along with other changes in the manuscript such as the explanation of Eq. 7 or the discussion about the stress anisotropy in the new version of Fig. 4 (see lines 280 to 303), provides a better explanation of this important point.
 
 All density color bars: Shouldn't they rather be labelled \rho/\rho_0?
 
 Yes! We have corrected this typo.
 
 Scalar product missing in caption definition of order parameter Fig. 2
 
 We have corrected this typo.
 
 Fig. 3a: I suggest to put the expression for q0 in the caption
 
 We have replaced q_0 with S_0 and clarified its meaning in the caption of what is now Fig. 4.
 
 The paragraph at the bottom right of page 6 should probably refer, in several places, to Fig. 3c(...) instead of Fig. 3b.
 
 We have corrected this typo.
 
 AuthorResponse

Tags

Review 1

Summary

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

arxiv.org/abs/2306.15352v4
www.biorxiv.org www.biorxiv.org

A network regularized linear model to infer spatial expression pattern for single cell

5
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  The study is useful for advancing spatial transcriptomics through its novel regression-based linear model (glmSMA) that integrates single-cell RNA-seq with spatial reference atlases, and its methodological framework is convincing. The approach demonstrates notable utility by enabling higher-resolution cell mapping across multiple biological systems and spatial platforms compared to existing tools.
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Liu et al., present glmSMA, a network-regularized linear model that integrates single-cell RNA-seq data with spatial transcriptomics, enabling high-resolution mapping of cellular locations across diverse datasets. Its dual regularization framework (L1 for sparsity and generalized L2 via a graph Laplacian for spatial smoothness) demonstrates robust performance of their model. It offers novel tools for spatial biology, despite some gaps in fully addressing spatial communication.
  
  The study presents a clear methodological framework that balances sparsity and smoothness, with parameter guidelines for different tissue contexts. It is commendable for its application to multiple spatial omics platforms, including both sequencing-based and imaging-based data, with results that can be generalized across both structured and less-structured tissues. After revision, there is a more transparent discussion of assumptions, including the correlation between expression and physical distance, and how performance may vary by tissue heterogeneity.
  
  Limitations are modest - the spatial communication application is mentioned but not fully developed, and resolution reporting is primarily qualitative, which may limit direct comparability between datasets. The imaging-based validation is currently limited to simulated or lower-plex data, and expansion to high-plex datasets would further support platform versatility, although this is not essential to the core claims.
  
  Overall, the manuscript delivers on its main objective, which is to present and validate a practical, flexible, and accurate framework for spatial mapping. The methods are clearly described, and the resource will be useful for researchers seeking to integrate single-cell and spatial datasets in diverse biological contexts.
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The author proposes a novel method for mapping single-cell data to specific locations with higher resolution than several existing tools.
  
  Strengths:
  
  The spatial mapping tests were conducted on various tissues, including the mouse cortex, human PDAC, and intestinal villus.
  
  Comments on revised version:
  
  The authors have sufficiently addressed all of my comments.
  
  Review 2
4. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  The authors have provided a thorough and constructive response to the comments. They effectively addressed concerns regarding the dependence on marker gene selection by detailing the incorporation of multiple feature selection strategies, such as highly variable genes and spatially informative markers (e.g., via Moran's I), which enhance glmSMA's robustness even when using gene-limited reference atlases.
  
  Furthermore, the authors thoughtfully acknowledged the assumption underlying glmSMA, namely that transcriptionally similar cells are spatially proximal, and discussed both its limitations and empirical robustness in heterogeneous tissues such as human PDAC. Their use of real-world, heterogeneous datasets to validate this assumption demonstrates the method's practical utility and adaptability.
  
  Overall, the response appropriately contextualizes the limitations while reinforcing the generalizability and performance of glmSMA. The authors' clarifications and experimental justifications strengthen the manuscript and address the reviewer's concerns in a scientifically sound and transparent manner.
  
  Review 3
5. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Author response:
  
  The following is the authors’ response to the original reviews.
  
  Reviewer #1 (Public review):
  
  Summary:
  
  Liu et al., present glmSMA, a network-regularized linear model that integrates single-cell RNA-seq data with spatial transcriptomics, enabling high-resolution mapping of cellular locations across diverse datasets. Its dual regularization framework (L1 for sparsity and generalized L2 via a graph Laplacian for spatial smoothness) demonstrates robust performance of their model and offers novel tools for spatial biology, despite some gaps in fully addressing spatial communication.
  
  Overall, the manuscript is commendable for its comprehensive benchmarking across different spatial omics platforms and its novel application of regularized linear models for cell mapping. I think this manuscript can be improved by addressing method assumptions, expanding the discussion on feature dependence and cell type-specific biases, and clarifying the mechanism of spatial communication.
  
  The conclusions of this paper are mostly well supported by data, but some aspects of model development and performance evaluation need to be clarified and extended.
  
  We are thankful for the positive comments and have made changes following the reviewer's advice, as detailed below.
  
  (1) What were the assumptions made behind the model? One of them could be the linear relationship between cellular gene expression and spatial location. In complex biological tissues, non-linear relationships could be present, and this would also vary across organ systems and species. Similarly, with regularization parameters, they can be tuned to balance sparsity and smoothness adequately but may not hold uniformly across different tissue types or data quality levels. The model also seems to assume independent errors with normal distribution and linear additive effects - a simplification that may overlook overdispersion or heteroscedasticity commonly observed in RNA-seq data.
  
  Thank you for this comment. We acknowledge that the non-linear relationships can be present in complex tissues and may not be fully captured by a linear model.
  
  Our choice of a linear model was guided by an investigation of the relationship in the current datasets, which include intestinal villus, mouse brain, and fly embryo. There is a linear correlation between expression distance and physical distance [Nitzan et al]. Within a given anatomical structure, cells in closer proximity exhibit more similar expression patterns (Fig. 3c). In tissues where non-linear relationships are more prevalent, such as the human PDAC sample, our mapping results remain robust. We acknowledge that we have not yet tested our algorithm in highly heterogeneous regions like the liver, and we plan to include such analyses in future work if necessary.
  
  Regarding the regularization parameters, we agree that the balance between sparsity and smoothness is sensitive to tissue-specific variation and data quality. In our current implementation, we explored a range of values to find robust defaults. Supplementary Figure 7 illustrates the regularization path for cell assignment in the fly embryo.
  
  The choice of L1 and L2 regularization parameters is crucial for balancing sparsity and smoothness in spatial mapping.
  
  For Structured Tissues (brain):
  
  Moderate L1 to ensure cells are localized.
  
  Small to moderate L2 to maintain local smoothness without blurring distinct regions.
  
  For Less Structured (PDAC):
  
  Slightly lower L1 to allow cells to be associated with multiple regions if boundaries are ambiguous.
  
  Higher L2 to stabilize mappings in noisy or mixed regions.
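
The trade-off between the two penalties can be illustrated with a small sketch. This is not the glmSMA implementation: it assumes an objective of the form 0.5*||y - Xw||^2 + lam1*||w||_1 + 0.5*lam2*w^T L w, where y holds one cell's marker-gene expression, each column of X is a reference spot, w is the cell's weight over spots, and L is a graph Laplacian linking neighbouring spots. The solver (proximal gradient descent), the toy chain geometry, and all function and variable names are illustrative assumptions.

```python
import numpy as np


def chain_laplacian(n_spots):
    """Graph Laplacian L = D - A for spots on a 1-D chain (toy geometry, assumed)."""
    A = np.zeros((n_spots, n_spots))
    idx = np.arange(n_spots - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (shrinks entries towards zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(X, y, L, lam1, lam2, w):
    """Assumed objective: squared error + L1 sparsity + Laplacian smoothness."""
    r = y - X @ w
    return 0.5 * (r @ r) + lam1 * np.abs(w).sum() + 0.5 * lam2 * (w @ L @ w)


def fit_network_regularized(X, y, L, lam1, lam2, n_iter=2000):
    """Minimise the assumed objective by proximal gradient descent (ISTA)."""
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    lip = np.linalg.norm(X, 2) ** 2 + lam2 * np.linalg.eigvalsh(L)[-1]
    step = 1.0 / lip
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) + lam2 * (L @ w)
        w = soft_threshold(w - step * grad, step * lam1)
    return w


# Toy data: 40 marker genes, 12 spots, one cell spread over three adjacent spots.
rng = np.random.default_rng(0)
n_genes, n_spots = 40, 12
X = rng.normal(size=(n_genes, n_spots))
true_w = np.zeros(n_spots)
true_w[4:7] = [0.3, 0.5, 0.2]
y = X @ true_w + 0.01 * rng.normal(size=n_genes)
L = chain_laplacian(n_spots)

# "Structured tissue" style setting: stronger L1, weaker L2.
w_localized = fit_network_regularized(X, y, L, lam1=0.5, lam2=0.1)
# "Less structured tissue" style setting: weaker L1, stronger L2.
w_smoothed = fit_network_regularized(X, y, L, lam1=0.05, lam2=5.0)
```

Comparing `w_localized` with `w_smoothed` on such toy data gives a feel for how a larger L1 weight concentrates a cell on few spots while a larger L2 weight spreads it over neighbouring ones; the specific penalty values here are arbitrary.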
  
  (2) The performance of glmSMA is likely sensitive to the number and quality of features used. With too few features, the model may struggle to anchor cells correctly due to insufficient discriminatory power, whereas too many features could lead to overfitting unless appropriately regularized. The manuscript briefly acknowledges this issue, but further systematic evaluation of how varying feature numbers affect mapping accuracy would strengthen the claims, particularly in settings where marker gene availability is limited. A simple way to show some of this would be testing on multiple spatial omics (imaging-based) platforms with varying panel sizes and organ systems. Related to this, based on the figures, it also seems like the performance varies by cell type. What are the factors that contribute to this? Variability in expression levels, RNA quantity/quality? Biases in the panel? Personally, I am also curious how this model can be used similarly/differently if we have a FISH-based, high-plex reference atlas. Additional explanation around these points would be helpful for the readers.
  
  Thank you for this thoughtful comment. The performance of our method is indeed sensitive to the number and quality of selected features. To optimize feature selection, we employed multiple strategies, including Moran’s I statistic, identification of highly variable genes, and the Seurat pipeline to detect anchor genes linking the spatial transcriptomics data with the reference atlas. The number of selected markers depends on the quality of the data. For high-quality datasets, fewer than 100 markers are typically sufficient for prediction. To select marker genes, we applied the following optional strategies:
  
  (1) Identifying highly variable genes (HVGs).
  
  (2) Calculating Moran’s I scores for all genes to assess spatial autocorrelation.
  
  (3) Generating anchor genes based on the integration of the reference atlas and scRNA-seq data using Seurat.
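
As an illustration of strategy (2), Moran's I for one gene can be computed from spot coordinates and expression values. The sketch below uses the standard definition I = (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2 with z the mean-centred expression; the binary k-nearest-neighbour weights, the 6x6 toy grid, and the function names are assumptions made for the example, not details taken from the manuscript.

```python
import numpy as np


def knn_weights(coords, k):
    """Binary spatial weights: w[i, j] = 1 if j is among the k nearest spots to i."""
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a spot is not its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]
    w = np.zeros_like(d)
    rows = np.repeat(np.arange(len(coords)), k)
    w[rows, nearest.ravel()] = 1.0
    return w


def morans_i(values, weights):
    """Moran's I spatial autocorrelation; NaN if the gene has zero variance."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    denom = (z ** 2).sum()
    if denom == 0 or w.sum() == 0:
        return float("nan")
    return (len(x) / w.sum()) * (z @ w @ z) / denom


# Toy 6x6 grid of spots with two synthetic "genes".
coords = np.array([(i, j) for i in range(6) for j in range(6)], dtype=float)
W = knn_weights(coords, k=4)
gradient_gene = coords[:, 1]  # smooth left-to-right gradient
checker_gene = np.where(coords.sum(axis=1) % 2 == 0, 1.0, -1.0)  # alternating pattern

i_gradient = morans_i(gradient_gene, W)
i_checker = morans_i(checker_gene, W)
```

Genes could then be ranked by their score and the top-scoring ones kept as spatially informative markers; the cut-off is a user choice and is not specified here.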
  
  We evaluated our method across diverse tissue types and platforms—including Slide-seq, 10x Visium, and Virtual-FISH—which represent both sequencing-based and imaging-based spatial transcriptomics technologies. Our model consistently achieved strong performance across these settings. It's worth noting that the performance of other methods, such as CellTrek [Wei et al] and novoSpaRc [Nitzan et al], also depends heavily on feature selection. In particular, performance degrades substantially when fewer features are used. For fair comparison across different methods, the same set of marker genes was used. Under this condition, our method outperformed the others based on KL divergence (Fig. 2b, Fig. 5g).
  
  To assess the effect of marker gene quantity, we randomly selected subsets of 2,000, 1,500, 1,000, 700, 500, and 200 markers from the original set. As the number of markers decreases, mapping performance declines, which is expected due to the reduction in available spatial information. This result underscores the general dependence of spatial mapping accuracy on both the number and quality of informative marker genes (Supplementary Fig. 10).
  
  We do not believe that the observed performance is directly influenced by cell type composition. Major cell types are typically well-defined, and rare cell types comprise only a small fraction of the dataset. For these rare populations, a single misclassification can disproportionately impact metrics like KL divergence due to small sample size. However, this does not necessarily indicate a systematic cell type–specific bias in the mapping. We incorporated a high-resolution Slide-seq dataset from the mouse hippocampus to evaluate the influence of cell type composition on the algorithm’s performance [Stickels et al., 2020]. Most cell types within the CA1, CA2, CA3, and DG regions were accurately mapped to their original anatomical locations (Fig. 5e, f, g).
  
  (3) Application 3 (spatial communication) in the graphical abstract appears relatively underdeveloped. While it is clear that the model infers spatial proximities, further explanation of how these mappings translate into insights into cell-cell communication networks would enhance the biological relevance of the findings.
  
  Thank you for this valuable feedback. We agree that further elaboration on the connection between spatial proximity and cell–cell communication would enhance the biological interpretation of our results. Our current model focuses on inferring spatial relationships; we may add cell–cell communication analyses in future work.
  
  (4) What is the final resolution of the model outputs? I am assuming this is dictated by the granularity of the reference atlas and the imposed sparsity via the L1 norm, but if there are clear examples that would be good. In figures (or maybe in practice too), cells seem to be assigned to small, contiguous patches rather than pinpoint single-cell locations, which is a pragmatic compromise given the inherent limitations of current spatial transcriptomics technologies. Clarification on the precise spatial scale (e.g., pixel or micrometer resolution) and any post-mapping refinement steps would be beneficial for the users to make informed decisions on the right bioinformatic tools to use.
  
  Thank you for the comment. For each cell, our algorithm generates a probability vector that indicates its likely spatial assignment along with coordinate information. In our framework, each cell is mapped to one or more spatial spots with associated probabilities. Depending on the amount of regularization through L1 and L2 norms, a cell may be localized to a small patch or distributed over a broader domain (Supplementary Fig. 5 & 7). For the 10x Visium data, we applied a repelling algorithm to enhance visualization [Wei et al]. If a cell’s original location is already occupied, it is reassigned to a nearby neighborhood to avoid overlap. The users can also see the entire regularization path by varying the penalty terms.
  
  Nitzan, M. et al. (2019) ‘Gene expression cartography’, Nature, 576(7785), pp. 132–137. doi:10.1038/s41586-019-1773-3.
  
  Wei, R. et al. (2022) ‘Spatial charting of single-cell transcriptomes in tissues’, Nature Biotechnology, 40(8), pp. 1190–1199. doi:10.1038/s41587-022-01233-1.
  
  Stickels, R.R. et al. (2020) ‘Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-SEQV2’, Nature Biotechnology, 39(3), pp. 313–319. doi:10.1038/s41587-020-0739-1.
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The author proposes a novel method for mapping single-cell data to specific locations with higher resolution than several existing tools.
  
  Strengths:
  
  The spatial mapping tests were conducted on various tissues, including the mouse cortex, human PDAC, and intestinal villus.
  
  Weakness:
  
  (1) Although the researchers claim that glmSMA seamlessly accommodates both sequencing-based and image-based spatial transcriptomics (ST) data, their testing primarily focused on sequencing-based ST data, such as Visium and Slide-seq. To demonstrate its versatility for spatial analysis, the authors should extend their evaluation to imaging-based spatial data.
  
  Thank you for the comment. We have tested our algorithm on the virtual FISH dataset from the fly embryo, which serves as an example of image-based spatial omics data (Fig. 4c). However, such datasets often contain a limited number of available genes. To address this, we will conduct additional testing on image-based data if needed. The Allen Brain Atlas provides high-quality ISH data, and we can select specific brain regions from this resource to further evaluate our algorithm if necessary [Lein et al]. Currently, we plan to focus more on the 10x Visium platform, as it supports whole-transcriptome profiling and offers a wide range of tissue samples for analysis.
  
  (2) The definition of "ground truth" for spatial distribution is unclear. A more detailed explanation is needed on how the "ground truth" was established for each spatial dataset and how it was utilized for comparison with the predicted distribution generated by various spatial mapping tools.
  
  Thank you for the comment. To clarify how ground truth is defined across different tissues, we provided the following details. Direct ground truth for cell locations is often unavailable in scRNA-seq data due to experimental constraints. To address this, we adopted alternative strategies for estimating ground truth in each dataset:
  
  10x Visium Data: We used the cell type distribution derived from spatial transcriptomics (ST) data as a proxy for ground truth. We then computed the KL divergence between this distribution and our model's predictions for performance assessment.
  
  Slide-seq Data: We validated predictions by comparing the expression of marker genes between the reconstructed and original spatial data.
  
  Fly Embryo Data: We used predicted cell locations from novoSpaRc as a reference for evaluating our algorithm.
  
  These strategies allowed us to evaluate model performance even in the absence of direct cell location data. In addition, we can apply multiple evaluation strategies within a single dataset.
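
The KL-divergence comparison described for the 10x Visium data can be sketched as follows. This is a minimal example under stated assumptions: it treats the "distribution" as the fraction of one cell type's cells falling in each region, adds a small pseudocount so that empty regions do not produce a division by zero, and uses made-up region labels and counts. The function names are hypothetical.

```python
import math
from collections import Counter


def region_distribution(labels, regions, pseudocount=1e-6):
    """Fraction of cells per region, smoothed so that no entry is exactly zero."""
    counts = Counter(labels)
    raw = [counts.get(region, 0) + pseudocount for region in regions]
    total = sum(raw)
    return [value / total for value in raw]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same regions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy example: where the cells of one cell type sit in the reference ST data
# versus where two hypothetical mapping methods place them.
regions = ["region_A", "region_B", "region_C"]
reference_labels = ["region_A"] * 50 + ["region_B"] * 30 + ["region_C"] * 20
close_mapping = ["region_A"] * 48 + ["region_B"] * 31 + ["region_C"] * 21
poor_mapping = ["region_A"] * 10 + ["region_B"] * 20 + ["region_C"] * 70

p_ref = region_distribution(reference_labels, regions)
kl_close = kl_divergence(p_ref, region_distribution(close_mapping, regions))
kl_poor = kl_divergence(p_ref, region_distribution(poor_mapping, regions))
```

A lower value indicates that the predicted distribution is closer to the proxy ground truth, which is the sense in which the benchmark comparisons above are read.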
  
  (3) In the analysis of spatial mapping results using intestinal villus tissue, only Figure 3d supports their findings. The researchers should consider adding supplemental figures illustrating the spatial distribution of single cells in comparison to the ground truth distribution to enhance the clarity and robustness of their investigation.
  
  Thank you for the comment. In the intestinal dataset, only six large domains were defined. As a result, the task for this dataset is relatively simple—each cell only needs to be assigned to one of the six domains. As the intestinal villus is a relatively simple tissue, most existing algorithms performed well on it. For this reason, we did not initially provide extensive details in the main text.
  
  (4) The spatial mapping tests were conducted on various tissues, including the mouse cortex, human PDAC, and intestinal villus. However, the original anatomical regions are not displayed, making it difficult to directly compare them with the predicted mapping results. Providing ground truth distributions for each tested tissue would enhance clarity and facilitate interpretation. For instance, in Figure 2a and Supplementary Figures 1 and 2, only the predicted mapping results are shown without the corresponding original spatial distribution of regions in the mouse cortex. Additionally, in Figure 3c, four anatomical regions are displayed, but it is unclear whether the figure represents the original spatial regions or those predicted by glmSMA. The authors are encouraged to clarify this by incorporating ground truth distributions for each tissue.
  
  Thank you for the comment. To improve visualization, we included anatomical structures alongside the mapping results in the next version, wherever such structures are available (e.g., mouse brain cortex, human PDAC sample, etc.). Major cell type assignments for the PDAC samples, along with anatomical structures, are shown in Supplementary Figure 9. Most of these cell types were correctly mapped to their corresponding anatomical regions.
  
  (5) The cell assignment results from the mouse hippocampus (Supplementary Figure 6) lack a corresponding ground truth distribution for comparison. DG and CA cells were evaluated solely based on the gene expression of specific marker genes. Additional analyses are needed to further validate the robustness of glmSMA's mapping performance on Slide-seq data from the mouse hippocampus.
  
  Thank you for the comment. The ground truth for DG and CA cells was not available. To better evaluate the model's performance, we computed the KL divergence between the original and predicted cell type distributions, following the same approach used for the 10x Visium dataset. We identified a higher-quality dataset for the mouse hippocampus and used it to evaluate our algorithm. Additionally, we employed KL divergence as an alternative strategy to validate and benchmark our results (Fig. 5e, f, g). Most CA cells, including CA1, CA2, and CA3 principal cells, were correctly assigned back to the CA region. Dentate principal cells were accurately mapped to the DG region (Fig. 5e, f).
  
  (6) The tested spatial datasets primarily consist of highly structured tissues with well-defined anatomical regions, such as the brain and intestinal villus. Tissues in which anatomical regions are not distinctly separated, such as the liver, were not evaluated. Further evaluation of such tissues would help determine the method's broader applicability.
  
  Thank you for the insightful comment. We agree that many spatial datasets used in our study are from tissues with well-defined anatomical regions. To address the applicability of glmSMA in tissues without clearly separated anatomical structures, we applied glmSMA to the Drosophila embryo, which represents a tissue with relatively continuous spatial patterns and lacks well-demarcated anatomical boundaries compared to organs like the brain or intestinal villus.
  
  Despite this less structured spatial organization, glmSMA demonstrated robust performance in the fly embryo, accurately mapping cells to their correct spatial spots based on gene expression profiles. This result indicates that glmSMA is not strictly limited to highly structured tissues and can generalize to tissues with more continuous or gradient-like spatial architectures. These results suggest that glmSMA has broader applicability beyond highly compartmentalized tissues.
  
  Lein, E., Hawrylycz, M., Ao, N. et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168–176 (2007). https://doi.org/10.1038/nature05453
  
  Reviewer #3 (Public review):
  
  The authors aim to develop glmSMA, a network-regularized linear model that accurately infers spatial gene expression patterns by integrating single-cell RNA sequencing data with spatial transcriptomics reference atlases. Their goal is to reconstruct the spatial organization of individual cells within tissues, overcoming the limitations of existing methods that either lack spatial resolution or sensitivity.
  
  Strengths:
  
  (1) Comprehensive Benchmarking:
  
  Compared against CellTrek and novoSpaRc, glmSMA consistently achieved lower Kullback-Leibler divergence (KL divergence) scores, indicating better cell assignment accuracy.
  
  Outperformed CellTrek in mouse cortex mapping (90% accuracy vs. CellTrek's 60%) and provided more spatially coherent distributions.
  
  (2) Experimental Validation with Multiple Real-World Datasets:
  
  The study used multiple biological systems (mouse brain, Drosophila embryo, human PDAC, intestinal villus) to demonstrate generalizability.
  
  Validation through correlation analyses, Pearson's coefficient, and KL divergence supports the accuracy of glmSMA's predictions.
  
  We thank reviewer #3 for their positive feedback and thoughtful recommendations.
  
  Weaknesses:
  
  (1) The accuracy of glmSMA depends on the selection of marker genes, which might be limited by current FISH-based reference atlases.
  
  We agree that the accuracy of glmSMA is influenced by the selection of marker genes, and that current FISH-based reference atlases may offer a limited gene set. To address this, we incorporate multiple feature selection strategies, including highly variable genes and spatially informative genes (e.g., via Moran’s I), to optimize performance within the available gene space. As more comprehensive reference atlases become available, we expect the model’s accuracy to improve further.
  
  (2) glmSMA operates under the assumption that cells with similar gene expression profiles are likely to be physically close to each other in space, which may not be true in various heterogeneous environments.
  
  Thank you for raising this important point. We agree that glmSMA operates under the assumption that cells with similar gene expression profiles tend to be spatially proximal, and this assumption may not strictly hold in highly heterogeneous tissues where spatial organization is less coupled to transcriptional similarity.
  
  To address this concern, we specifically tested glmSMA on human PDAC samples, which represent moderately heterogeneous environments characterized by complex tumor microenvironments, including a mixture of ductal cells, cancer cells, stromal cells, and other components. Despite this heterogeneity, glmSMA successfully mapped major cell types to their expected anatomical regions, demonstrating that the method is robust even in the presence of substantial cellular diversity and spatial complexity.
  
  This result suggests that while glmSMA relies on the assumption of spatial-transcriptomic correlation, the method can tolerate a reasonable degree of spatial heterogeneity without a significant loss of performance. Nevertheless, we acknowledge that in extremely disorganized or highly mixed tissues where transcriptional similarity is decoupled from spatial proximity, the performance may be affected.
  
  AuthorResponse

Tags

Summary

Review 1

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.11.20.624541v2
www.biorxiv.org www.biorxiv.org

Single-cell profiling of trabecular meshwork identifies mitochondrial dysfunction in a glaucoma model that is protected by vitamin B3 treatment

4
1. Public_Reviews 10 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This study provides a fundamental advancement in our understanding of trabecular meshwork cell diversity and its role in eye pressure regulation and glaucoma using multimodal single-cell analysis, spatial validation, and functional testing that go beyond the current state-of-the-art. The study demonstrates that mitochondrial dysfunction, specifically in one of three distinct cell subtypes (TM3), contributes to elevated IOP in a genetic mouse model of glaucoma carrying a mutation in the transcription factor Lmx1b. While the identification of TM3 cells as metabolically specialized is compelling, there is somewhat limited evidence linking mitochondrial dysfunction to the Lmx1b mutation in TM3 cells.
 
 Summary
2. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This study provides a comprehensive single-cell and multiomic characterization of trabecular meshwork (TM) cells in the mouse eye, a structure critical to intraocular pressure (IOP) regulation and glaucoma pathogenesis. Using scRNA-seq, snATAC-seq, immunofluorescence, and in situ hybridization, the authors identify three transcriptionally and spatially distinct TM cell subtypes. The study further demonstrates that mitochondrial dysfunction specifically in one subtype (TM3) contributes to elevated IOP in a genetic mouse model of glaucoma carrying a mutation in the transcription factor Lmx1b. Importantly, treatment with nicotinamide (vitamin B3), known to support mitochondrial health, prevents IOP elevation in this model. The authors also link their findings to human datasets, suggesting the existence of analogous TM3-like cells with potential relevance to human glaucoma.
 
 Strengths:
 
 The study is methodologically rigorous, integrating single-cell transcriptomic and chromatin accessibility profiling with spatial validation and in vivo functional testing. The identification of TM subtypes is consistent across mouse strains and institutions, providing robust evidence of conserved TM cell heterogeneity. The use of a glaucoma model to show subtype-specific vulnerability, combined with a therapeutic intervention, gives the study strong mechanistic and translational significance. The inclusion of chromatin accessibility data adds further depth by implicating active transcription factors such as LMX1B, a gene known to be associated with glaucoma risk. The integration with human single-cell datasets enhances the potential relevance of the findings to human disease.
 
 Weaknesses:
 
 Although the LMX1B transcription factor is implicated as a key regulator in TM3 cells, its role in directly controlling mitochondrial gene expression is not fully explored. Additional analysis of motif accessibility or binding enrichment near relevant target genes could substantiate this mechanistic link. The therapeutic effect of vitamin B3 is clearly demonstrated phenotypically, but the underlying cellular and molecular mechanisms remain somewhat underdeveloped; for instance, changes in mitochondrial function, oxidative stress markers, or NAD+ levels are not directly measured. While the human relevance of TM3 cells is suggested through marker overlap, more quantitative approaches, such as cell identity mapping or gene signature scoring in human datasets, would strengthen the translational connection.
 
 Overall, this is a compelling and carefully executed study that offers significant advances in our understanding of TM cell biology and its role in glaucoma. The integration of multimodal data, disease modeling, and therapeutic testing represents a valuable contribution to the field. With additional mechanistic depth, the study has the potential to become a foundational resource for future research into IOP regulation and glaucoma treatment.
 
 Review 1
3. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 In this study, the authors perform multimodal single-cell transcriptomic and epigenomic profiling of 9,394 mouse TM cells, identifying three transcriptionally distinct TM subtypes with validated molecular signatures. TM1 cells are enriched for extracellular matrix genes, TM2 for secreted ligands supporting Schlemm's canal, and TM3 for contractile and mitochondrial/metabolic functions. The transcription factor LMX1B, previously linked to glaucoma, shows the highest expression in TM3 cells and appears to regulate mitochondrial pathways. In Lmx1bV265D mutant mice, TM3 cells exhibit transcriptional signs of mitochondrial dysfunction associated with elevated IOP. Notably, vitamin B3 treatment significantly mitigates IOP elevation, suggesting a potential therapeutic avenue. This is an excellent and collaborative study involving investigators from two institutions, offering the most detailed single-cell transcriptomic and epigenetic profiling of the mouse limbal tissues, including both TM and Schlemm's canal (SC), from wild-type and Lmx1bV265D mutant mice. The study defines three TM subtypes and characterizes their distinct molecular signatures, associated pathways, and transcriptional regulators. The authors also compare their dataset with previously published murine and human studies, including those by Van Zyl et al., providing valuable cross-species insights.
 
 Strengths:
 
 (1) Comprehensive dataset with high single-cell resolution
 
 (2) Use of multiple bioinformatic and cross-comparative approaches
 
 (3) Integration of 3D imaging of TM and SC for anatomical context
 
 (4) Convincing identification and validation of three TM subtypes using molecular markers.
 
 Weaknesses:
 
 (1) Insufficient evidence linking mitochondrial dysfunction to TM3 cells in Lmx1bV265D mice: While the identification of TM3 cells as metabolically specialized and Lmx1b-enriched is compelling, the proposed link between Lmx1b mutation and mitochondrial dysfunction remains underdeveloped. It is unclear whether mitochondrial defects are a primary consequence of Lmx1b-mediated transcriptional dysregulation or a secondary response to elevated IOP. Although authors have responded to this, the manuscript is not sufficiently altered to address these points. I would like to suggest that authors tone down mitochondrial connection with Lmx1b from the title and abstract, and clearly discuss that these events are associated, and future work is needed to dissect the role of mitochondria in this pathway. Furthermore, the protective effects of nicotinamide (NAM) are interpreted as evidence of mitochondrial involvement, but no direct mitochondrial measurements (e.g., immunostaining, electron microscopy, OCR assays) are provided. It is essential to validate mitochondrial dysfunction in TM3 cells using in vivo functional assays to support the central conclusion of the paper. Without this, the claim that mitochondrial dysfunction drives IOP elevation in Lmx1bV265D mice remains speculative. Alternatively, authors should consider revising their claims that mitochondrial dysfunction in these mice is a central driver of TM dysfunction.
 
 (2) Mechanism of NAM-mediated protection is unclear: The manuscript states that NAM treatment prevents IOP elevation in Lmx1bV265D mice via metabolic support, yet no data are shown to confirm that NAM specifically rescues mitochondrial function. Do NAM-treated TM3 cells show improved mitochondrial integrity? Are reactive oxygen species (ROS) reduced? Does NAM also protect RGCs from glaucomatous damage? Addressing these points would clarify whether the therapeutic effects of NAM are indeed mitochondrial.
 
 Review 2
4. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This study provides a comprehensive single-cell and multiomic characterization of trabecular meshwork (TM) cells in the mouse eye, a structure critical to intraocular pressure (IOP) regulation and glaucoma pathogenesis. Using scRNA-seq, snATAC-seq, immunofluorescence, and in situ hybridization, the authors identify three transcriptionally and spatially distinct TM cell subtypes. The study further demonstrates that mitochondrial dysfunction, specifically in one subtype (TM3), contributes to elevated IOP in a genetic mouse model of glaucoma carrying a mutation in the transcription factor Lmx1b. Importantly, treatment with nicotinamide (vitamin B3), known to support mitochondrial health, prevents IOP elevation in this model. The authors also link their findings to human datasets, suggesting the existence of analogous TM3-like cells with potential relevance to human glaucoma.
 
 Strengths:
 
 The study is methodologically rigorous, integrating single-cell transcriptomic and chromatin accessibility profiling with spatial validation and in vivo functional testing. The identification of TM subtypes is consistent across mouse strains and institutions, providing robust evidence of conserved TM cell heterogeneity. The use of a glaucoma model to show subtype-specific vulnerability, combined with a therapeutic intervention, gives the study strong mechanistic and translational significance. The inclusion of chromatin accessibility data adds further depth by implicating active transcription factors such as LMX1B, a gene known to be associated with glaucoma risk. The integration with human single-cell datasets enhances the potential relevance of the findings to human disease.
 
 We thank the reviewers for their thorough reading of our manuscript and helpful comments.
 
 Weaknesses:
 
 (1) Although the LMX1B transcription factor is implicated as a key regulator in TM3 cells, its role in directly controlling mitochondrial gene expression is not fully explored. Additional analysis of motif accessibility or binding enrichment near relevant target genes could substantiate this mechanistic link.
 
 We show that the Lmx1b mutation induces mitochondrial dysfunction with mitochondrial gene expression changes, but we agree with the referee that we do not show direct regulation of mitochondrial genes by LMX1B. Emerging data suggest that LMX1B regulates the expression of mitochondrial genes in other cell types [1, 2], making the direct link reasonable. Future work that is beyond the scope of the current paper will focus on sequencing cells at earlier timepoints to help distinguish gene expression changes associated with the V265D mutation from those secondary to ongoing disease and elevated IOP. Additional studies, including ATAC-seq at more ages, ChIP-seq and/or Cut and Run/Tag (in TM cells), will be necessary to directly investigate LMX1B target genes.
 
 As we studied adult mice, mitochondrial gene expression changes could be secondary to other disease-induced stresses. Because we did not intend to say we have shown a direct link, we have now added a sentence to the discussion to ensure clarity.
 
 Lines 932-934: “Although our studies show a clear effect of the Lmx1b mutation on mitochondria, future studies are needed to determine if LMX1B directly modulates mitochondrial genes in V265D mutant TM cells”
 
 (2) The therapeutic effect of vitamin B3 is clearly demonstrated phenotypically, but the underlying cellular and molecular mechanisms remain somewhat underdeveloped - for instance, changes in mitochondrial function, oxidative stress markers, or NAD+ levels are not directly measured.
 
 We agree that further experiments towards a fuller mechanistic understanding of vitamin B3’s therapeutic effects are needed. Such experiments are planned but are beyond the scope of this paper, which is already very large (7 Figures and 16 Supplemental Figures).
 
 (3) While the human relevance of TM3 cells is suggested through marker overlap, more quantitative approaches, such as cell identity mapping or gene signature scoring in human datasets, would strengthen the translational connection.
 
 We appreciate the reviewer’s suggestion and agree that additional quantitative analyses will further strengthen the translational relevance of TM3 cells. It is not yet clear if humans have a direct TM3 counterpart or if TM cell roles are compartmentalized differently between human cell types. We are currently limited in our ability to perform these comparative analyses. Specifically, we were unable to obtain permission to use the underlying dataset from Patel et al., and our access to the Van Zyl et al. dataset was through the Single Cell Portal, which does not support more complex analyses (e.g., cell identity mapping or gene signature scoring). Differences between human studies themselves also affect these comparisons. Future work aimed at resolving differences and standardizing human TM cell annotations, as well as cross-species comparisons, is needed (working groups exist and this ongoing effort supports 3 human TM cell subtypes, as also reported by Van Zyl). This is beyond what we are currently able to do for this paper. We present a comprehensive assessment using readily available published resources.
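 For readers unfamiliar with the term, one simple form of the "gene signature scoring" mentioned above can be sketched as follows. This is a minimal illustration only, not the method used by the authors or by any of the cited human studies: it assumes a score defined as the mean per-gene z-score across a marker set, and the function name `signature_scores`, the gene names, and all expression values are hypothetical placeholders.

```python
from statistics import mean, pstdev


def signature_scores(expression, signature_genes):
    """Score each cell as the mean z-score of the signature genes.

    expression: dict mapping gene name -> list of per-cell expression
                values (every list has the same length and cell order).
    signature_genes: iterable of gene names. Genes missing from
                     `expression`, or with zero variance, are skipped.
    """
    n_cells = len(next(iter(expression.values())))
    z_rows = []
    for gene in signature_genes:
        values = expression.get(gene)
        if values is None:
            continue  # gene not measured in this dataset
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # uninformative gene, cannot be z-scored
        z_rows.append([(v - mu) / sd for v in values])
    if not z_rows:
        raise ValueError("no usable signature genes")
    # Average the z-scores gene-wise to get one score per cell.
    return [mean(row[i] for row in z_rows) for i in range(n_cells)]


# Hypothetical toy matrix: 3 genes x 4 cells (values are made up).
toy_expression = {
    "GeneA": [5.0, 4.0, 0.0, 1.0],
    "GeneB": [3.0, 3.5, 0.5, 0.0],
    "GeneC": [1.0, 1.0, 1.0, 1.0],  # zero variance, skipped
}
scores = signature_scores(
    toy_expression, ["GeneA", "GeneB", "GeneC", "GeneD"]
)
```

 In this toy input the first two cells express the placeholder markers strongly and so receive higher scores than the last two. Published scoring schemes typically also correct for background expression, which this sketch omits.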
 
 Reviewer #2 (Public review):
 
 Summary:
 
 This elegant study by Tolman and colleagues provides fundamental findings that substantially advance our knowledge of the major cell types within the limbus of the mouse eye, focusing on the aqueous humor outflow pathway. The authors used single-cell and single-nuclei RNAseq to very clearly identify 3 subtypes of the trabecular meshwork (TM) cells in the mouse eye, with each subtype having unique markers and proposed functions. The U. Columbia results are strengthened by an independent replication in a different mouse strain at a separate laboratory (Duke). Bioinformatics analyses of these expression data were used to identify cellular compartments, molecular functions, and biological processes. Although there were some common pathways among the 3 subtypes of TM cells (e.g., ECM metabolism), there also were distinct functions. For example:
 
 TM1 cell expression supports heavy engagement in ECM metabolism and structure, as well as TGFb2 signaling.
 
 TM2 cells were enriched in laminin and pathways involved in phagocytosis, lysosomal function, and antigen expression, as well as End3/VEGF/angiopoietin signaling.
 
 TM3 cells were enriched in actin binding and mitochondrial metabolism.
 
 They used high-resolution immunostaining and in situ hybridization to show that these 3 TM subtypes express distinct markers and occupy distinct locations within the TM tissue. The authors compared their expression data with other published scRNAseq studies of the mouse as well as the human aqueous outflow pathway. They used ATAC-seq to map open chromatin regions in order to predict transcription factor binding sites. Their results were also evaluated in the context of human IOP and glaucoma risk alleles from published GWAS data, with interesting and meaningful correlations. Although not discussed in their manuscript, their expression data support other signaling pathways/ proteins/ genes that have been implicated in glaucoma, including: TGFb2, BMP signaling (including involvement of ID proteins), MYOC, actin cytoskeleton (CLANs), WNT signaling, etc.
 
 In addition to these very impressive data, the authors used scRNAseq to examine changes in TM cell gene expression in the mouse glaucoma model of mutant Lmx1b-induced ocular hypertension. In man, LMX1B is associated with Nail-Patella syndrome, which can include the development of glaucoma, demonstrating the clinical relevance of this mouse model. Among the gene expression changes detected, TM3 cells had altered expression of genes associated with mitochondrial metabolism. The authors used their previous experience using nicotinamide to metabolically protect DBA/2J mice from glaucomatous damage, and they hypothesized that nicotinamide supplementation of mutant Lmx1b mice would help restore normal mitochondrial metabolism in the TM and prevent Lmx1b-mediated ocular hypertension. Adding nicotinamide to the drinking water significantly prevented Lmx1b mutant mice from developing high intraocular pressure. This is a laudable example of dissecting the molecular pathogenic mechanisms responsible for a disease (glaucoma) and then discovering and testing a potential therapy that directly intervenes in the disease process and thereby protects from the disease.
 
 Strengths:
 
 There are numerous strengths in this comprehensive study including:
 
 Deep scRNA sequencing that was confirmed by an independent dataset in another mouse strain at another university.
 
 Identification and validation of molecular markers for each mouse TM cell subset along with localization of these subsets within the mouse aqueous outflow pathway.
 
 Rigorous bioinformatics analysis of these data as well as comparison of the current data with previously published mouse and human scRNAseq data.
 
 Correlating their current data with GWAS glaucoma and IOP "hits".
 
 Discovering gene expression changes in the 3 TM subgroups in the mouse mutant Lmx1b model of glaucoma.
 
 Further pursuing the indication of dysfunctional mitochondrial metabolism in TM3 cells from Lmx1b mutant mice to test the efficacy of dietary supplementation with nicotinamide. The authors nicely demonstrate the disease-modifying efficacy of nicotinamide in preventing IOP elevation in these Lmx1b mutant mice, preventing the development of glaucoma. These results have clinical implications for new glaucoma therapies.
 
 We thank the reviewer for these generous and thoughtful comments on the strengths of this study.
 
 Weaknesses:
 
 (1) Occasional over-interpretation of data. The authors have used changes in gene expression (RNAseq) to implicate functions and signaling pathways. For example: they have not directly measured "changes in metabolism", "mitochondrial dysfunction" or "activity of Lmx1b".
 
 We thank the reviewer for this feedback. We did not intend to overstate and agree. Our gene expression changes support, but do not by themselves prove, metabolic disturbances. We had felt that this was obvious and did not want to clutter the text. We have revised the manuscript to clarify that our conclusions about metabolic changes and LMX1B activity are based on gene expression patterns rather than direct functional assays and have added EM data (see below under “Recommendations for the authors”).
 
 We have also added the following to the results:
 
 Lines 715-721: “Although the documented gene expression changes strongly suggest metabolic and mitochondrial dysfunction, they do not directly prove it. Using electron microscopy to directly evaluate mitochondria in the TM, we found a reduction in total mitochondria number per cell in mutants (P = 0.015, Figure 6G). In addition, mitochondria in mutants had increased area and reduced cristae (inner membrane folds) in mutants consistent with mitochondrial swelling and metabolic dysfunction (all P < 0.001 compared to WT, Figure 6G-H).”
 
 More detailed EM and metabolic studies are underway but are beyond the scope of this paper.
 
 (2) In their very thorough data set, there is enrichment of, or changes in, gene expression that support other pathways previously reported to be associated with glaucoma (such as TGFb2, BMP signaling, actin cytoskeletal organization (CLANs), WNT signaling, ossification, etc.). Not discussing these appears to be a lost opportunity to further enhance the significance of this work.
 
 We appreciate the reviewer’s suggestions for enhancing the relevance of our work. We had not initially discussed this due to length concerns. We have now incorporated some of this information into the manuscript (see below under “Recommendations for the authors”).
 
 Reviewer #3 (Public review):
 
 Summary: In this study, the authors perform multimodal single-cell transcriptomic and epigenomic profiling of 9,394 mouse TM cells, identifying three transcriptionally distinct TM subtypes with validated molecular signatures. TM1 cells are enriched for extracellular matrix genes, TM2 for secreted ligands supporting Schlemm's canal, and TM3 for contractile and mitochondrial/metabolic functions. The transcription factor LMX1B, previously linked to glaucoma, shows the highest expression in TM3 cells and appears to regulate mitochondrial pathways. In Lmx1bV265D mutant mice, TM3 cells exhibit transcriptional signs of mitochondrial dysfunction associated with elevated IOP. Notably, vitamin B3 treatment significantly mitigates IOP elevation, suggesting a potential therapeutic avenue.
 
 This is an excellent and collaborative study involving investigators from two institutions, offering the most detailed single-cell transcriptomic and epigenetic profiling of the mouse limbal tissues, including both TM and Schlemm's canal (SC), from wild-type and Lmx1bV265D mutant mice. The study defines three TM subtypes and characterizes their distinct molecular signatures, associated pathways, and transcriptional regulators. The authors also compare their dataset with previously published murine and human studies, including those by Van Zyl et al., providing valuable cross-species insights.
 
 Strengths:
 
 (1) Comprehensive dataset with high single-cell resolution
 
 (2) Use of multiple bioinformatic and cross-comparative approaches
 
 (3) Integration of 3D imaging of TM and SC for anatomical context
 
 (4) Convincing identification and validation of three TM subtypes using molecular markers.
 
 We thank the reviewer for their comments on the strengths of this study.
 
 Weaknesses:
 
 (1) Insufficient evidence linking mitochondrial dysfunction to TM3 cells in Lmx1bV265D mice: While the identification of TM3 cells as metabolically specialized and Lmx1b-enriched is compelling, the proposed link between Lmx1b mutation and mitochondrial dysfunction remains underdeveloped. It is unclear whether mitochondrial defects are a primary consequence of Lmx1b-mediated transcriptional dysregulation or a secondary response to elevated IOP. Additional evidence is needed to clarify whether Lmx1b directly regulates mitochondrial genes (e.g., via ChIP-seq, motif analysis, or ATAC-seq), or whether mitochondrial changes are downstream effects.
 
 We agree and refer the reviewer to our responses to the other referees including Reviewer 1, Comment 1 and Reviewer 2 comments 1 and 17. As noted there, these mechanistic questions are the focus of ongoing and future studies. We have revised the text where appropriate to ensure it accurately reflects the scope of our current data.
 
 (2) Furthermore, the protective effects of nicotinamide (NAM) are interpreted as evidence of mitochondrial involvement, but no direct mitochondrial measurements (e.g., immunostaining, electron microscopy, OCR assays) are provided. It is essential to validate mitochondrial dysfunction in TM3 cells using in vivo functional assays to support the central conclusion of the paper. Without this, the claim that mitochondrial dysfunction drives IOP elevation in Lmx1bV265D mice remains speculative. Alternatively, authors should consider revising their claims that mitochondrial dysfunction in these mice is a central driver of TM dysfunction.
 
 We again refer the reviewer to our other responses, including Reviewer 1, Comment 1 and Reviewer 2 comments 1 and 17.
 
 (3) Mechanism of NAM-mediated protection is unclear: The manuscript states that NAM treatment prevents IOP elevation in Lmx1bV265D mice via metabolic support, yet no data are shown to confirm that NAM specifically rescues mitochondrial function. Do NAM-treated TM3 cells show improved mitochondrial integrity? Are reactive oxygen species (ROS) reduced? Does NAM also protect RGCs from glaucomatous damage? Addressing these points would clarify whether the therapeutic effects of NAM are indeed mitochondrial.
 
 We refer the reviewer to our response to Reviewer 1, Comment 2.
 
 (4) Lack of direct evidence that LMX1B regulates mitochondrial genes: While transcriptomic and motif accessibility analyses suggest that LMX1B is enriched in TM3 cells and may influence mitochondrial function, no mechanistic data are provided to demonstrate direct regulation of mitochondrial genes. Including ChIP-seq data, motif enrichment at mitochondrial gene loci, or perturbation studies (e.g., Lmx1b knockout or overexpression in TM3 cells) would greatly strengthen this central claim.
 
 We refer the reviewer to our response to Reviewer 1, Comment 1.
 
 (5) Focus on LMX1B in Fig. 5F lacks broader context: Figure 5F shows that several transcription factors (TFs), including Tcf21, Foxs1, Arid3b, Myc, Gli2, Patz1, Plag1, Npas2, Nr1h4, and Nfatc2, exhibit stronger positive correlations or motif accessibility changes than LMX1B. Yet the manuscript focuses almost exclusively on LMX1B. The rationale for this focus should be clarified, especially given LMX1B's relatively lower ranking in the correlation analysis. Were the functions of these other highly ranked TFs examined or considered in the context of TM biology or glaucoma? Discussing their potential roles would enhance the interpretation of the transcriptional regulatory landscape and demonstrate the broader relevance of the findings.
 
 Our analysis (Figure 5F) indicates that Lmx1b is the transcription factor most strongly associated with its predicted target gene expression across all TM cells, as reflected by its highest value along the X-axis. While other transcription factors exhibit greater motif accessibility (Y-axis), this likely reflects their broader expression across TM subtypes. In contrast, Lmx1b is minimally expressed in TM1 and TM2 cells, which may account for its lower motif accessibility overall (motifs not accessible in cells where Lmx1b is not / minimally expressed).
 
 Our emphasis on LMX1B is further supported by its direct genetic association with glaucoma. In contrast, the other transcription factors lack clear links to glaucoma and are supported primarily by indirect evidence. Nonetheless, we agree that the transcription factors highlighted in our analysis are promising candidates for future investigation. However, to maintain focus on the central narrative of this study, we have chosen not to include an extended discussion of these additional genes.
 
 (6) In the abstract, the authors report 9,394 wild-type TM cell transcriptomes. The number of Lmx1bV265D/+ TM cell transcriptomes analyzed is not provided. This information is essential for evaluating the comparative analysis and should be clearly stated in the Abstract and again in the main text (e.g., lines 121-123). Including both wild-type and mutant cell counts will help readers assess the balance and robustness of the dataset.
 
 We thank the reviewer for noticing this oversight and have added this value to the abstract and results section.
 
 Lines 41 and 696: 2,491 mutant TM cells.
 
 (7) Did the authors monitor mouse weight or other health parameters to assess potential systemic effects of treatment? It is known that the taste of compounds in drinking water can alter fluid or food intake, which may influence general health. Also, do Lmx1bV265D/+ mice exhibit non-ocular phenotypes, and if so, does nicotinamide confer protection in those tissues as well? Additionally, given that nicotinamide dosing started at postnatal day 2, how long were the mice treated with water containing nicotinamide, after how many days or weeks was IOP reduced, and how long was the decrease in IOP sustained?
 
 Water intake was monitored in both treatment groups, and dosing was based on the average volume consumed by adult mice (lines 1017–1018). Young pups do not drink water, so the drug is largely delivered through the mother’s milk until weaning, and we therefore do not know an accurate dose for young pups. Mouse health was assessed throughout the experiment through regular monitoring of body weight and general condition.
 
 Depending on genetic context, Lmx1b mutations can cause kidney disease and impact other systems. Non-ocular phenotypes were not the focus of this study and were not characterized.
 
 We added a comment to the Methods to clarify the NAM treatment timeline. NAM was administered continuously in the drinking water starting at P2 and maintained throughout the experiment. IOP was measured beginning at 2 months and then at monthly time points. NAM lessened IOP at 2 and 3 months. We terminated IOP assessment at 3 months.
 
 Lines 1028-1029: “Treatment was started at postnatal day 2 and continued throughout the experiment.”
 
 (8) While the IOP reduction observed in NAM-treated Lmx1bV265D/+ mice appears statistically significant, it is unclear whether this reflects meaningful biological protection. Several untreated mice exhibit very high IOP values, which may skew the analysis. The authors should report the mean values for IOP in both untreated and NAM-treated groups to clarify the magnitude and variability of the response.
 
 We have added supplemental table 7 with the statistical information. Regarding the high IOP values observed in a subset of untreated V265D mutant mice, we consistently detect individual mutant eyes with IOPs exceeding 30 mmHg across independent cohorts and time points [3-5]. It is important to note that IOP is subject to fluctuation and in disease states such as glaucoma, circadian rhythms can be disrupted with stochastic and episodic IOP spikes throughout the day. This may be occurring in those untreated mice. This is also why we strive to use sample sizes of 40 or more. Additionally, we observe that some mutant eyes with IOPs measured within the normal range have anterior chamber deepening (ACD) - a persistent anatomical change associated with sustained or recurrent high IOP that stretches the cornea and may posteriorly displace the lens. This suggests mutant mice experience transient IOP elevations that are not always captured at a single time point due to the stochastic nature of these fluctuations. To account for this, we include ACD as an additional readout alongside IOP measurements. The reduction in ACD observed in NAM-treated mice provides independent evidence supporting the biological relevance of NAM-mediated IOP reduction.
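 The reviewer's concern about a few very high readings skewing a group comparison can be illustrated with a short sketch that reports the mean alongside the median and standard deviation for each group. The numbers below are arbitrary placeholders chosen only to show the effect; they are not measurements from this study, and the helper name `summarize` is hypothetical.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return sample size, mean, sample SD, and median for one group."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),
        "median": median(values),
    }


# Placeholder IOP-like readings in mmHg (NOT data from the study).
# The untreated group contains one deliberately extreme value.
placeholder_iop = {
    "untreated": [14.0, 16.0, 18.0, 32.0],
    "treated": [13.0, 15.0, 16.0, 16.0],
}

summary = {group: summarize(vals) for group, vals in placeholder_iop.items()}
for group, stats in summary.items():
    print(group, stats)
```

 With these placeholder values the single extreme reading pulls the untreated mean above its median, which is why reporting both, together with a measure of spread, helps readers judge the magnitude of an effect.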
 
 (9) Additionally, since NAM has been shown to directly protect RGCs in other glaucoma models, the authors should assess whether RGCs are preserved in NAM-treated Lmx1bV265D/+ mice. Demonstrating RGC protection would support a synergistic effect of NAM through both IOP reduction and direct neuroprotection, strengthening the translational relevance of the treatment.
 
 We again thank the referee. We note the possibility of dual IOP protection and neuroprotection in the manuscript (lines 961–963). The goal of the present study, however, was to determine mechanisms underlying IOP elevation in patients with LMX1B variants. Therefore, we limited our focus to IOP elevation (LMX1B is expressed in the TM but not RGCs). Studies of the RGCs and optic nerve in V265D mutant mice treated with NAM take considerable effort but are underway. They will be reported in a subsequent manuscript. Initial data support protection, but that is a work in progress.
 
 Additionally, we recently reported a pattern of IOP protection similar to that reported here using pyruvate, in experiments where we analyzed the optic nerve because the focus of that study was the assessment of pyruvate as a resilience factor against high genetic risk of glaucoma [4]. In that case, there was statistically significant protection from glaucomatous optic nerve damage, arguing for translational relevance, again with a possible synergistic effect through both IOP reduction and direct neuroprotection.
 
 (10) Can the authors add any other functional validation studies to explore the pathways enriched in the TM1, TM2, and TM3 subtypes, in addition to the IHC/IF/RNAscope validation?
 
 We agree with the reviewer on the importance of further functional validation of pathways active in TM cell subtypes that influence IOP. However, comprehensive investigation of the pathways active in these subtypes needs to be addressed in future studies. It is beyond the scope of this already large paper.
 
 (11) The authors should include a representative image of the limbal dissection. While Figure S1 provides a schematic, mouse eyes are very small, and dissecting unfixed limbal tissue is technically challenging. It is also difficult to reconcile the claim that the majority of cells in the limbal region are TM and endothelium. As shown in Figure S6, DAPI staining suggests a much higher abundance of scleral cells compared to TM cells within the limbal strip. Additional clarification or visual evidence would help validate the dissection strategy and cellular composition of the captured region.
 
 We appreciate the reviewer’s suggestion and have added additional images to Figure S1 to show our limbal strip dissection. However, we clarify that we do not intend to suggest that TM and endothelial cells are the most abundant populations in these dissected strips. When we say “are enriched for drainage tissues” we mean in comparison to dissecting the anterior segment as a whole. We have clarified this in the text. In fact, epithelial cells (primarily from the cornea) constituted the largest cluster in our dataset (Figure 1A). Additionally, to avoid misinterpretation, we generally refrain from drawing conclusions about the relative abundance of cell types based on sequencing data. Single-cell and single-nucleus RNA sequencing results are sensitive to technical factors that alter cell proportions depending on exact methodological details. In our study, TM cells comprised 24.4% of the single-cell dataset and 11.8% of the single-nucleus dataset, illustrating the impact of methodological variability.
 
 Lines 163-164: “Individual eyes were dissected to isolate a strip of limbal tissue, which is enriched for TM cells in comparison to dissecting the anterior segment as a whole.”
 
 Reviewer #1 (Recommendations for the authors):
 
 To enhance the reproducibility and transparency of the findings presented in this study, we strongly recommend that the authors make all analysis scripts and computational tools publicly available.
 
 We agree with the reviewer’s emphasis on transparency and are currently building a GitHub page to share our scripts. However, we did not develop any new tools for this study. All tools that we used are publicly available and provided in our methods section. All data will be available as raw data and through the Broad Institute’s Single Cell Portal.
 
 Reviewer #2 (Recommendations for the authors):
 
 The authors are to be commended for a well-written presentation of high-quality data, their comparisons of datasets (other mouse and human scRNAseq data), correlation with clinical glaucoma risk alleles, and curative therapy for the mouse model of Lmx1b glaucoma. There are several minor suggestions that the authors might consider to further improve their manuscript:
 
 (1) Lines 42-43: Although their data strongly support the role of mitochondrial dysfunction in Lmx1b glaucoma, they might want to soften their conclusion "supports a primary role of mitochondrial dysfunction within TM3 cells initiating the IOP elevation that causes glaucoma".
 
 With the inclusion of EM data supporting mitochondrial dysfunction in Lmx1b mutant TM cells, we have revised this sentence to more accurately reflect our findings.
 
 Lines 42-44 (previously lines 42-43): “Mitochondria in TM cells of V265D/+ mice are swollen with a reduced cristae area, further supporting a role for mitochondrial dysfunction in the initiation of IOP elevation in these mice.”
 
 (2) Figure 1: Why is the shape of the "TM containing" cluster in 1A so different than the cluster shown in 1B?
 
 We isolated cells from the 'TM-containing' cluster and performed unbiased reclustering, which alters their positioning in UMAP space. The figure legend has been updated to clarify this point.
 
 Lines 143-144 “A separate UMAP representation of the trabecular meshwork (TM) containing cluster following subclustering.”
 
 (3) Line 160: change "data was" to "data were"
 
 Corrected
 
 (4) S4 Fig C: Please comment on why the Columbia and Duke heatmaps for TM3 are not as congruent as the heatmaps for TM1 and TM2.
 
 We cannot definitively determine the reason for this. However, differences in tissue processing techniques between the Columbia and Duke preparations may contribute. Such variations have been shown to affect cellular transcriptomes in certain contexts. It is possible that TM3 cells are more susceptible to these effects than others. We have added a statement addressing this point to the figure legend.
 
 Lines 238-240: “Because tissue processing techniques can alter gene expression [52], the heatmap variation between institutes likely reflects differences in processing techniques (Methods) and suggests that TM3 cells are more susceptible to these effects than other cell types.”
 
 (5) S9 Fig: It is very difficult to see any staining for TM1 CHIL1 (2nd panel), TM2 End3 (2nd panel), and TM3 Lypd1 (both panels)
 
 We apologize for the difficulty in visualizing these panels. To improve clarity, we have increased the brightness of all relevant marker signals, within standard bounds, to facilitate easier interpretation.
 
 (6) Line 380: "are significantly higher"; since statistical analysis was not reported, please do not use "significantly"
 
 Done
 
 (7) The authors should consider discussing several of their findings that agree with published literature. For example:
 
 Figure 3B: "Wnt protein binding" (PMID: 18274669), "TGFb binding" (numerous references), "integrin binding" (work of Donna Peters), "actin binding"/"actin filament binding"/"actin filament bundle" (CLANs references)
 
 S10 Fig c: "ossification" (work of Torretta Borres)
 
 S11 Fig A: ID2/ID3 (PMID: 33938911); (B) BMP4 (PMID: 17325163)
 
 S12 Fig A: MYOC in TM1 cells (numerous references)
 
 We appreciate the reviewer’s diligent review and comments regarding these pathways. We have added a comment to the discussion regarding the agreement of these pathways.
 
 Lines 855-858: In addition, the expression of genes that we document generally agrees with the literature. For example, the following genes and signaling molecules have been reported in TM cells, WNT signaling [78], TGF-β signaling [79-85], integrin binding [86-88], actin cytoskeletal networks [89], calcification genes [90, 91], and Myocilin [91-94].
 
 (8) Line 541: was confocal microscopy used to measure the "3D shapes" of nuclei or was this done with a single image to determine sphericity?
 
 This analysis was performed using confocal microscopy and 3D reconstructed models of the TM nuclei. We have added text to the figure legend to clarify this.
 
 Lines 553-556: “To rigorously assess whether TM1 nuclei are more spherical, we analyzed their reconstructed 3D shapes from whole mounts images by confocal microscopy, comparing them to TM3 nuclei using the ‘Sphericity’ tool in Imaris.”
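
 For readers unfamiliar with the measure, the sketch below illustrates a sphericity calculation from a reconstructed surface. It assumes the standard Wadell definition (the surface area of a sphere with the same volume as the object, divided by the object's actual surface area); the response does not state which formula the Imaris tool applies, and the function name and sample values are hypothetical.

```python
import math


def sphericity(volume, surface_area):
    """Wadell sphericity: 1.0 for a perfect sphere, lower for other shapes.

    Assumption: this is the textbook definition, not a formula confirmed
    by the authors or taken from the Imaris documentation.
    """
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface_area must be positive")
    # Surface area of the sphere that has the same volume as the object.
    equivalent_sphere_area = (math.pi ** (1 / 3)) * ((6 * volume) ** (2 / 3))
    return equivalent_sphere_area / surface_area


# Hypothetical shapes: a sphere of radius 3 and a unit cube.
radius = 3.0
sphere_value = sphericity(4 / 3 * math.pi * radius ** 3, 4 * math.pi * radius ** 2)
cube_value = sphericity(1.0, 6.0)
print(round(sphere_value, 3), round(cube_value, 3))
```

 Under this definition, a more spherical nucleus yields a value closer to 1, so comparing the distributions for TM1 and TM3 nuclei tests the claim directly.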
 
 (9) Line 545: please add a closing parenthesis after "scoring 1"
 
 Done
 
 (10) S15 Fig: (A) There does not appear to be "good agreement" (line 653) between the datasets for TM1. (C) please provide a better explanation on how to interpret these "Confusion Matrix" results.
 
 We understand the referee's concern. The patterns likely appear different to the referee because of limited sampling in the snRNA-seq data. Based on our results, TM1 seems particularly susceptible, possibly because these cells do not tolerate the isolation process as well. Although we are confident, based on our experience, that TM1 shows good agreement between the two techniques, we have revised the language in the text to “generally” to reflect this nuance.
 
 Lines 633-635 (previously line 653): The generated clusters and their marker genes generally agreed with our scRNA-seq analyses (Fig 5A-B, S15A Fig).
 
 We have also added additional clarification for how to interpret the Confusion Matrix.
 
 Lines 669-672: “Colors indicate the fraction of cells identified in each ATAC cluster (row) which are also identified in each RNA cell type (columns), where darker colors represent stronger correspondence between RNA and ATAC clusters.”
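
 To make the interpretation concrete, the sketch below builds a row-normalized confusion matrix of the kind the legend describes: each row is an ATAC cluster, each column is an RNA cell type, and each entry is the fraction of cells in that ATAC cluster assigned to that RNA cell type. The labels and the function name are hypothetical, and this is not the authors' analysis code.

```python
from collections import Counter


def row_normalized_confusion(atac_labels, rna_labels):
    """Return (rows, cols, matrix) where matrix[i][j] is the fraction of
    cells in ATAC cluster rows[i] that carry RNA cell type cols[j]."""
    if len(atac_labels) != len(rna_labels):
        raise ValueError("label lists must have one entry per cell")
    rows = sorted(set(atac_labels))
    cols = sorted(set(rna_labels))
    counts = Counter(zip(atac_labels, rna_labels))
    matrix = []
    for row in rows:
        # Every row label occurs at least once, so the total is never zero.
        total = sum(counts[(row, col)] for col in cols)
        matrix.append([counts[(row, col)] / total for col in cols])
    return rows, cols, matrix


# Hypothetical per-cell labels: four cells, two ATAC clusters.
atac = ["A0", "A0", "A0", "A1"]
rna = ["TM1", "TM1", "TM2", "TM3"]
rows, cols, matrix = row_normalized_confusion(atac, rna)
for name, fractions in zip(rows, matrix):
    print(name, [round(value, 2) for value in fractions])
```

 Each row sums to 1, so a single dark cell in a row indicates that the ATAC cluster maps almost entirely onto one RNA cell type.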
 
 (11) Line 676: The transition from discussing the sc/snRNAseq data to the work in Lmx1b mutant mice is quite abrupt and could use a better transition to introduce this metabolism work.
 
 We have revised this transition for improved flow but prefer to keep all transitions brief due to the paper's length.
 
 Lines 691-694 (previously line 676): To evaluate the utility of our new TM cell atlas, we used it to examine how Lmx1b mutations affect the TM cell transcriptome and to identify potential mechanisms underlying IOP elevation. We selected LMX1B because it causes IOP elevation and glaucoma in humans and was identified as a highly active transcription factor in our TM cell dataset.
 
 (12) Lines 696-697: It appears counter-intuitive that upregulation of ubiquitin pathways would lead to proteostasis (proteasome protein degradation requires ubiquitination).
 
 We have clarified that the protein tagging pathway was significantly upregulated. However, polyubiquitin precursor itself was downregulated. In general, the statistical significance of the protein tagging pathway suggests perturbation of the system tagging proteins for degradation. We have clarified this in the text.
 
 Lines 711-714 (previously lines 696-697): “In addition, mutant TM3 cells showed an upregulation of protein tagging genes. However, there is a downregulation of the polyubiquitin precursor gene (Ubb, P = 4.5E-30), indicating a general dysregulation of pathways that tag proteins for degradation.”
 
 (13) Line 715: Please justify why "perturbed metabolism" was chosen to pursue vs the other differentially expressed pathways
 
 We chose to narrow our focus to TM3 cells because of their enrichment for Lmx1b expression. Most pathways identified in our analysis of TM3 cells implicate mitochondrial metabolism. Therefore, we chose to further explore this avenue. We clarified in the text that perturbed metabolism was the strongest gene expression signature.
 
 Lines 753-754 (previously line 715): “Our findings most strongly implicate perturbed metabolism within TM3 cells as responsible for IOP elevation in an Lmx1b glaucoma model.”
 
 (14) Line 759: The authors clearly demonstrate that Lmx1b is most expressed in TM3 cells; however, they did not demonstrate that "Lmx1b was most active"
 
 ATAC analysis showed that Lmx1b was most active in TM cells overall. We inferred its activity in TM3 because Lmx1b is most enriched in that subtype. This has been clarified in the text.
 
 Lines 799-800 (previously line 759): “More specifically, we demonstrate that Lmx1b is the most active TM cell TF and is enriched in TM3 cells,…”
 
 (15) Lines 830-835: Please include references documenting increased TGFβ2 concentrations in POAG aqueous humor and TM, effects of TGFβ2 on TM ECM deposition, and TGFβ2 induced ocular hypertension ex vivo and in vivo.
 
 Done.
 
 (16) Line 875: The authors provide no direct evidence for enhanced "oxidative stress" in Lmx1b TM3 cells
 
 The mitochondrial abnormalities and changed pathways support oxidative stress, but we have not directly tested this. Experiments are currently underway to evaluate its role, but these additional analyses are beyond the scope of this paper. We removed oxidative stress from the sentence.
 
 Lines 920-922 (previously line 875): “Importantly, in heterozygous mutant V265D/+ mice, TM3 cells had pronounced gene expression changes that implicate mitochondrial dysfunction, but that were absent or much lower in other cells including TM1 and TM2.”
 
 (17) Line 880: Similarly, the authors have not directly assessed effects on metabolism in TM3 cells; they only have shown changes in the expression of mitochondrial genes that may affect metabolism
 
 We have no way to specifically isolate TM3 cells to test this. Future work is underway to test this more broadly in isolated TM cells but is beyond the scope of this already large paper. Considering our gene expression data and the addition of supporting EM data, we have qualified the text.
 
 Lines 930-931 (previously 880): “Our data extend these published findings by showing that inheritance of a single dominant mutation in Lmx1b similarly affects mitochondria in TM cells.”
 
 (18) Line 892: What markers were used to detect "cell stress"?
 
 We have revised the text. Although our RNA data show stress gene changes, characterization of these markers is beyond the scope of the current study and will be included in a subsequent paper.
 
 Lines 945-948 (previously line 892): “However, these processes were not limited to TM3 cells or even to cell types that express detectable Lmx1b, suggesting that they are secondary damaging processes that are subsequent to the initiating, Lmx1b-induced perturbations in TM3 cells.”
 
 Additional author-driven change
 
 While revising and reviewing our data, we identified a coding error that resulted in the WT and V265D mutant group labels being switched in Figure 6. Importantly, the significance of the differentially expressed genes (DEGs), the implicated biological pathways, and the interpretation of pathway directionality in the manuscript remain accurate. The only issue was the incorrect labeling in the figure. We have corrected the labels in Figure 6 to accurately reflect the data. As noted above, all data and code will be made available to ensure full reproducibility of our results.
 
 References
 
 (1) Doucet-Beaupre H, Gilbert C, Profes MS, Chabrat A, Pacelli C, Giguere N, et al. Lmx1a and Lmx1b regulate mitochondrial functions and survival of adult midbrain dopaminergic neurons. Proc Natl Acad Sci U S A. 2016;113(30):E4387-96. Epub 2016/07/14. doi: 10.1073/pnas.1520387113. PubMed PMID: 27407143; PubMed Central PMCID: PMCPMC4968767.
 
 (2) Jimenez-Moreno N, Kollareddy M, Stathakos P, Moss JJ, Anton Z, Shoemark DK, et al. ATG8-dependent LMX1B-autophagy crosstalk shapes human midbrain dopaminergic neuronal resilience. J Cell Biol. 2023;222(5). Epub 2023/04/05. doi: 10.1083/jcb.201910133. PubMed PMID: 37014324; PubMed Central PMCID: PMCPMC10075225.
 
 (3) Cross SH, Macalinao DG, McKie L, Rose L, Kearney AL, Rainger J, et al. A dominant-negative mutation of mouse Lmx1b causes glaucoma and is semi-lethal via LDB1-mediated dimerization [corrected]. PLoS Genet. 2014;10(5):e1004359. Epub 2014/05/09. doi: 10.1371/journal.pgen.1004359. PubMed PMID: 24809698; PubMed Central PMCID: PMCPMC4014447.
 
 (4) Li K, Tolman N, Segre AV, Stuart KV, Zeleznik OA, Vallabh NA, et al. Pyruvate and related energetic metabolites modulate resilience against high genetic risk for glaucoma. Elife. 2025;14. Epub 2025/04/24. doi: 10.7554/eLife.105576. PubMed PMID: 40272416; PubMed Central PMCID: PMCPMC12021409.
 
 (5) Tolman NG, Balasubramanian R, Macalinao DG, Kearney AL, MacNicoll KH, Montgomery CL, et al. Genetic background modifies vulnerability to glaucoma-related phenotypes in Lmx1b mutant mice. Dis Model Mech. 2021;14(2). Epub 2021/01/20. doi: 10.1242/dmm.046953. PubMed PMID: 33462143; PubMed Central PMCID: PMCPMC7903917.
 
 AuthorResponse

Tags

Summary

Review 1

AuthorResponse

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.11.01.621152v2
www.biorxiv.org www.biorxiv.org

Organelle membrane-associated proteins recruit cGAS via phase separation to facilitate its membrane localization

5
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This useful study investigates how intrinsically disordered domains can interact to dictate the sub-cellular localization of a major innate immune sensor termed cGAS. The data from various cellular and biochemical assays are mostly solid, but the main conclusions from these experiments need to be validated further. This paper is relevant to immunologists, especially those interested in cytosolic DNA-sensing pathways.
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  This manuscript by the Yin group presents interesting findings that organelle-tethered intrinsically disordered "MEMCA" scaffolds, as exemplified by ZDHHC18 at the Golgi and MARCH8 at endosomes, enhance the engagement of cGAS with organelle-proximal condensates, thereby sequestering cGAS from cytosolic DNA sensing and negatively regulating innate immunity.
  
  Strengths:
  
  These findings suggest a previously unrecognized mechanism by which Golgi/endosomal IDR scaffolds modulate cGAS activity, with implications for antiviral defense and tumor immunology. The study is conceptually intriguing and potentially impactful.
  
  Weaknesses:
  
  While the manuscript addresses a novel aspect of cGAS regulation, additional mechanistic insights and targeted validations are needed to ensure robustness:
  
  (1) How do ZDHHC18/MARCH8 enhance cGAS engagement? Do they act as bridges to form a ternary, membrane-tethered cGAS-DNA-MEMCA complex, or alter cGAS condensate properties allosterically?
  
  (2) Is organelle cGAS capture selective? For instance, can other palmitoyltransferases/E3 ligases be substituted for ZDHHC18/MARCH8?
  
  (3) Why does membrane association suppress cGAS enzymatic activity, given that dsDNA still resides in cGAS condensates?
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors found that cGAS, a DNA sensor, relocalizes to organelle membranes (ER, Golgi, endosomes) upon DNA stimulation, revealing spatial regulation of its activity. ZDHHC18 and MARCH8 recruit cGAS to Golgi/endosomes via intrinsically disordered regions (IDRs), driving phase-separated condensates. This sequestration of cGAS-dsDNA complexes suppresses innate immune signaling, uncovering a novel regulatory mechanism.
  
  Strengths:
  
  The work overall is very interesting. The authors provided molecular and biochemical evidence.
  
  Weaknesses:
  
  Overall, the work is very interesting. However, the quality of some of the data does need to be improved, and more experiments need to be performed.
  
  The following points need to be addressed:
  
  (1) In Figure S7, no direct binding between cGAS and MARCH8 or ZD18 IDR is observed, and the interaction only occurs after DNA stimulation. However, Figure 5 shows cGAS recruitment to ZD18 or MARCH8 IDR droplets, suggesting direct interactions. This apparent discrepancy should be clarified.
  
  (2) The authors propose that recruiting cGAS to organelle membranes reduces its activity, as demonstrated by the FKBP experiment. However, ZD18 and MARCH8 also post-translationally modify cGAS. Do both mechanisms contribute to this effect, and can the authors test this?
  
  (3) To demonstrate the functional importance of MEMCA, the authors should test IFN production or STING activation in cells.
  
  (4) Does the IDR of MARCH8 or ZD18 influence the interaction between cGAS and DNA?
  
  (5) Which region of cGAS does the IDR of MARCH8 or ZD18 interact with: the cGAS-CD or the cGAS-N-terminus?
  
  (6) The in vitro LLPS experiments with cGAS, DNA, and ZD18/MARCH8 should be conducted under physiological conditions.
  
  Review 2
4. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  In this study by Shi et al., the authors evaluate if cGAS is recruited to the membranes of intracellular organelles. Using a combination of biochemical fractionation and imaging techniques, the authors propose that upon recognition of DNA, cGAS translocates to various subcellular locations, including the Golgi, endoplasmic reticulum, and endosomes. Mechanistically, the authors propose that upon localizing to the Golgi or endosome, cGAS binding to MARCH8 and ZDHHC18 prevents cGAS activity by incorporating cGAS and dsDNA into biomolecular condensates. However, in its current form, the study does not directly address this question.
  
  Strengths:
  
  The question of evaluating cGAS sub-cellular localization as a mechanism for controlling activity is interesting, and there is some evidence that cGAS is localized to sub-cellular organelle membranes.
  
  Weaknesses:
  
  (1) The well-established nuclear localization of cGAS is not adequately addressed in the cell lines used and is inconsistent with the findings.
  
  (2) Previous studies have shown that ZDHHC18 and MARCH8 control cGAS activity, which detracts somewhat from the novelty.
  
  (3) There is a lot of inconsistency in the cell lines and artificial expression systems used across the study.
  
  (4) A key element missing is showing that in the absence of ZDHHC18 or MARCH8, the loss of endogenous cGAS localization to the various sub-cellular organelles increases cGAMP synthesis and downstream STING activation in primary cells. There is an over-reliance on artificial expression systems. An important experiment to validate the hypothesis would be to evaluate endogenous cGAS localization in MARCH8- and ZDHHC18-deficient primary cells. Further, there should be evaluation of endogenous STING responses in MARCH8- and ZDHHC18-deficient primary cells in tandem with the localization studies.
  
  (5) There are a large number of grammatical errors throughout the manuscript which should be addressed.
  
  Review 3
5. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Author response:
  
  Below we outline our provisional responses to the major points raised in the public reviews, and our planned revisions:
  
  (1) Mechanistic model of how ZDHHC18/MARCH8 engage the cGAS–DNA condensate (Reviewers #1 & #2)
  
  We will add a dedicated subsection and a working-model figure describing our current view: IDRs of ZDHHC18 (Golgi) and MARCH8 (endosomes) engage pre-formed cGAS–DNA condensates at organelle membranes, and thereby tune cGAS activity through PTMs. We will explicitly discuss bridge-like versus allosteric modes by performing additional LLPS experiments (e.g., FRAP assays) to detect any IDR-driven changes in condensate properties, and explain how these scenarios fit our data.
  
  (2) Selectivity beyond ZDHHC18/MARCH8 (Reviewer #1)
  
  We will expand the text to explain existing evidence indicating that, in addition to ZDHHC18 or MARCH8, other post-translational modification (PTM) enzymes and/or membrane-associated scaffolds may also modulate cGAS. We will summarize our current datasets that support this possibility and outline how this selectivity relates to organelle identity.
  
  (3) Why membrane association suppresses cGAS activity (Reviewer #1)
  
  We will provide a concise mechanistic rationale—integrating our published work—to explain how membrane-proximal sequestration can limit cGAS catalysis despite cGAS–DNA coexistence within condensates. Specifically, we will discuss (i) IDR-dependent changes in condensate properties, and (ii) PTMs by ZDHHC18/MARCH8 that allosterically reduce catalytic efficiency; we will clearly cross-reference our prior publications that bear on these points.
  
  (4) Reconciling Fig. S7 (DNA-dependent binding) with Fig. 5 (recruitment to IDR droplets) (Reviewer #2)
  
  We will add text to clarify experimental context and readouts to prove that there is no real contradiction between Fig. S7 and Fig. 5. In the experiment shown in Fig. 5, PEG (a macromolecular crowding agent) was added to the system, which facilitates the formation of IDR phase-separated droplets. Under these conditions, cGAS partitions into the IDR condensates, leading to the observed recruitment. In contrast, Fig. S7 examines the direct physical interaction between cGAS and the IDRs using biochemical pull-down assays and shows that no direct interaction occurs in the absence of DNA. These two results reflect different experimental contexts and are therefore not mutually exclusive.
  
  (5) Planned additional tests to address specificity and mechanism (Reviewer #2)
  
  DNA pull-down: to test whether IDRs alter cGAS–DNA affinity, we will compare cGAS binding to DNA with/without MEMCA IDRs (and with charged-residue mutants).
  
  Domain mapping: to determine which region of cGAS engages MEMCA IDRs, we will map binding using cGAS N-terminus/core-domain truncations and key surface mutants.
  
  Physiological in vitro LLPS: we will repeat cGAS–DNA–IDR LLPS assays under physiological buffer conditions and report partition coefficients, FRAP, and phase diagrams to ensure physiological relevance.
  
  (6) Image clarity and data presentation (Reviewer #2):
  
  We will improve image resolution, add zoomed-in insets with organelle markers, and provide images with a stronger Cy5-ISD signal.
  
  (7) Nuclear localization of cGAS and system considerations (Reviewer #3)
  
  We will explicitly document the nuclear signal of cGAS observed in our confocal experiments and detail the cell lines and expression systems used. We will also clarify cGAS nuclear localization in the cell lines used.
  
  (8) Endogenous validation and cell line consistency (Reviewer #3):
  
  We will perform experiments in primary cells (knockout macrophages) to address the concern of relying on overexpression.
  
  (9) Language and grammar (Reviewer #3):
  
  We will thoroughly revise the manuscript for grammar and clarity.
  
  Together, these planned revisions will strengthen the mechanistic basis of our findings and provide direct evidence for the physiological role of organelle-tethered IDRs in regulating cGAS activity.
  
  AuthorResponse

Tags

Summary

Review 1

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.01.668185v1
www.biorxiv.org www.biorxiv.org

Dietary sulfur amino acid restriction elicits a cold-like transcriptional response in inguinal but not epididymal white adipose tissue of male mice

4
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  Ruppert et al. investigated how activation of thermogenesis by cold exposure (CE) and methionine restriction (MetR) impacts health and leads to weight loss in mice. The authors provided valuable datasets showing that the responses to MetR and CE are tissue-specific, while MetR and CE affect beige adipose similarly. Although the study is descriptive, the data analyses are solid, with well-supported conclusions drawn from the findings.
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  Activation of thermogenesis by cold exposure and dietary protein restriction are two lifestyle changes that impact health in humans and lead to weight loss in model organisms - here, in mice. How these affect liver and adipose tissues has not been thoroughly investigated side by side. In mice, the authors show that the responses to methionine restriction and cold exposure are tissue-specific, while the effects on beige adipose are somewhat similar.
  
  Strengths:
  
  The strength of the work is the comparative approach, using transcriptomics and bioinformatic analyses to investigate the tissue-specific impact. The work was performed in mouse models and is state-of-the-art. This represents an important resource for researchers in the field of protein restriction and thermogenesis.
  
  Weaknesses:
  
  The findings are descriptive, and the conclusions remain associative. The work is limited to mouse physiology, and the human implications have not been investigated yet.
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This study provides a library of RNA sequencing analysis from brown fat, liver, and white fat of mice treated with two stressors - cold challenge and methionine restriction - alone and in combination (interaction between diet and temperature). They characterize the physiologic response of the mice to the stressors, including effects on weight, food intake, and metabolism. This paper provides evidence that while both stressors increase energy expenditure, there are complex tissue-specific responses in gene expression, with additive, synergistic, and antagonistic responses seen in different tissues.
  
  Strengths:
  
  The study design and implementation are solid and well-controlled. Their writing is clear and concise. The authors do an admirable job of distilling the complex transcriptome data into digestible information for presentation in the paper. Most importantly, they do not overreach in their interpretation of their genomic data, keeping their conclusions appropriately tied to the data presented. The discussion is well thought out and addresses some interesting points raised by their results.
  
  Weaknesses:
  
  The major weakness of the paper is the almost complete reliance on RNA sequencing data, but it is presented as a transcriptomic resource.
  
  Review 2
4. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  Ruppert et al. present a well-designed 2×2 factorial study directly comparing methionine restriction (MetR) and cold exposure (CE) across liver, iBAT, iWAT, and eWAT, integrating physiology with tissue-resolved RNA-seq. This approach allows a rigorous assessment of where dietary and environmental stimuli act additively, synergistically, or antagonistically. Physiologically, MetR progressively increases energy expenditure (EE) at 22 °C and lowers RER, indicating a lipid utilization bias. By contrast, a 24-hour 4 °C challenge elevates EE across all groups and eliminates MetR-Ctrl differences. Notably, changes in food intake and activity do not explain the MetR effect at room temperature.
  
  Strengths:
  
  The data convincingly support the central claim: MetR enhances EE and shifts fuel preference to lipids at thermoneutrality, while CE drives robust EE increases regardless of diet and attenuates MetR-driven differences. Transcriptomic analysis reveals tissue-specific responses, with additive signatures in iWAT and CE-dominant effects in iBAT. The inclusion of explicit diet×temperature interaction modeling and GSEA provides a valuable transcriptomic resource for the field.
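
  As a rough illustration of what a diet×temperature interaction term measures in a 2×2 design, the sketch below computes the interaction contrast from group means: the diet effect in the cold minus the diet effect at room temperature. The values are invented log2 expression levels for a single gene, the function name is hypothetical, and the authors' actual statistical model is not described in this review.

```python
from statistics import mean


def interaction_contrast(ctrl_rt, metr_rt, ctrl_cold, metr_cold):
    """Difference of differences for a 2x2 factorial design.

    Near zero: the two effects add up (additive on this scale).
    Non-zero: the diet effect depends on temperature (interaction).
    """
    diet_effect_rt = mean(metr_rt) - mean(ctrl_rt)
    diet_effect_cold = mean(metr_cold) - mean(ctrl_cold)
    return diet_effect_cold - diet_effect_rt


# Hypothetical log2 expression values, two animals per group.
additive = interaction_contrast([1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0])
amplified = interaction_contrast([1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [6.0, 6.0])
dampened = interaction_contrast([1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [3.0, 3.0])
print(additive, amplified, dampened)
```

  A contrast in the same direction as the diet effect is one way to read "synergistic", and a contrast in the opposite direction is one way to read "antagonistic"; a real analysis would also test whether the contrast differs from zero.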
  
  Weaknesses:
  
  Limitations include the short intervention windows (7 d MetR, 24 h CE), use of male-only cohorts, and reliance on transcriptomics without complementary proteomic, metabolomic, or functional validation. Greater mechanistic depth, especially at the level of WAT thermogenic function, would strengthen the conclusions.
  
  Review 3

Tags

Summary

Review 1

Review 3

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.06.669020v2
www.biorxiv.org www.biorxiv.org

Explainable machine learning-assisted exploration of chromatin dynamics reveals chromosome-specific response to serum starvation

3
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This interesting study adapts machine learning tools to analyze movements of a chromatin locus in living cells in response to serum starvation. The machine learning approach developed is useful, the experiments are well controlled, and the data are solid. The study would be greatly strengthened by testing key predictions made using perturbation experiments. This work will be of interest to those studying chromosome biology and gene expression patterns.
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  Redchuk et al. explore the dynamic properties of chromatin upon serum starvation using machine learning approaches. They use CRISPR-tagging to visualize a region on chromosome 1 in human cells and show that in their system, chromosome 1, but not the previously reported chromosomes 10, 13, and X, undergoes a change in radial position upon serum starvation. Live cell imaging showed a position change towards the periphery after serum starvation. They then apply a machine learning algorithm for the analysis of the imaging data, which reveals changes in nuclear area during serum starvation and longer displacements of the chromosome 1 locus near the nuclear periphery. Differential behavior of homologues is also reported.
  
  Strengths:
  
  (1) The study of chromatin dynamics is an interesting and important area of research.
  
  (2) The use of machine learning approaches to analyze live cell imaging data is timely.
  
  (3) With serum starvation, the authors use a simple, well-controllable model system.
  
  Weaknesses:
  
  (1) This study only provides limited new insight into chromatin dynamics.
  
  (2) It was not immediately evident what the use of machine learning approaches added to this study. It appears that the main conclusions could have been reached by conventional analysis.
  
  (3) There are several specific technical points:
  
  a) It was not clear what the CRISPR-Sirius probes actually labelled. The chromosome 1 sgRNA sequence is provided, but I could not find information as to which region(s) of the chromosome are actually labelled (size, location, etc.).
  
  b) The authors visualize a relatively small region of chromosome 1 but make conclusions regarding the entire chromosome. Additional probes on the same chromosome should be used.
  
  Related to this point, the discussion of why the authors are unable to reproduce the prior findings of relocation of chromosomes 10, 13, and X is not satisfying. It would be worth comparing the FISH-based painting of entire chromosomes, which generated the results suggesting relocation of these chromosomes, with the point-labelling method used here.
  
  c) The study lacks controls. Since in their hands chromosomes 10, 13, and X do not change position, they should be used as a negative control in all experiments demonstrating a shift in the location of chromosome 1.
  
  d) I did not find information about the spatial or temporal resolution of the imaging modality. This is important to assess whether the observed changes in position, relative to time, are meaningful.
  
  e) The authors analyze surprisingly early timepoints (up to 40 minutes) of serum starvation. Would these results look different if longer serum starvation timepoints of several hours were analyzed?
  
  f) The authors could better explain the biological meaning of the various parameters they measure (DistR, TDist, etc.).
  
  g) I did not understand the reasoning for the authors' conclusion of differential behavior of homologues. Please explain this better, or ideally use more direct labeling methods that identify the individual homologues.
  
  h) In many figures, statistical analysis of the data is missing, including, but not limited to, Figures 1B, C, G, Figures 4, 5, 6.
  
  i) No information is provided throughout the manuscript as to how many cells were analyzed in each experiment. This should be indicated in every figure legend.
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The study demonstrates that CRISPR-Sirius provides a powerful approach to investigating chromosome dynamics in living cells during environmental stress. By focusing on serum starvation, the authors show that this process induces global nuclear changes, including a reduction in nuclear area and increased morphological dynamism, while at the same time driving specific reorganization of chromosome 1. Chromosome 1 relocates toward the nuclear periphery and displays distinctive patterns of motion, maintaining overall motility but punctuated by occasional long-distance displacements, particularly near the nuclear envelope. Importantly, the analysis reveals that homologous copies of chromosome 1 do not behave uniformly: peripheral loci become more mobile and responsive to starvation, whereas central homologs remain comparatively stable, often associated with nucleolar subcompartments. By integrating live imaging with machine learning and explainable AI analysis, the study highlights the complexity of nuclear organization and provides valuable insights into how chromosome-specific and locus-specific responses to stress are orchestrated within the three-dimensional nuclear landscape.
  
  Strengths:
  
  The study uses live-cell imaging to investigate the dynamics of loci during starvation. Live-cell tracking and data interpretation are carried out using machine learning and AI models, which is a major strength.
  
  Weaknesses:
  
  The manuscript is at times difficult to follow, partly because the methodological descriptions are highly specialized, especially for non-expert biologists. In addition, the observations are not tested for a mechanistic basis. Experiments that could provide deeper insights are missing, for example, why chromosome 1 moves, why the peripheral homologue dislocates, or why a "long jump" is observed at the periphery even though the speed of the loci does not change. It is also unclear whether a displacement of 0.5 μm is functionally meaningful.
  
  Review 2

Tags

Summary

Review 1

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.08.669316v1
www.biorxiv.org www.biorxiv.org

3D directional tuning in the orofacial sensorimotor cortex during natural feeding and drinking

4
1. Public_Reviews 10 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This study characterises motor and somatosensory cortex neural activity during naturalistic eating and drinking tongue movement in nonhuman primates. The data, which include electrophysiology, three-dimensional tracking of tongue movements, and nerve block manipulations, are valuable to neuroscientists and neural engineers interested in tongue use. Although the current analyses provide a solid description of single neuron activity in these areas, both the population level analyses and the characterisation of activity changes following nerve block could be improved.
 
 Summary
2. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 Hosack and Arce-McShane investigate how the 3D movement direction of the tongue is represented in the orofacial part of the sensory-motor cortex and how this representation changes with the loss of oral sensation. They examine the firing patterns of neurons in the orofacial parts of the primary motor cortex (MIo) and somatosensory cortex (SIo) in non-human primates (NHPs) during drinking and feeding tasks. While recording neural activity, they also tracked the kinematics of tongue movement using biplanar video-radiography of markers implanted in the tongue. Their findings indicate that many units in both MIo and SIo are directionally tuned during the drinking task. However, during the feeding task, directional tuning was more frequent in MIo units and less prominent in SIo units. Additionally, in some recording sessions, they blocked sensory feedback using bilateral nerve block injections, which seemed to result in fewer directionally tuned units and changes in the overall distribution of the preferred direction of the units.
 
 Strengths:
 
 The most significant strength of this paper lies in its unique combination of experimental tools. The authors utilized a video-radiography method to capture 3D kinematics of the tongue movement during two behavioral tasks while simultaneously recording activity from two brain areas. This specific dataset and experimental setup hold great potential for future research on the understudied orofacial segment of the sensory-motor area.
 
 Weaknesses:
 
 A substantial portion of the paper is dedicated to establishing directional tuning in individual neurons, followed by an analysis of how this tuning changes when sensory feedback is blocked. While such characterizations are valuable, particularly in less-studied motor cortical areas and behaviors, the discrepancies in tuning changes across the two NHPs, coupled with the overall exploratory nature of the study, render the interpretation of these subtle differences somewhat speculative. At the population level, both decoding analyses and state space trajectories from factor analysis indicate that movement direction (or spout location) is robustly represented. However, as with the single-cell findings, the nuanced differences in neural trajectories across reach directions and between baseline and sensory-block conditions remain largely descriptive. To move beyond this, model-based or hypothesis-driven approaches are needed to uncover mechanistic links between neural state space dynamics and behavior.
 
 Review 1
3. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 This manuscript by Hosack and Arce-McShane examines the directional tuning of neurons in macaque primary motor (MIo) and somatosensory (SIo) cortex. The neural basis of tongue control is far less studied than, for example, forelimb movements, partly because the tongue's kinematics and kinetics are difficult to measure. A major technical advantage of this study is using biplanar video-radiography, processed with modern motion tracking analysis software, to track the movement of the tongue inside the oral cavity. Compared to prior work, the behaviors are more naturalistic (feeding and licking water from one of three spouts), although the animals were still head-fixed.
 
 The study's main findings are that:
 
 • A majority of neurons in MIo and a (somewhat smaller) percentage in SIo modulated their firing rates during tongue movements, with different modulation depending on the direction of movement (i.e., exhibited directional tuning). Examining the statistics of tuning across neurons, there was anisotropy (e.g., more neurons preferring anterior movement) and a lateral bias in which tongue direction neurons preferred that was consistent with the innervation patterns of tongue control muscles (although with some inconsistency between monkeys).
 
 • Consistent with this encoding, tongue position could be decoded with moderate accuracy even from small ensembles of ~28 neurons.
 
 • There were differences observed in the proportion and extent of directional tuning between the feeding and licking behaviors, with stronger tuning overall during feeding. This potentially suggests behavioral context-dependent encoding.
 
 • The authors then went one step further and used a bilateral nerve block to the sensory inputs (trigeminal nerve) from the tongue. This impaired the precision of tongue movements and resulted in an apparent reduction and change in neural tuning in MIo and SIo.
 
 Strengths:
 
 The data are difficult to obtain and appear to have been rigorously measured, and provide a valuable contribution to this under-explored subfield of sensorimotor neuroscience. The analyses adopt well-established methods especially from the arm motor control literature, and represent a natural starting point for characterizing tongue 3D direction tuning.
 
 Weaknesses:
 
 There are alternative explanations for some of the interpretations, but those interpretations are described in a way that clearly distinguishes results from interpretations, and readers can make their own assessments. Some of these limitations are described in more detail below.
 
 One weakness of the current study is that there is substantial variability in some of the results between monkeys, including the tuning characteristics of primary somatosensory cortex neurons during drinking, and the effect of nerve block on tongue movements and the associated changes in single neuron tuning.
 
 This study focuses on describing directional tuning using the preferred direction (PD) / cosine tuning model popularized by Georgopoulos and colleagues for understanding neural control of arm reaching in the 1980s. This is a reasonable starting point and a decent first-order description of neural tuning. However, the arm motor control field has moved far past that viewpoint, and in some ways an over-fixation on static representational encoding models and PDs held that field back for many years. The manuscript would benefit from drawing the readers' attention (perhaps in the Discussion) to the fact that PDs are a very simple starting point for characterizing how cortical activity relates to kinematics, but that there is likely much richer population-level dynamical structure and that a more mechanistic, control-focused analytical framework may be fruitful. A good review of this evolution in the arm field can be found in Vyas S, Golub MD, Sussillo D, Shenoy K. 2020. Computation Through Neural Population Dynamics. Annual Review of Neuroscience. 43(1):249-75. A revised version of the manuscript incorporates more population-level analyses, but with inconsistent use of quantifications/statistics and without sufficient contextualization of what the reader is to make of these results.
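
For readers unfamiliar with the PD / cosine tuning model referred to here, the following is a minimal sketch of how such a fit is commonly set up: firing rate is modelled as a baseline plus the dot product of a tuning vector with the unit movement direction, and the preferred direction is the normalised tuning vector. This is an illustration on synthetic data only, not the authors' analysis pipeline; the function names `solve_linear` and `fit_cosine_tuning`, the noise level, and the simulated neuron are all assumptions made for the example.

```python
import math
import random


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_cosine_tuning(directions, rates):
    """Least-squares fit of rate = b0 + b . d for unit 3D directions d.

    Returns (baseline, modulation_depth, preferred_direction).
    """
    rows = []
    for d in directions:
        norm = math.sqrt(sum(c * c for c in d))
        rows.append([1.0] + [c / norm for c in d])
    # Normal equations: (X^T X) beta = X^T y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    Xty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(4)]
    b0, bx, by, bz = solve_linear(XtX, Xty)
    depth = math.sqrt(bx * bx + by * by + bz * bz)
    pd = (bx / depth, by / depth, bz / depth) if depth > 0 else (0.0, 0.0, 0.0)
    return b0, depth, pd


# Hypothetical neuron: baseline 10 Hz, modulation depth 5 Hz, PD along +x.
rng = random.Random(0)
true_pd = (1.0, 0.0, 0.0)
directions, rates = [], []
for _ in range(300):
    d = [rng.gauss(0, 1) for _ in range(3)]  # random movement direction
    norm = math.sqrt(sum(c * c for c in d))
    cos_angle = sum(a * b for a, b in zip(d, true_pd)) / norm
    directions.append(d)
    rates.append(10.0 + 5.0 * cos_angle + rng.gauss(0, 0.5))

baseline, depth, pd = fit_cosine_tuning(directions, rates)
print(baseline, depth, pd)
```

The fit summarises each neuron by three static quantities (baseline, depth, PD), which is exactly why it says nothing about the time-varying, population-level structure discussed above.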
 
 The described changes in tuning after nerve block could also be explained by changes in kinematics between these conditions, which tempers the interpretation of these interesting results.
 
 I am not convinced of the claim that tongue directional encoding fundamentally changes between drinking and feeding given the dramatically different kinematics and the involvement of other body parts like the jaw (e.g., the reference to Laurence-Chasen et al. 2023 just shows that there is tongue information independent of jaw kinematics, not that jaw movements don't affect these neurons' activities). I also find the nerve block results inconsistent (more tuning in one monkey, less in the other?) and difficult to really learn something fundamental from, besides that neural activity and behavior both change - in various ways - after nerve block (not at all surprising but still good to see measurements of).
 
 The manuscript states that "Our results suggest that the somatosensory cortex may be less involved than the motor areas during feeding, possibly because it is a more ingrained and stereotyped behavior as opposed to tongue protrusion or drinking tasks". An alternative explanation may be more statistical/technical in nature: during feeding, there will be more variability in exactly which somatosensory afferent signals are received from trial to trial (because slight differences in kinematics can produce large differences in exactly where the tongue is and in the where/when/how of which parts of it are touching other parts of the oral cavity). This variability could "smear out" the apparent tuning in these types of trial-averaged analyses. Given how important proprioception and somatosensation are for not biting the tongue or choking, the speculation that somatosensory cortical activity is suppressed during feeding is very counter-intuitive to this reviewer. In the revised manuscript the authors note these potential confounds and other limitations in the Discussion.
 
 Review 2
4. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary
 
 In this study, the authors aim to uncover how 3D tongue direction is represented in the Motor (M1o) and Somatosensory (S1o) cortex. In non-human primates implanted with chronic electrode arrays, they use X-ray based imaging to track the kinematics of the tongue and jaw as the animal is either chewing food or licking from a spout. They then correlate the tongue kinematics with the recorded neural activity. They perform both single-unit and population level analyses during feeding and licking. Then, they recharacterize the tuning properties after bilateral lidocaine injections in the two sensory branches of the trigeminal nerve. They report that their nerve block causes a reorganization of the tuning properties and population trajectories. Overall, this paper concludes that M1o and S1o both contain representations of the tongue direction, but their numbers, their tuning properties and susceptibility to perturbed sensory input are different.
 
 Strengths
 
 The major strengths of this paper are in the state-of-the-art experimental methods employed to collect the electrophysiological and kinematic data. In the revision, the single-unit analyses of tuning direction are robustly characterized. The differences in neural correlations across behaviors, regions and perturbations are robust. In addition to the substantial amount of largely descriptive analyses, this paper makes two convincing arguments: 1) the single-neuron correlates for feeding and licking in OSMCx are different and cannot be explained simply by different kinematics, and 2) blocking sensory input alters the neural processing during orofacial behaviors. The evidence for these claims is solid.
 
 Weaknesses
 
 The main weakness of this paper is in providing an account for these differences to get some insight into neural mechanisms. For example, while the authors show changes in neural tuning and different 'neural trajectory' shapes during feeding and drinking - their analyses of these differences are descriptive and provide limited insight for the underlying neural computations.
 
 Review 3

Tags

Summary

Review 1

Review 3

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.07.02.601741v3
www.biorxiv.org www.biorxiv.org

Active regulation of the epidermal growth factor receptor by the membrane bilayer

3
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  The authors describe an interesting approach to studying the dynamics and function of membrane proteins in different lipid environments. The important findings have theoretical and practical implications beyond the study of EGFR to all membrane signalling proteins. The evidence supporting the conclusions is convincing, based on the use of a nanodisk system to study membrane proteins in vitro, combined with state-of-the-art single-molecule FRET. The work will be of broad interest to cell biologists and biochemists.
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  This work addresses a key question in cell signalling: how does the membrane composition affect the behaviour of a membrane signalling protein? Understanding this is important, not just to understand basic biological function but because membrane composition is highly altered in diseases such as cancer and neurodegenerative disease. Although parts of this question have been addressed on fragments of the target membrane protein, EGFR, used here, Srinivasan et al. harness a unique tool, membrane nanodisks, which allow them to probe full-length EGFR in vitro in great detail with cutting-edge fluorescent tools. They find interesting impacts on EGFR conformation in differently charged and fluid membranes, explaining previously identified signalling phenotypes.
  
  Strengths:
  
  The nanodisk system enables full-length EGFR to be studied in vitro and in a membrane with varying lipid and cholesterol concentrations. The authors combine this with single-molecule FRET utilising multiple pairs of fluorophores at different places on the protein to probe different conformational changes in response to EGF binding under different anionic lipid and cholesterol concentrations. They further support their findings using molecular dynamics simulations, which help uncover the full atomistic detail of the conformations they observe.
  
  Weaknesses:
  
  Much of the interpretation of the results comes down to a bimodal model of an 'open' and 'closed' state between the intracellular tail of the protein and the membrane. Some of the data looks like a bimodal model is appropriate, but its use is not sufficiently justified (statistically or otherwise) in this work in its current form. The experiments with varying cholesterol in particular appear to suggest an alternate model with longer fluorescent lifetimes. More justification of these interpretations of the central experiment of this work would strengthen the paper.
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  Nanodiscs and synthesized EGFR are co-assembled directly in cell-free reactions. Nanodiscs containing membranes with different lipid compositions are obtained by providing liposomes with corresponding lipid mixtures in the reaction. The authors focus on the effects of lipid charge and fluidity on EGFR activity.
  
  Strengths:
  
  The authors implement a variety of complementary techniques to analyze data and to verify results. They further provide a new pipeline to study lipid effects on membrane protein function.
  
  Weaknesses:
  
  Due to the relative novelty of the approach, a number of concerns remain.
  
  (1) I am a little skeptical about the good correlation of the nanodisc compositions with the liposome compositions. I would rather have expected a kind of clustering of individual lipid types in the liposome membrane, in particular of cholesterol. This should then result in an uneven distribution upon nanodisc assembly, i.e., in a notable variation of lipid composition in the individual nanodiscs. Could this be ruled out by the implemented assays, or can just the overall lipid composition of the complete nanodisc fraction be analyzed?
  
  (2) Both templates have been added simultaneously, with a 100-fold excess of the EGFR template. Was this the result of optimization? What are the kinetics of protein production? As EGFR is in far excess, significant precipitation due to limiting nanodiscs should be expected, at least in the early period of the reaction. What is the oligomeric form of the inserted EGFR? Have multiple insertions into one nanodisc been observed?
  
  (3) The IMAC purification does not discriminate between EGFR-filled and empty nanodiscs. Does the TEM study give any information about the composition of the particles (empty, EGFR monomers, or EGFR oligomers)? Normalizing the measured fluorescence, i.e., the total amount of solubilized receptor, with the total protein concentration of the samples could give some data on the stoichiometry of EGFR and nanodiscs.
  
  (4) The authors generally assume a 100% functional folding of EGFR in all analyzed environments. While this could be the case, with some other membrane proteins, it was shown that only a fraction of the nanodisc solubilized particles are in functional conformation. Furthermore, the percentage of solubilized and folded membrane protein may change with the membrane composition of the supplied nanodiscs, while non-charged lipids mostly gave rather poor sample quality. The authors normalize the ATP binding to the total amount of detectable EGFR, and variations are interpreted as suppression of activity. Would the presence of unfolded EGFR fractions in some samples with no access to ATP binding be an alternative interpretation?
  
  Review 2

Tags

Summary

Review 1

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.14.670284v1
www.biorxiv.org www.biorxiv.org

Death receptor 6 does not regulate axon degeneration and Schwann cell injury responses during Wallerian degeneration

4
1. Public_Reviews 10 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 In this valuable study, through carefully executed and rigorously controlled experiments, the authors challenged a previously reported role of the Death Receptor 6 (DR6/Tnfrsf21) in Wallerian degeneration (WD). Using two DR6 knockout mouse lines and multiple WD assays, both in vitro and in vivo, the authors provided convincing evidence that loss of DR6 in mice does not protect peripheral axons from WD after injury. Questions remain about whether this conclusion is generalizable to CNS axonal degeneration in disease models such as ALS, AD, and prion diseases. In addition, the authors need to provide information about the sex, age, and genetic background of their animal studies to allow readers to better assess the basis for inconsistencies from previous reports on the protective effects of DR6.
 
 Summary
2. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The authors show that genetic deletion of the orphan tumor necrosis factor receptor DR6 in mice does not protect peripheral axons against degeneration after axotomy. Similarly, Schwann cells in DR6 mutant mice react to axotomy similarly to wild-type controls. These negative results are important because previous work has indicated that loss or inhibition of DR6 is protective in disease models and also against Wallerian degeneration of axons following injury. This carefully executed counterexample is important for the field to consider.
 
 Strengths:
 
 A strength of the paper is the use of two independent mouse strains that knock out DR6 in slightly different ways. The authors confirm that DR6 mRNA is absent in these models (western blots for DR6 protein are less convincingly null, but given the absence of mRNA, this is likely an issue of antibody specificity). One of the DR6 knockout strains used is the same strain used in a previous paper examining the effects of DR6 on Wallerian degeneration.
 
 The authors use a series of established assays to evaluate axon degeneration, including light and electron microscopy on nerve histological samples and cultured dorsal root ganglion neurons in which axons are mechanically severed and degeneration is scored in time-lapse microscopy. These assays consistently show a lack of effect of loss of DR6 on Wallerian degeneration in both mouse strains examined.
 
 Therefore, in the specific context of these experiments, the authors' data support their conclusion that loss of DR6 does not protect against Wallerian degeneration.
 
 Weaknesses:
 
 The major weaknesses of this paper include the tone of correcting previously erroneous results and the lack of reporting on important details around animal experiments that would help determine whether the results here really are discordant with previous studies, and if so, why.
 
 The authors do not report the genetic strain background of the mice used, the sex distributions of their experimental cohorts, or the age of the mice at the time the experiments were performed. All of these are important variables.
 
 The DR6 knockout strain reported in Gamage et al. (2017) was on a C57BL/6.129S segregating background. Gamage et al. reported that loss of DR6 protected axons from Wallerian degeneration for up to 4 weeks, but importantly, in only 38.5% (5 out of 13) of the mice they examined. In the present paper, the authors speculate on possible causes for differences between the lack of effect seen here and the effects reported in Gamage et al., including possible spontaneous background mutations, epigenetic changes, genetic modifiers, neuroinflammation, and environmental differences. A likely explanation of the incomplete penetrance reported by Gamage et al. is the segregating genetic background and the presence of modifier loci between C57BL/6 and 129S. The authors do not report the genetic background of the mice used in this study, other than to note that the knockout strain was provided by the group in Gamage et al. However, if, for example, that mutation has been made congenic on C57BL/6 in the intervening years, this would be important to know. One could also argue that the results presented here are consistent with 8 out of 13 mice presented in Gamage et al.
 
 Age is also an important variable. The protective effects of the spontaneous WldS mutation decrease with age, for example. It is unclear whether the possible protective effects of DR6 also change with age; perhaps this could explain the variable response seen in Gamage et al. and the lack of response seen here.
 
 It is unclear if sex is a factor, but this is part of why it should be reported.
 
 The authors also state that they do not see differences in the Schwann cell response to injury in the absence of DR6 that were reported in Gamage et al., but this is not an accurate comparison. In Gamage et al., they examined Schwann cells around axons that were protected from degeneration 2 and 4 weeks post-injury. Those axons had much thinner myelin, in contrast to axons protected by WldS or loss of Sarm1, where the myelin thickness remained relatively normal. Thus, Gamage et al. concluded that the protection of axons from degeneration and the preservation of Schwann cell myelin thickness are separate processes. Here, since no axon protection was seen, the same analysis cannot be done, and we can only say that when axons degenerate, the Schwann cells respond the same whether DR6 is expressed or not.
 
 The authors also take issue with Colombo et al. (2018), where it was reported that there is an increase in axon diameter and a change in the g-ratio (axon diameter to fiber diameter - the axon + myelin) in peripheral nerves in DR6 knockout mice. This change resulted in a small population of abnormally large axons that had thinner myelin than one would expect for their size. The change in g-ratio was specific to these axons and driven by the increased axon diameter, not decreased myelin thickness, although those two factors are normally loosely correlated. Here, the authors report no changes in axon size or g-ratio, but this could also be due to how the distribution of axon sizes was binned for analysis, and looking at individual data points in supplemental figure 3A, there are axons in the DR6 knockout mice that are larger than any axons in wild type. Thus, this discrepancy may be down to specifics and how statistics were performed or how histograms were binned, but it is unclear if the results presented here are dramatically at odds with the results in Colombo et al. (2018).
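
To make the g-ratio argument concrete, here is a minimal sketch of the definition used above (axon diameter divided by fiber diameter, where the fiber is the axon plus its myelin sheath). The function name `g_ratio` and the diameters and thicknesses are illustrative assumptions, not measurements from either paper; the sketch assumes myelin of uniform thickness on both sides of the axon.

```python
def g_ratio(axon_diameter_um, myelin_thickness_um):
    """g-ratio = axon diameter / fiber diameter (axon + myelin on both sides)."""
    fiber_diameter_um = axon_diameter_um + 2.0 * myelin_thickness_um
    return axon_diameter_um / fiber_diameter_um


# Illustrative values only: same myelin thickness, larger axon.
typical = g_ratio(axon_diameter_um=4.0, myelin_thickness_um=1.0)
enlarged = g_ratio(axon_diameter_um=8.0, myelin_thickness_um=1.0)
print(typical, enlarged)
```

With myelin thickness held fixed, enlarging the axon alone raises the g-ratio, which is the sense in which a change in g-ratio can be driven by axon diameter rather than by thinner myelin.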
 
 Finally, it is important to note that previously reported effects of DR6 inhibition, such as protection of cultured cortical neurons from beta-amyloid toxicity, are not necessarily the same as Wallerian degeneration of axons distal to an injury studied here. The negative results presented here, showing that loss of DR6 is not protective against Wallerian degeneration induced by injury, are important given the interest in DR6 as a therapeutic target, but they are specific to these mice and this mechanism of induced axon degeneration. The extent to which these findings contradict previous work is difficult to assess due to the lack of detail in describing the mouse experiments, and care should be taken in attempting to extrapolate these results to other disease contexts, such as ALS or Alzheimer's disease.
 
 Review 1
3. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 This manuscript by Beirowski, Huang, and Babetto revisits the proposed role of Death Receptor 6 (DR6/Tnfrsf21) in Wallerian degeneration (WD). A prior study (Gamage et al., 2017) suggested that DR6 deletion delays axon degeneration and alters Schwann cell responses following peripheral nerve injury. Here, the authors comprehensively test this claim using two DR6 knockout mouse models (the line used in the earlier report plus a CMV-Cre derived floxed ko line) and multiple WD assays in vivo and in vitro, aligned with three positive controls: Sarm1, WldS, and Phr1/Mycbp2 mutants. Contrary to the prior findings, they find no evidence that DR6 deletion affects axon degeneration kinetics or Schwann cell dynamics (assessed by cJun expression or [intact+degenerating] myelin abundance after injury) during WD. Importantly, in DRG explant assays, neurites from DR6-deficient mice degenerated at rates indistinguishable from controls. The authors conclude that DR6 is dispensable for WD, and that previously reported protective effects may have been due to confounding factors such as genetic background or spontaneous mutations.
 
 Strengths:
 
 The authors employ two independently generated DR6 knockout models, one overlapping with the previously published study, and confirm loss of DR6 expression by qPCR and Western blotting. Multiple complementary readouts of WD are applied (structural, ultrastructural, molecular, and functional), providing a robust test of the hypothesis.
 
 Comparisons are drawn with established positive controls (WldS, SARM1, Phr1/Mycbp2 mutants), reinforcing the validity of the assays.
 
 By directly addressing an influential but inconsistent prior report, the manuscript clarifies the role of DR6 and prevents potential misdirection of therapeutic strategies aimed at modulating WD in the PNS. The discussion thoughtfully considers possible explanations for the earlier results, including colony-specific second-site mutations that could explain the incomplete penetrance of the earlier reported phenotype of only 36%.
 
 Weaknesses:
 
 (1) The study focuses on peripheral nerves. The manuscript frequently refers to CNS studies to argue for consistency with their findings. It would be more accurate to frame PNS/CNS similarities as reminiscences rather than as consistencies (e.g., line 205ff in the Discussion).
 
 (2) The DRG explant assays are convincing, though the slight acceleration of degeneration in the DR6 floxed/Cre condition is intriguing (Figure 4E). Could the authors clarify whether this is statistically robust or biologically meaningful?
 
 (3) In the summary (line 43), the authors refer to Hu et al. (2013) (reference 5) as the study that previously reported AxD delay and SC response alteration after injury. However, this study did not investigate the PNS, and I believe the authors intended to reference Gamage et al. (2017) (reference 10) at this point.
 
 (4) In line 74ff of the results section, the authors claim that developmental myelination is not altered in DR6 mutants at postnatal day 1. However, the variability in Figure S2 appears substantial, and the group size seems underpowered to support this claim. Colombo et al. (2018) (reference 11) reported accelerated myelination at P1, but this study likewise appears underpowered. Possible reasons for these discrepancies and the large variability could be that only a defined cross-sectional area was quantified, rather than the entire nerve cross-section.
 
 (5) The authors stress the data of Gamage et al. (2017) on altered SC responses in DR6 mutants after injury. They employed cJun quantification to show that SC reprogramming after injury is not altered in DR6 mutants. This approach is valid and the conclusion trustworthy. Here, the addition of data showing the combined abundance of intact and degenerated myelin does not add much insight. However, Gamage et al. (2017) reported altered myelin thickness in a subset of axons at 14 days after injury, which is considerably later than the time points analyzed in the present study. While, in the Reviewer's view, the thin myelin observed by Gamage et al. in fact resembles remyelination, the authors may wish to highlight the difference in the time points analyzed.
 
 Review 2
4. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 The authors revisit the role of DR6 in axon degeneration following physical injury (Wallerian degeneration), examining both its effects on axons and its role in regulating the Schwann cell response to injury. Surprisingly, and in contrast to previous studies, they find that DR6 deletion does not delay the rate of axon degeneration after injury, suggesting that DR6 is not a mediator of this process.
 
 Overall, this is a valuable study. As the authors note, the current literature on DR6 is inconsistent, and these results provide useful new data and clarification. This work will help other researchers interpret their own data and re-evaluate studies related to DR6 and axon degeneration.
 
 Strengths:
 
 (1) The use of two independent DR6 knockout mouse models strengthens the conclusions, particularly when reporting the absence of a phenotype.
 
 (2) The focus on early time points after injury addresses a key limitation of previous studies. This approach reduces the risk of missing subtle protective phenotypes and avoids confounding results with regenerating axons at later time points after axotomy.
 
 Weaknesses:
 
 (1) The study would benefit from including an additional experimental paradigm in which DR6 deficiency is expected to have a protective effect, to increase confidence in the experimental models, and to better contextualize the findings within different pathways of axon degeneration. For example, DR6 deletion has been shown in more than one study to be partially axon protective in the NGF deprivation model in DRGs in vitro. Incorporating such an experiment could be straightforward and would strengthen the paper, especially if some of the neuroprotective effects previously reported are confirmed.
 
 (2) The quality of some figures could be improved, particularly the EM images in Figure 2. As presented, they make it difficult to discern subtle differences.
 
 Review 3

Tags

Summary

Review 1

Review 3

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.07.21.665928v1
www.biorxiv.org

Dynamic Architecture of Mycobacterial Outer Membranes Revealed by All-Atom Simulations

3
1. Public_Reviews 10 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 In their study, Brown et al. provide an important advance in understanding the architecture of the mycobacterial outer membrane. Using all-atom simulations of model mycomembranes, the work reports compelling structural insights into how α-mycolic acids and outer leaflet lipids (PDIM and PAT) shape membrane organisation. The work revealed membrane heterogeneity with ordered inner leaflets and disordered outer leaflets that provide a molecular explanation for the resilience of the mycobacterial envelope.
 
 Summary
2. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Disclaimer:
 
 This reviewer is not an expert on MD simulations but has a basic understanding of the findings reported and is well-versed with mycobacterial lipids.
 
 Summary:
 
 In this manuscript titled "Dynamic Architecture of Mycobacterial Outer Membranes Revealed by All-Atom Simulations", Brown et al describe outcomes of all-atom simulation of a model outer membrane of mycobacteria. This compelling study provided three key insights: (1) The likely conformation of the unusually long chain alpha-branched beta-methoxy fatty acids, mycolic acids in the mycomembrane, to be the extended U or Z type rather than the compacted W-type. (2) Outer leaflet lipids such as PDIM and PAT provide regional vertical heterogeneity and disorder in the mycomembrane that is otherwise prevented in a mycolic acid-only bilayer. (3) Removal of specific lipid classes from the symmetric membrane systems leads to significant changes in membrane thickness and resilience to high temperatures.
 
 Strengths:
 
 The authors take a step-wise approach in building the complexity of the membrane and highlight the limitations of each of the approaches. A case in point is the use of supraphysiological temperature of 333 K or even higher temperatures for some of the simulations. Overall, this is a very important piece of work for the mycobacterial field, and will help in the development of membrane-disrupting small molecules and provide important insights for lipid-lipid interactions in the mycomembrane.
 
 Weaknesses:
 
 (1) The authors used alpha-mycolic acids only for their models. The ratios of alpha, keto, and methoxy-mycolic acids are known in the literature, and it may be worth including these in their model. Future studies can be aimed at addressing changes in the dynamic behavior of the MOM by altering this ratio, but the inclusion of all three forms in the current model will be important and may alter the other major findings of the current study.
 
 (2) The findings from the 14 different symmetric membrane systems developed with the removal of one complex lipid at a time are very interesting but have not been analysed/discussed at length in the current manuscript. I find many interesting insights from Figures S3 and S5, which I find missing in the manuscript. These are as follows:
 
 a) Loss of PDIM resulted in reduced membrane thickness. This is a very important finding given that loss of PDIM can be a spontaneous phenomenon in Mtb cultures in vitro and that this is driven by increased nutrient uptake by PDIM-deficient bacilli (Domenech and Reed, 2009 Microbiology). While the latter is explained by the enhanced solute uptake by several PE/PPE transporter systems in the absence of PDIM (Wang et al, Science 2020), the findings presented by Brown et al could be very important in this context. A discussion on these aspects would be beneficial for the mycobacterial community.
 
 b) I find it interesting that loss of PAT or DAT does not change membrane thickness (Figure S3). While both PAT and PDIM can migrate to the interleaflet space, loss of PDIM and PAT has a different impact on membrane thickness. It is worth explaining what the likely interactions are that shape membrane thickness in the case of the modelled MOM.
 
 c) Figure S5: Is the presence of SGL driving PDIM and PAT to migrate to the inter-leaflet space? Again, a discussion on major lipid-lipid interactions driving these lipid migrations across the membrane thickness would be useful.
 
 Review 1
3. Public_Reviews 10 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The manuscript reports all-atom molecular dynamics simulations on the outer membrane of Mycobacterium tuberculosis. This is the first all-atom MD simulation of the MTb outer membrane and complements the earlier studies, which used coarse-grained simulation.
 
 Strengths:
 
 The simulation of the outer membrane consisting of heterogeneous lipids is a challenging task, and the current work is technically very sound.
 
 The observation about membrane heterogeneity and ordered inner leaflets vs disordered outer leaflets is a novel result from the study. This work will also help other groups work on all-atom models of the mycobacterial outer membrane for drug transport, etc.
 
 Weaknesses:
 
 Beyond a challenging simulation study, the current manuscript only provides qualitative explanations on the unusual membrane structure of MTb and does not demonstrate any practical utility of the all-atom membrane simulation. It will be difficult for the general biology community to appreciate the significance of the work, based on the manuscript in its current form, because of the high content of technical details and limited evidence on the utility of the work.
 
 Major Points:
 
 (1) The simulation by Basu et al (Phys Chem Chem Phys 2024) has studied drug transport through mycolic acid monolayers. Since the authors of the current study have all-atom models of the MTb outer membrane, they should carry out drug transport simulations and compare them to the outer membranes of other bacteria through which drugs can permeate. In the current manuscript, it is only discussed in lines 388-392. Can the disruption of MA cyclopropanation be simulated to show its effect on membrane structure?
 
 (2) In line 277, the authors mention six simulations that mimic lipid knockout strains. The results of these simulations, specifically the outcomes of in silico knockout of lipids, are not described in detail.
 
 (3) Figure 5 shows PDIM and PAT-driven lipid redistribution, which is a significant novel observation from the study. However, comparison of 3B and 3D shows that at 313K, the movement of the PDIM head group is much less. Since MD simulations are sensitive to random initial seeds, repeated simulations with different random seeds and initial structures may be necessary.
 
 (4) As per Figure 1, in the initial structure, the head group of PAT should be on the membrane surface, similar to TDM and TMM, while PDIM is placed towards the interior of the outer membrane. However, Figure 5 shows that at t=0, PAT has the same Z position as PDIM. It will be necessary to provide Z-position Figures for TMM and TDM to understand the difference. Is it really dependent on the chemical structure of the lipid moiety or the initial position of the lipid in the bilayer at the beginning of the simulation?
 
 Minor Point:
 
 In view of the complexity of the system undertaken for the study, the manuscript in its current form may not be informative for readers who are not experts in molecular simulations.
 
 Review 2

Tags

Summary

Review 1

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.05.24.655956v1
www.biorxiv.org

Overexpression of Ssd1 and calorie restriction extend yeast replicative lifespan by preventing deleterious age-dependent iron uptake

4
1. Public_Reviews 10 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important study uses innovative microfluidics-based single-cell imaging to monitor replicative lifespan, protein localization, and intracellular iron levels in aging yeast cells. The evidence for the proposed role of Ssd1 and reduced nutrients for lifespan through limiting iron uptake is convincing, even though some mechanistic details remain unclear. This work will be of interest to cell biologists working on aging and iron metabolism.
  
  Summary
2. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  Overexpression of the mRNA-binding protein Ssd1 was shown before to extend the replicative lifespan of yeast cells, whereas ssd1 deletion had the opposite effect. Here, the authors provide evidence that Ssd1 acts via sequestration of mRNAs of the Aft1/2-dependent iron regulon. This restricts activation of the regulon and limits accumulation of Fe2+ inside cells, thereby likely lowering oxidative damage. The effects of Ssd1 overexpression and calorie restriction on lifespan are epistatic, suggesting that they might act through the same pathway.
  
  Strengths:
  
  The study is well-designed and involves analysis of single yeast cells during replicative aging. The findings are well displayed and largely support the derived model, which also has implications for the lifespan of other organisms, including humans.
  
  Weaknesses:
  
  The model is largely supported by the findings; however, the findings remain largely correlative. Whether the knockout of ssd1 shortens lifespan by increasing intracellular Fe2+ levels has not been tested. The finding that increased Ssd1 levels form condensates in a cell-cycle-dependent manner is interesting, yet the role of the condensates in lifespan extension remains untested and unlinked.
  
  Review 1
3. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  This manuscript describes the use of a powerful technique called microfluidics to elucidate the mechanisms explaining how overexpression (OE) of Ssd1 and caloric restriction (CR) in yeast extend replicative lifespan (RLS). Microfluidics measures RLS by trapping cells in chambers mounted to a slide. The chambers hold the mother cell but allow daughters to escape. The slide, with many chambers, is recorded during the entire process, roughly 72 hours, with the video monitored afterwards to count how many daughters each of the trapped mothers produces. The power of the method is what can be done with it. For example, the entire process can be viewed by fluorescence so that GFP and mCherry-tagged proteins can be followed as cells age. The budding yeast is the only model where bona fide replicative aging can be measured, and microfluidics is the only system that allows protein localization and levels to be measured in a single cell while aging. The authors do a wonderful job of showing what this combination of tools can do.
  
  The authors had previously shown that Ssd1, an mRNA-binding protein, extends RLS when overexpressed. This was attributed to Ssd1 sequestering away specific mRNAs under stress, likely leading to reduced ribosomal function. It remained completely unknown how Ssd1 OE extended RLS. The authors observed that overexpressed, but not normally expressed, Ssd1 formed cytoplasmic condensates during mitosis that are resolved by cytokinesis. When the condensates fail to be resolved at the end of mitosis, this signals death.
  
  It has become clear in the literature that iron accumulation increases with age within the cell. The transcriptional programs that activate the iron regulon also become elevated in aging cells. This is thought to be due to impaired mitochondrial function in aging cells, with increased iron accumulation as an attempt at restoring mitochondrial activity. The authors show that Ssd1 OE and CR both reduce the expression of the iron regulon. The data presented indicate that iron accumulation shortens RLS: deletion of iron regulon components extends RLS, and adding iron to WT cells decreases RLS, but not when Ssd1 is overexpressed or when cells are calorically restricted. Interestingly, iron chelation using BPS has no impact on WT RLS, but decreases the elevated RLS in CR cells and cells overexpressing Ssd1. It was not initially clear why iron chelation would inhibit the extended lifespan seen with CR and Ssd1 OE. This was addressed by an experiment where it was shown that the iron regulon is induced (FIT2 induction) when iron is chelated. Thus, the detrimental effects of induction of the iron regulon by BPS and iron accumulation on RLS cannot be tempered by Ssd1 OE and CR once turned on.
  
  I did not find any weaknesses to be addressed in this paper. The draft was well-written, and the extensive experimentation was well-designed, performed, and controlled. However, I did make minor comments that I recommend the authors address:
  
  (1) Why would BPS not reduce RLS in WT cells? The authors could test whether OE of FIT2 reduces RLS in WT cells.
  
  (2) The authors should add a brief explanation for why the GDP1 promoter was chosen for Ssd1 OE.
  
  (3) On page 12, growth to saturation was described as glucose starvation. This is more accurately described as nutrient deprivation. Referring to it as glucose starvation is akin to CR, which growing to saturation is not. Ssd1 OE formed condensates upon saturation but not in CR. Why do the authors think Ssd1 OE did not form condensates upon CR? Too mild a stress?
  
  (4) The authors conclude that the main mechanism for RLS extension in CR and Ssd1 OE is the inhibition of the iron regulon in aging cells. The data certainly supports this. However, this may be an overstatement as other mutations block CR, such as mutations that impair respiration. The authors do note that induction of the iron regulon in aging cells could be a response to impaired mitochondrial function. Thus, it seems that the main goal of CR and Ssd1 OE may be to restore mitochondrial function in aging cells, one way being inactivation of the iron regulon. A discussion of how other mutations impact CR would be of benefit.
  
  (5) The cell cycle regulation of Ssd1 OE condensates is very interesting. There does not appear to be literature linking Ssd1 with proteasome-dependent protein turnover. Many proteins involved in cell cycle regulation and genome stability are regulated through ubiquitination. It is not necessary to do anything here about it, but it would be interesting to address how Ssd1 condensates may be regulated with such precision.
  
  (6) While reading the draft, I kept asking myself what the relevance to human biology was. I was very impressed with the extensive literature review at the end of the discussion, going over how well conserved this strategy is in yeast with humans. I suggest referring to this earlier, perhaps even in the abstract. This would nail down how relevant this model is for understanding human longevity regulation.
  
  In conclusion, I enjoyed reading this manuscript, describing how Ssd1 OE and CR lead to RLS increases, using different mechanisms. However, since the 2 strategies appear to be using redundant mechanisms, I was surprised that synergism was not observed.
  
  Review 2
4. Public_Reviews 10 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  In this paper, the authors investigate how the RNA-binding protein Ssd1 and calorie restriction (CR) influence yeast replicative lifespan, with a particular focus on age-dependent iron uptake and activation of the iron regulon. For this, they use microfluidics-based single-cell imaging to monitor replicative lifespan, protein localization, and intracellular iron levels across aging cells. They show that both Ssd1 overexpression and CR act through a shared pathway to prevent the nuclear translocation of the iron-regulon regulator Aft1 and the subsequent induction of high-affinity iron transporters. As a result, these interventions block the age-related accumulation of intracellular free iron, which otherwise shortens lifespan. Genetic and chemical epistasis experiments further demonstrate that suppression of iron regulon activation is the key mechanism by which Ssd1 and CR promote replicative longevity.
  
  Overall, the paper is technically rigorous, and the main conclusions are supported by a substantial body of experimental data. The microfluidics-based assays in particular provide compelling single-cell evidence for the dynamics of Ssd1 condensates and iron homeostasis.
  
  My main concern, however, is that the central reasoning of the paper, namely that Ssd1 overexpression and CR prevent the activation of the iron regulon, appears to be contradicted by previous findings, and the authors may actually be misrepresenting these studies, unless I am mistaken. In the manuscript, the authors state on two occasions:
  
  "Intriguingly, transcripts that had altered abundance in CR vs control media and in SSD1 vs ssd1∆ yeast included the FIT1, FIT2, FIT3, and ARN1 genes of the iron regulon (8)"
  
  "Ssd1 and CR both reduce the levels of mRNAs of genes within the iron regulon: FIT1, FIT2, FIT3 and ARN1 (8)"
  
  However, reference (8) by Kaeberlein et al. actually says the opposite:
  
  "Using RNA derived from three independent experiments, a total of 97 genes were observed to undergo a change in expression >1.5-fold in SSD1-V cells relative to ssd1-d cells (supplemental Table 1 at http://www.genetics.org/supplemental/). Of these 97 genes, only 6 underwent similar transcriptional changes in calorically restricted cells (Table 2). This is only slightly greater than the number of genes expected to overlap between the SSD1-V and CR datasets by chance and is in contrast to the highly significant overlap in transcriptional changes observed between CR and HAP4 overexpression (Lin et al. 2002) or between CR and high external osmolarity (Kaeberlein et al. 2002). Intriguingly, of the 6 genes that show similar transcriptional changes in calorically restricted cells and SSD1-V cells, 4 are involved in iron-siderochrome transport: FIT1, FIT2, FIT3, and ARN1 (supplemental Table 1 at http://www.genetics.org/supplemental/)."
  
  Although the phrasing might be ambiguous at first reading, this interpretation is confirmed upon reviewing Matt Kaeberlein's PhD thesis: https://dspace.mit.edu/handle/1721.1/8318 (page 264 and so on).
  
  Moreover, consistent with this, activation of the iron regulon during calorie restriction (or the diauxic shift) has also been observed in two other articles:
  
  https://doi.org/10.1016/S1016-8478(23)13999-9
  
  https://doi.org/10.1074/jbc.M307447200
  
  Taken together, these contradictory data might blur the proposed model and make it unclear how to reconcile the results.
  
  Review 3

Tags

Summary

Review 1

Review 3

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.09.02.673772v1
arxiv.org

Fragmentation and aggregation of cyanobacterial colonies

4
1. Public_Reviews 10 Oct 2025
 
 in eLife (unscoped)
 
 eLife Assessment
 
 With the goal of investigating the assembly and fragmentation of cellular aggregates, this manuscript investigates cyanobacterial aggregates in a laboratory setting. This investigation of the conditions and mechanisms behind aggregation is an important contribution as it yields basic understanding of natural processes and offers potential strategies for control. The combination of computational and experimental investigations in this manuscript provides solid support for the role of shear on aggregation and fragmentation. However, the role of extracellular matrix, with possibly a strong effect on aggregation, is not adequately studied.
 
 Summary
2. Public_Reviews 10 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #1 (Public review):
 
 Sinzato et al. investigated how shear flow in a rheological chamber affects the assembly and fragmentation of cyanobacterial aggregates, with the goal of understanding how such aggregates might form naturally, and/or be destroyed industrially. The authors used a combination of experiments and models to show that cyanobacterial colonies can be difficult to fragment with fluid flows. Additionally, they provide biophysical support for the idea that such aggregates likely form primarily when cells stay together after cell division, rather than coming together from disparate paths.
 
 This work has significant relevance to the field, both practically and naturally. Combatting or preventing toxic cyanobacterial blooms is an active area of environmental research that offers a practical backbone for this manuscript's ideas. Additionally, the formation and behavior of cellular aggregates in general is of widespread interest in many fields, including marine and freshwater ecology, healthcare and antibiotic resistance research, biophysics, and microbial evolution. In this field, there are still outstanding questions regarding how microbial aggregates form into communities, including if and how they come together from separate places. Therefore, I believe that researchers from many distinct fields would find interest in the topic of this paper, and particularly Figure 5, in which a phase space that is meant to represent the different modes of aggregate formation and destruction is suggested, dependent on properties of the fluid flow and particle concentration.
 
 Altogether, the authors were successful in their investigation, and I find their claims to be justified. In particular, the authors achieve strong results from their experiments. Below, I outline key claims of the paper and indicate the level to which they were supported by their data.
 
 Their first major claim is that fluid flows alone must be quite strong in order to fragment the cyanobacterial aggregates they have studied. With their rheological chamber, they explicitly show that energy dissipation rates must exceed "natural" conditions by multiple orders of magnitude in order to fragment lab strain colonies, and even higher to disrupt natural strains sampled from a nearby freshwater lake. This claim is well-supported by their experiments and data.
 
 The authors then claim that the fragmentation of aggregates due to fluid flows occurs primarily through erosion of small pieces from larger aggregates. Because their experimental setup does not allow them to directly observe this process (for example, by watching one aggregate break into pieces), they rely on indirect methods to support the claim. Overall, the experimental evidence is generally supportive, but the models leave some gaps. I describe this conclusion in more detail below.
 
 The strongest evidence for the erosion-dominated process comes from the authors' measurements of transfer of biomass between large and small size classes, as in Figure 2E and Figure 2D. The authors claim that only the erosion model can reproduce this kind of biomass transfer. However, it also seems that the idealized erosion model alone is not fully sufficient to capture the observed behavior. In Figure 2D, there remains a gap between their experiment and the prediction of the erosion model, which grows larger over time (Supplemental Figure S9). While the authors suggest that the erosion model is better than the equal-fragmentation model, it is also true that tracking the mean size (Figure 2B) or small size distribution (Figure S6) cannot distinguish between these models.
 
 Taken altogether, the experimental evidence favors an erosion-dominated process. However, a few minor questions remain regarding the models. Why does the equal-fragmentation model predict no biomass transfer between size classes? To what extent, quantitatively, does the erosion model outperform the equal fragments model at capturing the biomass size distributions? Finally, why does the idealized erosion fail to capture the size distribution at late stages in Supplemental Figure S9 - would this discrepancy be resolved if the authors considered individual colony variances in cell adhesion (for instance, as hypothesized by the authors in lines 133-137)? I do not believe these questions curb the other results of the paper.
 
 Their third major claim is that fluid flows only weakly cause cells to collide and adhere in a "coming together" process of aggregate formation. They test this claim in Figure 3, where they suspend single cells in their test chamber and stir them at moderate intensity, monitoring their size histogram. They show that the size histogram changes only slightly, indicating that aggregation is, by-and-large, not occurring at a high rate. Therefore, they lend support to the idea that cell aggregation likely does not initiate group formation in toxic cyanobacterial blooms. Additionally, they show that the median size of large colonies also does not change at moderate turbulent intensities. These results agree with previous studies (their own citation 25) indicating that aggregates in toxic blooms are clonal in nature. This is an important result, and well-supported by their data, but only for this specific particle concentration and stirring intensity. Later, in Figure 5 they show a much broader range of particle concentrations and energy dissipation rates that they leave untested. However, they refer to other literature that does test these regions of the phase map.
 
 The fourth major result of the manuscript is displayed in Equation 8 and Figure 5, where the authors derive an expression for the ratio between the rate of increase of a colony due to aggregation vs. the rate due to cell division. They then plot this line on a phase map, altering two physical parameters (concentration and fluid turbulence) to show under what conditions aggregation vs. cell division are more important for group formation. Because these results are derived from relatively simple biophysical considerations, they have the potential to be quite powerful and useful, and represent a significant conceptual advance. By combining their experiments with discussions of other experimental investigations of scum formation in cyanobacterial blooms, the authors have investigated the two most relevant zones of this map for the present study (Zones II and III), and have made a strong contribution to the literature in regards to artificial mixing to disrupt cyanobacterial blooms.
 
 Other notes:
 
 The authors rely heavily on size distributions to make the claims of their paper. I was pleased to find the calibration histograms in Supplemental Figure S8, which provide information as to how and why they made corrections to the histograms they observed. From these calibration histograms, it seems that larger colonies are more accurately measured in the cone-and-plate shear setup, while smaller colonies can be missed, presumably due to resolution issues.
 
 Review 1
3. Public_Reviews 10 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #2 (Public review):
 
 Summary:
 
 In this work, the authors investigate the role of fluid flow in shaping the colony size of a freshwater cyanobacterium Microcystis. To do so, they have created a novel assay by combining a rheometer with a bright field microscope. This allows them to exert precise shear forces on cyanobacterial cultures and field samples, and then quantify the effect of these shear forces on the colony size distribution. Shear force can affect the colony size in two ways: reducing size by fragmentation and increasing size by aggregation. They find limited aggregation at low shear rates, but high shear forces can create erosion-type fragmentation: colonies do not break in large pieces, but many small colonies are sheared off the large colonies. Overall, bacterial colonies from field samples seem to be more inert to shear than laboratory cultures, which the authors explain in terms of enhanced intercellular adhesion mediated by secreted polysaccharides.
 
 Strengths:
 
 This study is timely, as cyanobacterial blooms are an increasing problem in freshwater lakes. They are expected to increase in frequency and severity because of rising temperatures, and it is worthwhile learning how these blooms are formed. More generally, how physical aspects such as flow and shear influence colony formation is often overlooked, at least in part because of experimental challenges. Therefore, the method developed by the authors is useful and innovative, and I expect applications beyond the system presented here.
 
 A strong feature of this paper is the highly quantitative approach, combining theory with experiments, and the combination of laboratory experiments and field samples.
 
 Weaknesses:
 
 The introduction, in particular, seems to imply that shear force is a very important parameter controlling colony formation. However, if one looks at the results, this effect is overall rather modest, especially considering the shear forces that these bacterial colonies may experience in lakes. The main conclusion seems to be that bacterial adhesion, not shear, is the most important factor in determining colony size. The writing could have done more justice to the fact that the importance of adhesion had been described elsewhere. This being said, the same method can be used to investigate systems where shear forces are biologically more relevant.
 
 Review 2
4. Public_Reviews 10 Oct 2025
 
 in eLife (unscoped)
 
 Author response:
 
 The following is the authors’ response to the original reviews
 
 Reviewer #1 (Public review):
 
 (1) Their first major claim is that fluid flows alone must be quite strong in order to fragment the cyanobacterial aggregates they have studied. With their rheological chamber, they explicitly show that energy dissipation rates must exceed "natural" conditions by multiple orders of magnitude in order to fragment lab strain colonies, and even higher to disrupt natural strains sampled from a nearby freshwater lake. This claim is well-supported by their experiments and data.
 
 We thank the reviewer for this positive comment. We fully agree, as our fragmentation experiments on division-formed colonies clearly demonstrate their strong mechanical resistance in naturally occurring flows.
 
 (2) The authors then claim that the fragmentation of aggregates due to fluid flows occurs through erosion of small pieces. Because their experimental setup does not allow them to explicitly observe this process (for example, by watching one aggregate break into pieces), they implement an idealized model to show that the nature of the changes to the size histogram agrees with an erosion process. However, in Figure 2C there is a noticeable gap between their experiment and the prediction of their model. Additionally, in a similar experiment shown in Figure S6, the experiment cannot distinguish between an idealized erosion model and an alternative, an idealized binary fission model where aggregates split into equal halves. For these reasons, this claim is weakened.
 
 The two idealized models of colony fragmentation, namely erosion of single cells and fragmentation into equal sizes (or binary fission), lead to distinguishable final size distributions. We believe that our experiments for division-formed colonies support the hypothesis of the erosion mechanism. Specifically, Figure 2E shows that colony fragmentation resulted in a decrease of large colonies and a strong increase of single cells and dimers (two cells). In our view, the strong increase of single cells and dimers provides quite convincing (but indirect) evidence supporting the erosion mechanism. This is described on lines 112-121. To further address the reviewer’s concern, we have included in the revised version of Figure 2 (panels B and D) a direct comparison between these two fragmentation models for large division-formed colonies fragmented at a high dissipation rate of ε = 5.8 m²/s³. Furthermore, we have included the new Supplementary Figure S9, which details the model predictions for the colony size distribution at various time points.
 
 The ideal equal fragments model (i.e., where every fracture event produces two identical fragments with half the original biovolume) does not capture the biovolume transfer from large colonies to single cells, as observed for the experimental results in panel D of Figure 2 and panel E of Figure S9. In contrast, the erosion model, in panel D of Figure 2 and panel D of Figure S9, provides a good prediction of the experimental results within the experimental uncertainty. The different fragmentation models are discussed in lines 226-228 of the revised manuscript and lines 865-873 of the SI.
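 The contrast between the two idealized fragmentation models can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes that every colony larger than one cell fragments once per step, uses cell count as a stand-in for biovolume, and starts from a hypothetical population of ten 64-cell colonies. The function names are illustrative only.

```python
from collections import Counter

def erode(colonies, steps):
    """Erosion: at each step, every colony larger than one cell sheds a single cell."""
    colonies = list(colonies)
    for _ in range(steps):
        shed = 0
        for i, n in enumerate(colonies):
            if n > 1:
                colonies[i] = n - 1
                shed += 1
        colonies.extend([1] * shed)  # shed cells join the population as single cells
    return colonies

def split_equal(colonies, steps):
    """Equal fragments: at each step, every colony larger than one cell splits in two."""
    colonies = list(colonies)
    for _ in range(steps):
        nxt = []
        for n in colonies:
            if n > 1:
                nxt.extend([n // 2, n - n // 2])
            else:
                nxt.append(n)
        colonies = nxt
    return colonies

def small_fraction(colonies, max_cells=2):
    """Fraction of all cells held in colonies of at most max_cells cells."""
    return sum(n for n in colonies if n <= max_cells) / sum(colonies)

start = [64] * 10                 # hypothetical starting population
eroded = erode(start, 3)
halved = split_equal(start, 3)
print(Counter(eroded), small_fraction(eroded))
print(Counter(halved), small_fraction(halved))
```

 In this toy setting, erosion moves cells directly into the single-cell class while the parent colonies shrink only slightly, whereas equal splitting produces intermediate sizes and leaves the single-cell and dimer classes empty over the same number of steps. This is the qualitative signature the response refers to.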
 
 (3) Their third major claim is that fluid flows only weakly cause cells to collide and adhere in a "coming together" process of aggregate formation. They test this claim in Figure 3, where they suspend single cells in their test chamber and stir them at moderate intensity, monitoring their size histogram. They show that the size histogram changes only slightly, indicating that aggregation is, by and large, not occurring at a high rate. Therefore, they lend support to the idea that cell aggregation likely does not initiate group formation in toxic cyanobacterial blooms. Additionally, they show that the median size of large colonies also does not change at moderate turbulent intensities. These results agree with previous studies (their own citation 25) indicating that aggregates in toxic blooms are clonal in nature. This is an important result and well-supported by their data, but only for this specific particle concentration and stirring intensity. Later, in Figure 5 they show a much broader range of particle concentrations and energy dissipation rates that they leave untested.
 
 We thank the reviewer for this positive comment. We agree that our experimental results show clear evidence that aggregated colonies have a weaker structure in comparison to division-formed colonies, thus supporting the hypothesis that clonal expansion is the main mechanism for colony formation under most natural settings. The range of energy dissipation rates of our experimental setup covers almost entirely the region for which aggregated and division-formed colonies differ in their fragmentation behavior (Zone III of Figure 5). Within this zone, aggregated colonies are fragmented and only the division-formed colonies are able to withstand the hydrodynamic stresses. Furthermore, we show that this fragmentation behavior has a low sensitivity to the total biovolume fraction, as displayed in the Supplementary Figures S2 and S4 and discussed in lines 151-154 and 160-163. We agree that our cone-and-plate setup covers a limited parameter range, and we have added a detailed discussion of these limitations in the revised manuscript, under section Materials and Methods in lines 462-473.
 
 (4) The fourth major result of the manuscript is displayed in Equation 8 and Figure 5, where the authors derive an expression for the ratio between the rate of increase of a colony due to aggregation vs. the rate due to cell division. They then plot this line on a phase map, altering two physical parameters (concentration and fluid turbulence) to show under what conditions aggregation vs. cell division are more important for group formation. Because these results are derived from relatively simple biophysical considerations, they have the potential to be quite powerful and useful and represent a significant conceptual advance. However, there is a region of this phase map that the authors have left untested experimentally. The lowest energy dissipation rate that the authors tested in their experiment seemed to be \dot{epsilon}~1e-2 [m^2/s^3], and the highest particle concentration they tested was 5e-4, which means that the authors never tested Zone II of their phase map. Since this seems to be an important zone for toxic blooms (i.e. the "scum formation" zone), it seems the authors have missed an important opportunity to investigate this regime of high particle concentrations and relatively weak turbulent mixing.
 
 We agree with the reviewer that Zone (II) of Figure 5 is of great importance to dense bloom formation under wind mixing and that this parameter range was not covered by our experiments using a cone-and-plate shear flow. The measuring range of our device was motivated by engineering applications such as artificial mixing of eutrophic lakes using bubble plumes, as well as preliminary experiments which demonstrated that high levels of dissipation rate were required to achieve fragmentation. The range of dissipation rates that can be achieved by the cone-and-plate setup is limited at the lower end by the accumulation of colonies near the stagnation point at the conical tip and at the upper end by the spillage of fluid out of the chamber. We now discuss this measuring range in lines 462-473 of the revised manuscript.
 
 Although our setup does not cover Zone (II), we now refer to recent results in the literature for evidence of aggregation-dominance at Zone (II). The experimental study of Wu et al. (2024) (reference number 64 of the revised manuscript) investigated the formation of Microcystis surface scum layers in wind-mixed mesocosms. Their study identified aggregation of colonies in the scum layer, resulting in increases of colony size at rates faster than cell division. These results agree with our model, and the parameter range investigated falls within Zone II. We have included in the revised version, lines 328-337, a detailed discussion elucidating the parameter range covered in our experiments and the findings of Wu et al. (2024).
 
 Other items that could use more clarity:
 
 (5) The authors rely heavily on size distributions to make the claims of their paper. Yet, how they generated those size distributions is not clearly shown in the text. Of primary concern, the authors used a correction function (Equation S1) to estimate the counts of different size classes in their image analysis pipeline. Yet, it is unclear how well this correction function actually performs, what kinds of errors it might produce, and how well it mapped to the calibration dataset the authors used to find the fit parameters.
 
 We agree with the reviewer that more details of the correction function should be included. We have included in the revised version of the Supporting Information, in lines 785-796, a more detailed explanation of the correction function. Furthermore, a direct comparison of raw and corrected histograms of the size distribution and its associated uncertainty is presented in the new Supplementary Figure S8.
 
 (6) Second, in their models they use a fractal dimension to estimate the number of cells in the group from the group radius, but the agreement between this fractal dimension fit and the data is not shown, so it is not clear how good an approximation this fractal dimension provides. This is especially important for their later derivation of the "aggregation-to-cell division" ratio (Equation 8)
 
 We agree with the reviewer that more details on the estimation of fractal dimension are needed. The revised version, under Materials and Methods in lines 508-515, now includes the detailed estimation procedure, the number of colonies analysed, and the associated uncertainty.
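 For readers unfamiliar with the fractal-dimension approximation raised in this comment, a minimal sketch follows. It assumes the common scaling N = (R/r)^D with a prefactor of one, where N is the number of cells, R the colony radius, r the cell radius, and D the fractal dimension. The prefactor, the function names, and the synthetic data are assumptions for illustration and are not taken from the manuscript.

```python
import math

def cells_in_colony(radius, cell_radius, fractal_dim):
    """Estimate the number of cells in a colony from N = (R / r) ** D."""
    return (radius / cell_radius) ** fractal_dim

def fit_fractal_dimension(radii, cell_counts, cell_radius):
    """Least-squares slope through the origin of log N against log(R / r)."""
    xs = [math.log(R / cell_radius) for R in radii]
    ys = [math.log(N) for N in cell_counts]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Synthetic check: generate counts with a known D and recover it from the fit.
radii = [4.0, 8.0, 16.0]          # hypothetical colony radii, same units as cell_radius
cell_radius = 2.0
counts = [cells_in_colony(R, cell_radius, 2.0) for R in radii]
fitted = fit_fractal_dimension(radii, counts, cell_radius)
print(fitted)
```

 With real data, the scatter of the measured points around the fitted line is what indicates how good the approximation is, which is the information the reviewer asked to see.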
 
 Reviewer #1 (Recommendations For The Authors):
 
 In light of the weak evidence for claim #2 outlined above, I believe the paper would benefit from a more explicit comparison in Figure 2C of the two models - idealized erosion, and idealized binary fission. With such a comparison, the authors would have stronger footing to claim that one process is more important than the other.
 
 As mentioned in our answer above to comment #2 of public review, we have included in the revised version of Figure 2 (panels B and D) a direct comparison between the erosion and equal fragments (binary fission) models for large division-formed colonies fragmented under ε = 5.8 m²/s³. The comparison is further detailed in the new Supplementary Figure S9 for representative time points. Only the erosion model can recover the biovolume transfer from large colonies to single cells, as observed for the experimental results in Figure 2D and further detailed in Figure S9D. We believe that the revised version of Figure 2 and the new Supplementary Figure S9 provide strong evidence in support of the erosion fragmentation model.
 
 Would the authors comment on their chosen range of experimental dissipation rates? For instance, was their goal more to investigate industrial/engineering applications where the goal is to disrupt the cyanobacteria, but not really typical natural conditions under which the groups might form?
 
 The choice of experimental dissipation rates in our experiment was such that it covers engineering applications such as artificial mixing of eutrophic lakes using bubble plumes. We have now clarified in the Introduction, on lines 37-39, that artificial mixing has been successfully applied in several lakes to suppress cyanobacterial blooms. Furthermore, we have now clarified in the caption of Figure 5 that the bars on the right side indicate typical values of dissipation rates induced by natural wind-mixing, bubble plumes in artificially mixed lakes, and laboratory-scale experiments such as cone-and-plate systems and stirred tanks. The dissipation rates induced by the bubble plumes in artificially mixed lakes could potentially fragment aggregated cyanobacterial colonies and thus disrupt bloom formation. However, our preliminary experiments demonstrated that high levels of dissipation rate were required to achieve fragmentation, therefore we’ve focused on the upper range of values (0.01 to 10 m²/s³).
 
 The dissipation rates generated by the cone-and-plate approach are indeed higher than the dissipation rates under typical natural conditions in lakes. We have now added a detailed discussion of the range of dissipation rates generated by the cone-and-plate approach in the revised manuscript, under section Materials and Methods in lines 462-473, where we also explain that these values are higher than the natural dissipation rates generated by wind action in lakes. However, the more generic insights obtained by our study, shown in Figure 5, are relevant for dissipation rates of natural lakes (e.g., Zone II). Therefore, in our discussion of Figure 5 we have now included the recent findings of Wu et al. (2024) (reference number [64] of the revised manuscript), who studied bloom formation of Microcystis in mesocosm experiments at dissipation rates representative of natural conditions; see also our reply to the next comment.
 
 The authors should consider testing the space of Zone II on their phase map, for instance at very high particle concentrations and even lower rotational speeds, in order to show that their derivations match experiments.
 
 Good point. As mentioned in our answer above to comment #4 of the public review, Zone II lies beyond the measuring range of our experimental setup. Instead, we refer to the recent study of Wu et al. (2024) (reference number [64] of the revised manuscript) which demonstrated that dense scum layers of Microcystis colonies are aggregation-dominated. These mesocosm experiments agree with our model predictions and their parameter range falls within Zone II. We have included in the revised version, lines 328-337, a detailed discussion where we elucidate the parameter range covered in our experiments and compare our predictions for Zone II with the recent findings of Wu et al. (2024).
 
 The authors should show their calibration data and fit for the correction function of equation S1. Additionally, you may consider showing "raw" and "corrected" histograms of the size distribution, to demonstrate exactly what corrections are made.
 
 As mentioned in our answer above to comment #5 of the public review, we have included in the revised version of the Supporting Information the new Supplementary Figure S8, which shows the raw and adjusted histograms of the size distribution, including the associated uncertainties. Furthermore, the correction function is now explained in detail in the new Supporting Information Text in lines 785-796.
 
 The authors might consider commenting on Figure S3 a bit more in the main text. Even at very high dissipation rates, the cyanobacterial groups don't plummet to size 1, but stay in an equilibrium around 10-20x the diameter of a single cell. What might this mean for industrial applications trying to break up the groups?
 
 We agree with the reviewer that further discussion of Figure S3, panels E and F, is warranted. In the revised version of the manuscript, under section Fragmentation of Microcystis colonies occurs through erosion in lines 133-137, we have now included a discussion of this figure. Figure S3F shows that more than 90% of the total biovolume ends up in the category “small colonies” (mostly single cells and dimers); hence, most of the initially large colonies do fragment to single cells or dimers. Only about 5-10% of the biovolume remains as “large colonies” of 10-20 cells. Although it is challenging to draw definitive conclusions about the behavior of these remaining large colonies, as they account for only a minor fraction of the suspension, one hypothesis is that variability in mechanical properties between colonies results in a subset of colonies exhibiting exceptional resistance even to very high dissipation rates (see lines 133-137).
 
 Minor comments:
 
 Typo Caption of Figure 2: Should read [m^2/s^3] for units
 
 Thanks for catching this typo. The units in the caption of Figure 2 have been corrected to [m^2/s^3].
 
 There is no Equation 10 in Materials and Methods as indicated in the rheology section.
 
 We thank the reviewer for pointing out the lack of clarity in this algebraic manipulation. In fact, the yield stress has to be substituted in the current Equation 11 (previously Eq.10), from which the critical dissipation rate must be substituted in Equation 3. The result is the critical colony size (l* = 2.8) mentioned in line 243 of the revised manuscript. The correct equation numbers and algebraic substitutions are now indicated in lines 241-243 of the revised version of the manuscript.
 
 Reviewer #2 (Public review):
 
 Especially the introduction seems to imply that shear force is a very important parameter controlling colony formation. However, if one looks at the results this effect is overall rather modest, especially considering the shear forces that these bacterial colonies may experience in lakes. The main conclusion seems that not shear but bacterial adhesion is the most important factor in determining colony size. As the importance of adhesion had been described elsewhere, it is not clear what this study reveals about cyanobacterial colonies that was not known before.
 
 We would like to emphasize several key findings that our study reveals about the impacts of fluid flow on cyanobacterial colonies:
 
 (I) Quantification of mechanical strength in cyanobacterial colonies: Our results demonstrate the high mechanical strength of cyanobacterial colonies, as evidenced by the requirement of high shear rates to achieve fragmentation. This was not previously known for cyanobacterial colonies. To this end, our study highlights the resilience of these colonies against naturally occurring flows and bridges the gap between theoretical assumptions about colony strength and experimentally measured mechanical properties.
 
 (II) The discovery that the mechanical strength of colonies differs between colonies formed by cell division and colonies formed by aggregation. This, too, was not previously known for cyanobacterial colonies.
 
 (III) Validation of a hypothesis regarding colony formation: Using a fluid-mechanical approach, we confirm the findings of recent genetic studies (references 25 and 67 of the revised version of the manuscript) which indicated that colony formation occurs predominantly via cell division rather than cell aggregation under natural conditions (except in very dense blooms).
 
 (IV) Practical guidelines for cyanobacterial bloom control: Our findings provide valuable insights into the design of artificial mixing systems applied in several lakes. Artificial mixing of lakes is based on fundamentals of fluid flow, aiming at preventing aggregation of buoyant cyanobacteria in scum layers at the water surface. Our results show that the dissipation rates generated by bubble plumes in artificially mixed lakes can fragment cyanobacterial colonies formed by aggregation, but are not intense enough to cause fragmentation of division-formed colonies (see Figure 5 and lines 348-360).
 
 The agreement between model and experiments is impressive, but the role of the fit parameters in achieving this agreement needs to be further clarified.
 
 The influence of the fit parameters (namely the stickiness α1 and the pairs of colony strength parameters S1,q1,S2,q2) is discussed in the sections Dynamical changes in colony size modelled by a two-category distribution in lines 247-253 and Materials and Methods in lines 559-565. We kept the discussion concise to maintain readability. However, we agree with the reviewer that additional details about the importance of the fit parameters and the sensitivity of the results to these parameters could be beneficial. In the revised version of the section Materials and Methods in lines 560-563, we have included a detailed discussion of the fit parameters.
 
 The article may not be very accessible for readers with a biology background. Overall, the presentation of the material can be improved by better describing their new method.
 
 We apologize for the limited readability of the description of the experimental setup and model used. In the revised version of the manuscript and the SI, we have detailed further the new methods presented here. The modifications include a detailed description of the operating range of the cone-and-plate shear setup (subsection Cone-and-plate shear of the section Materials and Methods, in lines 462-473). Furthermore, we think that incorporation of the recent experimental results of Wu et al. (2024), on lines 331-337 of the manuscript, will appeal to readers with a biology background. Their mesocosm experiments support our model prediction that aggregation is the dominant mechanism for colony formation in region (II) of Figure 5.
 
 Reviewer #2 (Recommendations For The Authors):
 
 (1) The authors seem too modest in claiming technological advance. They should describe the technological advance of combining microscopy with rheometry, in such a way that this invites others to apply this or similar approaches on biological samples. Even though I feel that the advancement of knowledge of this system by their method is relatively modest, there may be more advances in other systems.
 
 We appreciate the positive view of the reviewer towards the importance of this technology and we agree that its advantages should be advertised to researchers investigating similar systems. We have now given more attention to the technological advance of combining microscopic imaging with rheometry in the final paragraph of the Conclusions (lines 386-400), where we now also briefly discuss an interesting recent study of marine snow (Song et al. 2023, Song and Rau 2022, reference numbers 70 and 71 of the revised manuscript), which used a similar combination of microscopy and rheometry as in our study. Furthermore, in the Methods section, we now briefly explain how the rheometry can be adjusted to investigate other systems (lines 474-480).
 
 (2) It seems reasonable -also based on what we already know about these aggregates - to assume that the main difference in shear sensitivity between field samples and cultures lies in the production of extracellular polysaccharide substance (EPS). To go beyond what is already known, the study could try to provide more direct and quantitative evidence for EPS involvement. For example, using a chemical quantification of EPS levels, or perturbing EPS levels using digestive enzymes.
 
 We agree with the reviewer that further characterization of the EPS is highly relevant to understand the mechanical strength of colonies. However, we believe that chemical quantification and/or degradation of EPS lies beyond the scope of our article and should be addressed by future studies.
 
 (3) Assuming EPS is indeed the reason for the differences in shear resistance: the authors speculate the reason why the field samples have more EPS lies in chemical composition (Calcium/nitrogen levels). In addition, there could be grazing that is known to promote aggregation (possibly increasing EPS), or just inherent genetic differences between strains. I am not necessarily expecting the authors to explore this direction experimentally, but it seems certainly feasible and would make the final result less speculative.
 
 We agree with the reviewer that there are more biotic and abiotic factors that can influence EPS amount and composition. The influence of grazing and other relevant factors on cell adhesion is discussed in references [26-29], cited in our introduction in lines 50-53. As discussed in our answer to recommendation #2, we believe that a quantitative investigation of these various factors is beyond the scope of this work and should be addressed in future studies.
 
 (4) A cool finding seems to be the critical relative diameter (Fig 2E), a colony size that seems invariant under shear. I was slightly surprised that the authors seem to take little effort to understand this critical diameter mechanistically (for example by predicting it, or experimentally perturbing it). Again, not a necessary requirement, but this is where the study could harness its technological advantage to provide a more quantitative understanding of something that goes beyond the existing knowledge of the system.
 
 We apologize to the reviewer if our descriptions and discussions of Figure 2 were unclear. One of the key conclusions from our experiments is that the critical relative diameter depends on the dissipation rate, as shown in Figure 2F. This dependence is also incorporated into the model through the constitutive equation (2). Furthermore, we expect the mechanical resistance of colonies, quantified by the critical relative diameter, to be affected by other biotic and abiotic factors that influence EPS amount and composition.
 
 (5) The jump from 0.019 to 1.1 m²/s³ seems large. What was the reason for not exploring intermediate values? The authors should also define low, modest and intense dissipation rates more clearly. Currently, they seem somewhat arbitrarily defined, i.e. 0.019 m²/s³ is described as low (methods) and moderate (results). In Fig 2, the authors further talk about low dissipation rates without a quantitative description.
 
 We thank the reviewer for pointing out the lack of clarity in the choice of parameter range and the nomenclature. Regarding the former, the suspension of division-formed colonies of Microcystis strain V163 displayed negligible fragmentation for dissipation rates between 0.019 and 1.1 m²/s³, as seen in Figures S2A and S3A. Due to the low sensitivity of the fragmentation results in this region, we don’t expect a change in behavior for intermediate values. Regarding the nomenclature, we have corrected the inconsistencies throughout the text. We have chosen to name the dissipation rate values as: low for values typical of wind-mixing, moderate for values typical of the core of bubble plumes, and intense for values typical of propellers. Whenever mentioned in the text, the numerical value of dissipation rate is also included to avoid doubt.
 
 (6) The structure and narrative of the paper can be improved. The article first describes all lab culture experiments and then the model, while the first figure already shows model fits. Perhaps it would be better to first describe the aggregation experiments, to constrain the appropriate terms of the model, and then move to fragmentation.
 
 We appreciate the recommendation of the reviewer regarding the structure. We have chosen to describe first the fragmentation experiments (Fig. 2), as these can be understood without introducing the aggregation effects. In contrast, the steady state results in the aggregation experiments (Fig. 3) come from the balance between aggregation and fragmentation. Therefore, we judged the current order to be more appropriate. The model fits are combined with the experimental results in Figures 2 and 3 to have a concise display. We have ensured that all the concepts required to understand each figure panel are explained prior to their discussion.
 
 (7) The number of data points that go into the histogram needs to be indicated. The main reason is that the authors report the distribution in terms of the biovolume fraction, suggesting the numerical counts are converted into volume. This to me seems like the most sensible parameter, but I could not find how this conversion is calculated (my apologies if I missed it). This seems especially relevant because a single large colony can impact this histogram quite considerably.
 
 We apologize for the lack of clarity in the calibration and conversion steps of the size distribution. As discussed above in the answer to comment #5 of the reviewer #1, more details of the calibration process have been added to the revised version of the Supporting Information Text in lines 785-796. Furthermore, the new Supplementary Figure S8 presents examples of the raw and adjusted size distribution, including the total number of counted colonies per histogram and the associated uncertainties in the concentration and biovolume distributions.
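 The reviewer's concern that a single large colony can dominate a biovolume-weighted histogram can be made concrete with a short sketch. It assumes, purely for illustration, that each colony is a sphere whose volume scales with the cube of the bin diameter; the authors' actual conversion (described in their Supporting Information) may differ, for instance by using the fractal dimension. The bin values and counts below are hypothetical.

```python
import math

def biovolume_fractions(bin_diameters, counts):
    """Convert colony counts per size bin into the fraction of total biovolume per bin.

    Assumes spherical colonies, so the volume of one colony is pi / 6 * d ** 3.
    """
    volumes = [c * math.pi / 6 * d ** 3 for d, c in zip(bin_diameters, counts)]
    total = sum(volumes)
    return [v / total for v in volumes]

# Hypothetical sample: 1000 single cells (relative diameter 1) and one colony
# with a relative diameter of 10.
fractions = biovolume_fractions([1.0, 10.0], [1000, 1])
print(fractions)
```

 Under this assumption, one colony ten cell diameters across carries as much biovolume as a thousand single cells, which is why the number of colonies counted per histogram matters for judging its uncertainty.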
 
 (8) Over the timescales measured here, colonies could start sinking (or floating), possibly in a size-dependent manner, that could lead to a bias due to boundary effects. Did the authors consider this potential artifact?
 
 The sinking or floating of colonies is a relevant process which was taken into account in the choice of our parameter range for the dissipation rate. The minimum dissipation rate used in our experiments ensures that the upward inertial velocity near stagnation is sufficient to counteract the sedimentation of colonies. A detailed discussion of the choice of the parameter range is now included in the revised version of the Materials and Methods in lines 462-473.
 
 (9) "On the one hand, sequencing of the genetic diversity within Microcystis colonies supports the hypothesis that colony formation under natural conditions is primarily driven by cell division [25]. On the other hand, cell aggregation can occur on a shorter time scale and may offer improved protection against high grazing pressure [26]." This appears somewhat constructed, as what is described as "on the other hand" is not evidence against the genetic diversity.
 
 We agree that the suggested dichotomy in this text appeared somewhat constructed, and we have now removed the wording “on the one hand” and “on the other hand”. The studies from reference [25] demonstrated that the genetic diversity between independent Microcystis colonies is much greater than the diversity within colonies. If cell aggregation was the dominant mechanism, a similar genetic diversity would be observed between and within colonies, which contrasts the findings from reference [25]. We have adjusted the text in the revised manuscript, in lines 46-54, to clarify this point.
 
 (10) The phase diagram seems largely based on extrapolations that are made outside of the measurement regime (e.g. dark red bars indicating the dissipation rate, Fig 5 - by the way 1 this color scheme could use some better contrast, by the way 2 Fig S7 suggests a wider dissipation rate range than indicated in Fig 5, why?). Hence there seems to be the need to more clearly delineate experimental results, simulations, and extrapolations in the phase diagram.
 
 We agree with the reviewer that further clarifications should be given about the parameter range covered in our experiments and apologize for the lack of readability in the color scheme of Fig 5. In lines 329-337, 346-347, 353-355, we have highlighted the parameter range covered by our experiments as well as the range covered by previous studies of wind-mixed mesocosms (namely reference [64] of the revised manuscript). Regarding the color scheme of Figure 5, we have modified the legend of the figure to improve readability. The color contrast was increased and leader lines were added to connect the colored bars with the respective label.
 
 (11) Unfortunately, the manuscript did not contain line numbers.
 
 We apologize to the reviewer for the lack of line numbers in our initial version. The revised version of the manuscript now contains line numbers, both in the main text and the supporting information.
 
 (12) Fig 2D. Caption is too minimal. Y-axis could better be named "Fraction of colonies" as both small and large colonies are plotted.
 
 The caption for Figure 2D was extended to better describe the plot. We have kept the y-axis label as “Fraction of small colonies”, since this is the quantity displayed by the three curves in the plot.
 
 (13) An inset should have axis labels.
 
 All the insets in our plots display the same variables as their respective plots. In order to keep the plots light and preserve readability, we therefore prefer to present the axis labels only along the x-axis and y-axis of the main plots, which implies by convention that the same axis labels also apply to the insets. To the best of our knowledge, this is a common approach.
 
 (14) Page 5, first words. Likely Fig 3A, not 2A was meant.
 
 We thank the reviewer for pointing out this readability issue. We intend to compare both Figures 2A and 3A. The text of the revised manuscript, in lines 146-148, has been adjusted with the correct figure numbers.
 
 (15) Introduction, second last paragraph, third last line. "suspension leaded to a broad distribution" I assume you meant "... led to a ..."
 
 We thank the reviewer for pointing out this typo. It has been corrected (line 122).
 
 AuthorResponse
Tags

Review 2

Review 1

AuthorResponse

Summary

Annotators

Public_Reviews

URL

arxiv.org/abs/2407.21115
www.biorxiv.org www.biorxiv.org

Birds migrate longitudinally in response to the resultant Asian monsoons of the Qinghai-Tibet Plateau uplift

3
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important and creative study finds that the uplift of the Qinghai-Tibet Plateau - via its resultant monsoon system rather than solely its high elevation - has shifted avian migratory directions from a latitudinal to a longitudinal orientation. The authors have expanded and clarified their lines of evidence (including an enlarged tracking set and explicit caveats on species-level eBird inference), such that the central claims are now solid. The conclusions - that monsoon dynamics, rather than elevation per se, are most consistent with observed longitudinal reorientation - illustrate how large, community-sourced and climate-model datasets can inform continent-scale shifts in migratory behavior over time that complement traditional approaches.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Joint Public Review:
  
  The study assesses how the rise of the Qinghai-Tibet Plateau affected patterns of bird migration between their breeding and wintering sites.
  
  This is an interesting topic and a novel theme. The visualisations and presentation are to a very high standard. The Introduction is very well-written and introduces the main concepts well, with a clear logical structure and good use of the literature. The Methods are detailed and well-described, and written in such a fashion that they are transparent and repeatable.
  
  Editorial note: These latest revisions are minor in the sense that they expand on the dataset but do not change the primary results.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Author response:
  
  The following is the authors’ response to the previous reviews
  
  Reviewer #1 (Public review):
  
  The authors have done a good job of responding to the reviewer's comments, and the paper is now much improved.
  
  Again, we thank the reviewer for positive comments during review.
  
  Reviewer #2 (Public review):
  
  I would like to thank the authors for the revision and the input they invested in this study.
  
  We are grateful for your thoughtful feedback and enthusiasm, which help us improve our manuscript.
  
  With the revised text of the study, my earlier criticism holds, and your arguments about the counterfactual approach are irrelevant to that. The recent rise of the counterfactual approach might likely mirror the fact that there are too many scientists behind their computers, and few go into the field to collect in situ data. Studies like the one presented here are a good intellectual exercise but the real impact is questionable.
  
  We understand your concern about the relevance of the counterfactual approach used in our study. Our intent in using a counterfactual scenario (reconstructing migration patterns assuming pre-uplift conditions on the QTP) was to isolate the potential influence of the plateau’s geological history on current migration routes. A similar approach has been widely used to estimate how biogeographic barriers facilitated divergent vertebrate communities across the world (e.g., Williams et al. 2024). We agree that such an approach must be used carefully. In the revision, we have explicitly clarified why this counterfactual comparison is useful – namely, it provides a theoretical baseline to test how much the QTP’s uplift (and the associated monsoon system) might have redirected migration paths (Gilbert and Lambert 2010, Sanmartín 2012, Bull et al. 2021). We acknowledge that the counterfactual results are theoretical and have explicitly emphasised the assumptions involved (i.e., species–environment relationships hold between pre- and post-uplift environments) in the main text (Lines 91-98). Nonetheless, we defend the approach as a valuable study design: it helps generate testable hypotheses about migration (for instance, that the plateau’s monsoon-driven climate, rather than just its elevation, introduces an east–west shift en route).
  
  References:
  
  Bull, J. W., N. Strange, R. J. Smith, and A. Gordon. 2021. Reconciling multiple counterfactuals when evaluating biodiversity conservation impact in social-ecological systems. Conservation Biology 35:510-521.
  
  Gilbert, D., and D. Lambert. 2010. Counterfactual geographies: worlds that might have been. Journal of Historical Geography 36:245-252.
  
  Sanmartín, I. 2012. Historical Biogeography: Evolution in Time and Space. Evolution: Education and Outreach 5:555-568.
  
  Williams, P. J., E. F. Zipkin, and J. F. Brodie. 2024. Deep biogeographic barriers explain divergent global vertebrate communities. Nature Communications 15:2457.
  
  All your main conclusions are inferred from published studies on 7! bird species. In addition, spatial sampling in those seven species was not ideal in relation to your target questions. Thus, no matter how fancy your findings look, the basic fact remains that your input data were for 7 bird species only! Your conclusion, “our study provides a novel understanding of how QTP shapes migration patterns of birds” is simply overstretching.
  
  We appreciate the reviewer’s comment here. We would like to clarify that our conclusions regarding longitudinal shifts in migratory distributions are based on distribution models derived from eBird data of 50 species, not merely on migration tracks from seven species. These species-level spatiotemporal models allow us to infer large-scale biogeographic patterns across the Qinghai-Tibet Plateau (QTP).
  
  The original seven tracking species were used specifically for analysing the relationship between migration directions (azimuths) and environmental variables, offering independent support for the patterns revealed in the eBird-based distribution models. Recognising the reviewer’s concern about sample size and coverage, we have now expanded this part by incorporating migration tracks from 12 additional species, derived through georeferenced digitisation of published migratory maps. Importantly, this expansion did not change our conclusions, i.e., that the monsoons, rather than the high elevations, play a prominent role in shaping the current migration direction of birds in the QTP. While the overall conclusion remains unchanged, the expanded dataset led to slight changes in the difference between spring and autumn migration. We have updated Figure 2 and the corresponding results and conclusions throughout the manuscript. We have also clarified in the Discussion that regions of the QTP with relatively less data might lead to underestimation of some migration routes, to make sure readers are aware of these data limitations (Lines 211-218).
  
  The way you respond to my criticism on L 81-93 is something different than what you admit in the rebuttal letter. The text of the ms is silent about the drawbacks and instead highlights your perspective. I understand you; you are trying to sell the story in a nice wrapper. In the rebuttal you state: “we assume species' responses to environments are conservative and their evolution should not discount our findings.” But I do not see that clearly stated in the main text.
  
  Thanks, as suggested we have clearly stated the assumptions of niche conservatism in the Introduction (Lines 91-98).
  
  In your rebuttal, you respond to my criticism of "No matter how good the data eBird provides is, you do not know population-specific connections between wintering and breeding sites" when you responded: ... "we can track the movement of species every week, and capture the breeding and wintering areas for specific populations" I am having a feeling that you either play with words with me or do not understand that from eBird data nobody will be ever able to estimate population-specific teleconnections between breeding and wintering areas. It is simply impossible as you do not track individuals. eBird gives you a global picture per species but not for particular populations. You cannot resolve this critical drawback of your study.
  
  We agree that inferring population-specific migratory connections (teleconnections) from eBird data is challenging and inherently limited. eBird provides occurrence records for species, but it generally cannot distinguish which breeding population an individual bird came from or exactly where it goes for winter. Our objective is not to determine one-to-one migratory links between specific populations, but to identify general broad-scale directional shifts when birds cross the QTP during their migration. We regret any confusion caused by our earlier wording. To make this clearer, we have now emphasised that our interest lies in migratory directions and their environmental correlates, rather than population assignments. We have also rephrased the relevant text to explicitly clarify that our study operates at the species level and at large spatial scales (Lines 253–257). We illustrate how the distributions of eBird observations and GPS tracking data of four species can differ from each other whilst showing similar migration patterns (Figure S10). We have also explicitly stated in the Discussion that confirming population connectivity would require targeted tracking or genetic studies, and that our eBird-based analysis could only suggest plausible routes and region-to-region linkages (Lines 200-202).
  
  I am sorry that you invested so much energy into this study, but I see it as a very limited contribution to understanding the role of a major barrier in shaping migration.
  
  We thank the reviewer for the honest assessment and understand the concern regarding the scope of our contribution. Our intention was not to provide an exhaustive account of all aspects of the QTP as a migratory barrier, but to address a specific and underexplored question: how the uplift of the plateau and the resulting monsoon system may have influenced the orientation of avian migration routes. By integrating both satellite tracking and community-contributed data, we have explored how the uplift of the QTP could shape avian migration across the area. We believe our findings provide important insights into how birds balance their responses to large-scale climate change and geological barriers, yielding the most comprehensive picture to date of how the QTP uplift has shaped migratory patterns of birds. We have also discussed the study’s limitations – including the small number of tracking species (Lines 205-218), the use of occurrence data as a proxy for breeding and wintering regions (Lines 200-202), the uneven sampling coverage in the QTP (Lines 202-205) and the assumptions behind the counterfactual scenario (Lines 91-98). This ensures that readers understand the context and constraints of our findings.
  
  My modest suggestion for you is: go into the field. Ideally use bird radars along the plateau to document whether the birds shift the directions when facing the barrier.
  
  We thank the reviewer for this suggestion. We agree that radar holds promise for understanding certain aspects of bird migration, particularly for detecting flight intensity, altitudes, and timing. However, radar systems currently struggle to resolve migration at the level of species, populations, or individuals, which are central to questions of migratory connectivity and route selection. Most radar signals cannot distinguish between species in mixed flocks, nor can they link breeding and wintering sites for tracked individuals. In addition, the spatial coverage of radar installations remains limited, especially across remote and high-elevation regions like the Qinghai-Tibet Plateau, where infrastructure and continuous power supply are still logistically prohibitive.
  
  The eBird dataset used in our study is itself a form of field-based observation, contributed by tens of thousands of birdwatchers across continents, including the QTP region (Figure S11). While eBird cannot provide individual-level tracking, it captures spatiotemporal patterns of occurrence at broad scales, making it a valuable complement to satellite tracking data. We would also emphasise that our team has extensive field experience in the Qinghai-Tibet Plateau (about twenty years), including multi-year expeditions to deploy satellite tags and observe migration at stopover sites.
  
  We agree that more direct tracking (e.g. GPS tagging) would be an ideal way to validate migration pathways and population connectivity. Using the satellite-tracking data, we have shown that most tracked species shifted their migration direction when facing the QTP (Figure S6). In this revision, as stated, we have added satellite tracking routes for 12 more species. We have also noted that future studies should build on our findings by using dedicated tracking of more individual birds and monitoring of migration over the QTP. We have cited recent advances in these techniques and suggested that incorporating more tracking data could further test the hypotheses generated by our work (Lines 205-218).
  
  Reviewer #2 (Recommendations for the authors):
  
  L55 "an important animal movement behaviour is.." Is there any unimportant animal movement? I mean this sentence is floppy, empty.
  
  We used this sentence to introduce migration. We have removed “important” to reduce ambiguous phrasing.
  
  L 152-154 This sentence is full of nonsense or your misinterpretation. First of all, the issue of inflexible initiation of migration was related to long-distance migrants only! The way you present it mixes apples and oranges (long- and short-distance migrants). It is not "owing to insufficient responses" but due to inherited patterns of when to take off, photoperiod and local conditions.
  
  We stated that this claim is invoked for long-distance migrants before this sentence and have rewritten the sentence to highlight that this interpretation is for long-distance migrants.
  
  L 158 what is a migration circle? I do not know such a term.
  
  We have amended it as “annual migration cycle”, which is a more common way to describe the yearly round-trip journey between breeding and wintering grounds of birds.
  
  L 193 The way you present and mix capital and income breeding theory with your simulation study is quite tricky and super speculative.
  
  We thank the reviewer for raising this important concern. We have presented this idea as an inference rather than a conclusion: “This pattern could be consistent with a ‘capital breeding’ strategy, where birds rely on endogenous reserved energy gained prior to reproduction, rather than an ‘income’ strategy where birds ingest nutrients mainly collected during the period of reproductive activity. This corroborates studies on breeding strategies of migratory birds in Asian flyways. However, we note that this interpretation would require further study.” By adding this caution, we made it clear that we are not asserting this link as proven fact, only suggesting it as one possible explanation. We have also double-checked that the rest of the discussion around this point is framed appropriately. Moreover, to help illustrate why we raised this ecological interpretation, we would also draw attention to examples of satellite tracking points from several species (e.g., Beijing Swift, Demoiselle Crane) in the following, which show obvious shifts in migratory direction near the QTP region. These turning points suggest potential behavioral responses to environmental constraints, such as climatic corridors or energy availability, which could help motivate our discussion of possible capital breeding strategies in these species.
  
  AuthorResponse
Visit annotations in context

Tags

Summary

Review 1

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.10.21.619453v3
www.biorxiv.org www.biorxiv.org

Generation of knock-in Cre and FlpO mouse lines for precise targeting of striatal projection neurons and dopaminergic neurons

5
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important work has the potential to expand the repertoire of transgenic animals for systems neuroscience investigations across multiple fields. The generation of new reagents has the potential to open new directions in experimental design, and the Cas9-based approach for generating mice may provide additional benefits compared to existing BAC transgenic mouse lines. However, whereas some of the imaging data are compelling, quantitative analysis of transgene fidelity is incomplete, as it relies on a qualitative description of reporter XFP expression at low magnification, with some electrophysiological characterization.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  I read with much attention the manuscript titled "Generation of knock-in Cre and FlpO mouse lines for precise targeting of striatal projection neurons and dopaminergic neurons" in which the authors reveal five transgenic lines to target diverse neuronal populations of the basal ganglia. In addition, the authors also provide some assessments of the functionality of the lines.
  
  Strengths:
  
  Knockin lines made readily available through Jackson. Lines show specific expression.
  
  Weaknesses:
  
  Although I have no doubt these knock-in lines will be broadly used by researchers in the field, I find the scientific advances of the study and the breadth of the resource provided quite limited. This is partly because 4 of these lines have been generated by other laboratories. For instance, there are already two other Dat-FlpO lines generated (JAX#: 033673 and 035436), with one of them already characterized (PMID: 33979604). Similarly, Drd1-Cre and Adora2a-Cre have been used abundantly since they were generated over a decade ago, and a novel Drd1-FlpO line has been characterized thoroughly recently (PMID: 38965445). Indeed, some of these lines were BAC transgenic, and I agree with the authors that there is a sound rationale for generating knock-in mice; however, the authors should then demonstrate if/how their new drivers are superior. Overall, the valuable resource generated by the authors would benefit from additional quantification and validation.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors report the generation and validation of new knock-in mouse lines enabling precise targeting of basal ganglia projection neurons and midbrain dopamine neurons. By inserting recombinase sequences at endogenous loci, they provide tools that improve on older BAC-based models, with the additional benefit that all lines are openly available through Jackson Laboratories. This work is timely, fills a longstanding gap for the community, and will support both basic circuit mapping and disease-related research.
  
  Strengths:
  
  The major strength of this study is the provision of new genetic resources that will be widely used by the basal ganglia and dopamine research communities. Anatomical and electrophysiological data indicate appropriate expression and preserved intrinsic properties. The Flp lines, in particular, show labeling largely confined to basal ganglia circuits, making them especially attractive for circuit-based studies. A further strength is the use of a T2A-recombinase insertion at the native gene stop codon, which preserves endogenous regulation and maintains near-physiological expression of Adora2a, Drd1a, and DAT. The availability of both Cre and Flp versions enables powerful intersectional strategies, and open distribution through Jackson Laboratories ensures broad accessibility and long-term value.
  
  Weaknesses:
  
  The major limitation is the discrepancy between Cre and Flp lines, with Cre generally driving broader expression than Flp. This raises concerns about anatomical fidelity that require validation at the cellular level. For the DAT-FlpO line, efficiency remains insufficiently quantified, and higher-resolution co-labeling with TH immunostaining is needed. Electrophysiological comparisons between Cre and Flp versions are also incomplete; current data suggest potential physiological differences, which warrant additional statistical testing and, at a minimum, explicit discussion in the manuscript.
  
  Review 2
4. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  Using the latest knock-in technology, the authors generated a set of five mouse lines with expression of recombinases in striatal projection neurons and dopaminergic neurons for public use. They rigorously characterize the expression of the recombinases by intersectional crossing with reporter lines to demonstrate that these lines are faithful, and they perform electrophysiological experiments in slices to provide evidence that the respective neurons show the expected features in these assays.
  
  Strengths:
  
  The characterization of the new mouse lines is exceptional, and these will be widely used by the community. The mouse lines are openly available for the community to use.
  
  Weaknesses:
  
  No weaknesses were identified by this Reviewer.
  
  Review 3
5. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Author response:
  
  We thank all three reviewers for their thoughtful and constructive evaluations of our manuscript, “Generation of knock-in Cre and FlpO mouse lines for precise targeting of striatal projection neurons and dopaminergic neurons.” We are encouraged that the reviewers recognize the value, specificity, and utility of these new lines for the basal ganglia and dopamine research communities. Below, we summarize our planned revisions and clarifications in response to the reviewers’ comments.
  
  (1) Novelty and comparison with existing lines
  
  We appreciate Reviewer 1’s point regarding the existence of previously generated Cre and Flp lines targeting similar neuronal populations. Our project was initiated six years ago, and during the course of generating and characterizing all five lines, we became aware that similar individual lines have since been developed by other groups. Nevertheless, our study provides a coordinated and independently validated set of lines created using a standardized knock-in (KI) strategy and distributed through Jackson Laboratories for unrestricted community use. Importantly, whereas previous BAC transgenic approaches rely on random insertion, which can lead to position effects and ectopic expression, our design places the recombinase coding sequence immediately downstream of the endogenous stop codon using a self-cleaving T2A peptide. This ensures expression under native promoter and regulatory control, preserving physiological gene regulation.
  
  To address the Reviewers’ points, we will (i) expand the Introduction and Discussion to clarify the rationale and advantages of endogenous promoter–driven recombinase expression over BAC-based systems, emphasizing that our lines provide a uniform, promoter-controlled, and publicly accessible toolkit for the community, and (ii) explore including a comparative table summarizing differences in construct design, expression fidelity, and recombination efficiency across published lines (e.g., PMID 33979604, 38965445).
  
  (2) Quantification, validation, and comparison of Cre vs FlpO
  
  We agree with Reviewers 1 and 2 that further quantification and discussion of Cre versus FlpO fidelity will strengthen the manuscript. The observed difference in expression breadth between Cre and FlpO lines likely reflects a fundamental property of the recombinases themselves rather than a discrepancy in targeting. Cre recombinase is significantly more enzymatically efficient than FlpO, meaning that even very low endogenous levels of gene expression (e.g., Drd1a or Adora2a) can drive Cre-dependent recombination, whereas FlpO requires higher expression thresholds. Consequently, reporter-based readouts will inherently appear broader for Cre lines, despite both being driven by the same endogenous promoters.
  
  To address these points, we will (i) provide quantitative co-labeling analyses for the DAT-FlpO line with TH immunostaining to assess efficiency and specificity, (ii) clarify in the Results and Discussion that differences between Cre and FlpO expression patterns largely stem from differences in recombinase kinetics and sensitivity, not mismatched promoter activity, and (iii) include representative high-resolution images and relevant statistics in the revised figures. Importantly, we would like to note that RNAscope may not be an ideal validation approach in this context, as in situ transcript detection cannot capture the enzymatic threshold differences that determine reporter recombination and thus will not help address observed differences between Cre and FlpO lines. Finally, we are actively performing electrophysiological comparisons between Cre and FlpO lines to rigorously quantify potential physiological differences between them. Updated analyses will be incorporated as available or described as ongoing future work.
  
  (3) Discussion of scope and interpretation
  
  We appreciate the reviewers’ suggestions to better contextualize the scope of this resource. We will revise the Discussion to (i) highlight that the Cre–FlpO pairings enable powerful intersectional and cross-line strategies for dissecting basal ganglia and midbrain circuitry, and (ii) clarify that our goal was to generate a rigorously validated foundational resource, with detailed functional comparisons and manipulation studies to be explored in subsequent work.
  
  In summary, we thank the reviewers for their insightful feedback. The planned revisions and clarifications will underscore the unique strengths of our knock-in design, explore potential Cre–FlpO differences, and highlight the value of this standardized and accessible toolkit for the neuroscience community.
  
  AuthorResponse
Visit annotations in context

Tags

Summary

Review 1

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.06.15.659794v1
www.biorxiv.org www.biorxiv.org

Decapping activators Edc3 and Scd6 act redundantly with Dhh1 in post-transcriptional repression of starvation-induced pathways

5
1. Public_Reviews 09 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This important study reports on the redundant roles of the decapping activators Edc3 and Scd6 in orchestrating post-transcriptional programs to modulate metabolic responses to nutrients in yeast. The authors employed mutagenesis studies in conjunction with a battery of transcriptome-wide analyses to provide convincing evidence supporting their conclusions. Considering the broad implications of post-transcriptional regulation of gene expression, this study will be of interest across a variety of biomedical disciplines ranging from biochemistry and molecular and cellular biology to those specializing in studying various pathologies.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 mRNA decapping and decay factors play critical roles in post-transcriptionally regulating gene expression. Here, Kumar and colleagues investigate how deleting two yeast decapping enhancer proteins (Edc3 and Scd6), either alone or in tandem, affects the transcriptome. Using RNA-Seq, CAGE-Seq and ribosome profiling, they conclude that these factors generally act in a redundant fashion, with a mutant lacking both proteins showing an increased abundance of select mRNAs. As these upregulated transcripts are also upregulated in mutants lacking the decapping enzyme, Dcp2, and show no increases in transcription of their cognate genes, the authors conclude that this is at the level of mRNA decapping and decay. This was further supported by CAGE-Seq analyses carried out in WT cells and the scd6∆edc3∆ double mutant. Their ribosome profiling data also lead them to conclude that Scd6 and Edc3 display functional redundancy and cooperativity with Dhh1/Pat1 in repressing the translation of specific transcripts. Finally, as their data suggest that Scd6 and Edc3 repress mRNAs coding for proteins involved in cellular respiration, as well as proteins involved in the catabolism of alternative carbon sources, they go on to show that these decapping activators play a role in repressing oxidative phosphorylation.
 
 Strengths:
 
 Overall, this manuscript is well-written and contains a large amount of compelling high-quality data and analyses. At its core, it helps to shed light on the overlapping roles Edc3 and Scd6 have in sculpting the yeast transcriptome.
 
 Weaknesses:
 
 While not essential, it would be interesting if the authors carried out add-back experiments to determine which domain within Scd6/Edc3 plays a critical role in enforcing the regulation that they see. Their double mutant now puts them in a perfect position to carry out such experiments.
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 This manuscript by Kumar and Zhang presents compelling evidence that the Edc3 and Scd6 decapping activators present a high degree of redundancy that can only be overcome by double mutants of both. In addition, the authors provide strong evidence for their role in regulating starvation-induced pathways, as evidenced by measurements of mitochondrial membrane potential, metabolomics and analysis of the flux of Krebs cycle intermediates.
 
 Strengths:
 
 Kumar, Zhang et al provide multiple sources of evidence for the direct mechanism of Edc3 and Scd6, by using and comparing different approaches such as mRNA-seq, ribosome occupancies and translational efficiencies. Through extensive analysis, the authors show that this complex can also regulate genes outside the Environmental Stress Response (non-iESR) that are significantly up-regulated in all three mutants. Remarkably, the gene ontology analysis of these non-iESR genes identifies enrichment for mitochondrial proteins that are implicated in the Krebs cycle. Overall, this study adds novel mechanistic insight into how nutrients control gene expression by modulating decapping and translational repression.
 
 Weaknesses:
 
 The authors show very nicely that growth phenotypes from scd6Δedc3∆ can be rescued by transformation of EDC3 (pLfz614-7) or SCD6 (pLfz615-5). Future work could make use of these rescue strategies, for example as a platform to further characterise protein-protein interactions between Edc3, Scd6 and Dhh1.
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 In this paper, Kumar et al investigated the role of two decapping activators, Edc3 and Scd6, in regulating mRNA decay and translation in yeast. Using a variety of approaches including RNA-seq, ribosome profiling, proteomics, polysome analysis, and metabolomics the authors demonstrate that whereas single deletions of Edc3 or Scd6 have modest effects, the double mutant leads to increased abundance of mRNAs, many of which overlap with those targeted by the decapping activators Dhh1 and Pat1. The data suggest that Edc3 and Scd6 function redundantly to recruit Dhh1 to the Dcp2 decapping complex, thereby promoting mRNA turnover and translational repression. The authors show that these factors cooperate with Dhh1/Pat1 to repress transcripts involved in respiration, mitochondrial function, and alternative carbon source utilization, linking post-transcriptional regulation to nutrient responses. The study establishes Edc3 and Scd6 as important, but redundant regulators that fine-tune gene expression and metabolic adaptation in response to nutrient availability.
 
 Strengths:
 
 The paper has several strengths, including the comprehensive approach taken by the authors using multiple experimental techniques (RNA-seq, ribosome profiling, Western blotting, TMT-MS, polysome profiling, and metabolomics) to provide multiple lines of evidence to support their conclusions. The authors demonstrate clear redundancy of the factors by using single and double mutants for Edc3 and Scd6 and their global approach enables an understanding of these factors' roles across the yeast transcriptome. The work connects post-transcriptional processes to nutrient-dependent gene regulation, providing insights into how cells adapt to changes in their environment. The authors demonstrate the redundant roles of Edc3 and Scd6 in mRNA decapping and translation repression. Their RNA-seq and ribosome profiling results convincingly show that many mRNAs are derepressed only in the double mutants, confirming their hypothesis of redundancy. Furthermore, the functional cooperation between Edc3/Scd6 and Dhh1/Pat1 in regulating specific metabolic pathways, including mitochondrial function and carbon source utilization, is supported by the metabolomic data.
 
 Weaknesses:
 
 The study uses indirect evidence to support claims about the effect on mRNA stability rather than directly measuring mRNA stability. However, the combination of Pol II occupancy and RNA abundance measurements is consistent with the claims regarding mRNA stability. The addition of new experiments in the revision co-IPing Dhh1 and Dcp2 strengthens the argument that Edc3 and Scd6 recruit these factors.
 
 Review 3
5. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public review):
 
 Strengths:
 
 Overall, this manuscript is well-written and contains a large amount of high-quality data and analyses. At its core, it helps to shed light on the overlapping roles of Edc3 and Scd6 in sculpting the yeast transcriptome.
 
 Weaknesses:
 
 (1) While the data presented makes conclusions about mRNA stability based on corresponding ChIP-Seq analyses and analyzing other mutants (e.g. Dcp2 knockout), at no point is mRNA stability actually ever directly assessed. This direct assessment, even for select transcripts, would further strengthen their conclusions.
 
 We appreciate the reviewer’s concern but wish to emphasize that we conducted ChIP-Seq analysis of RNA Polymerase II occupancies in the CDSs of all genes, known to be a reliable indicator of transcription rate, and found only small increases in Pol II occupancies that cannot account for the increased transcript levels of the cohort of mRNAs up-regulated in the scd6∆edc3∆ double mutant (Fig. 3E). This provides strong evidence that increased transcription is not the main driver of increased mRNA abundance in this mutant. Bolstering this conclusion, we showed that the Hap2/Hap3/Hap4/Hap5 complex of transcription factors responsible for induction of Ox. Phos. genes was not activated in scd6Δedc3Δ cells in glucose medium (Fig. 6F(ii)); nor was the Adr1 activator of CCR genes activated (Fig. S9C(i)), ruling out transcriptional induction of their target genes in glucose-replete scd6Δedc3Δ cells and instead favoring reduced degradation as the mechanism underlying derepression of Ox. Phos. and CCR gene transcripts in this mutant. In Fig. 3B, we further showed that the majority of mRNAs up-regulated in the scd6Δedc3Δ double mutant are also derepressed by dcp2Δ, and in Fig. 3D that the mRNAs up-regulated in scd6∆edc3∆ cells exhibit a higher than average codon protection index (CPI), indicating a heightened involvement of decapping and co-translational degradation by Xrn1 in their decay. To provide additional support for our conclusion, we have conducted new experiments to measure the abundance of capped mRNAs genome-wide by CAGE sequencing of total mRNA in both WT and scd6∆edc3∆ cells. As established previously, normalizing CAGE TPMs to total mRNA TPMs determined by RNA-Seq, dubbed the C/T ratio, provides a reliable measure of the capped proportion of each transcript. The new data presented in Fig. 3C indicate that the mRNAs up-regulated in the scd6∆edc3∆ mutant have significantly lower than average C/T ratios in WT cells, whereas the C/T ratios for the down-regulated transcripts are higher than average, and that these differences between the two groups and all expressed mRNAs are diminished in the scd6∆edc3∆ double mutant. These are the results expected if the up-regulated mRNAs are selectively targeted for decapping in WT cells dependent on Edc3/Scd6, whereas the down-regulated mRNAs are targeted by Edc3/Scd6 less than the average transcript. In the original version of the paper, we came to the same conclusion by analyzing our previous CAGE data for the dhh1∆ mutant for the same transcripts dysregulated in scd6∆edc3∆ cells, now presented as supportive data in Fig. S3F. Finally, we added the fact that among all four Dhh1 target mRNAs examined in the previous study of He et al. (2022) and found here to be up-regulated selectively in the scd6∆edc3∆ double mutant (Fig. S10), two of them (SDS23 and HXT6) were shown directly to have longer half-lives in dhh1∆ vs. WT cells by He et al. (2018). Hence, the combined evidence is compelling that selective up-regulation of particular mRNAs in the scd6∆edc3∆ mutant results from diminished decapping/decay rather than enhanced transcription; and we feel that the additional supporting evidence that would be provided by measuring half-lives of a small group of up-regulated transcripts would not justify the considerable effort required to do so. Moreover, the standard approach for such experiments of impairing transcription with an inhibitor of Pol II or a Pol II Ts- mutation has been criticized because of the known buffering (suppression) of mRNA decay rates in response to impaired transcription.
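 The C/T ratio mentioned in this response (CAGE TPM divided by RNA-Seq TPM for each transcript) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name, the gene names, and the TPM values are invented placeholders, not data or code from the study.

```python
from statistics import median


def ct_ratios(cage_tpm, rnaseq_tpm):
    """Per-gene C/T ratio: CAGE TPM divided by RNA-Seq TPM."""
    ratios = {}
    for gene, total in rnaseq_tpm.items():
        # Skip genes with no RNA-Seq signal or no CAGE measurement.
        if total > 0 and gene in cage_tpm:
            ratios[gene] = cage_tpm[gene] / total
    return ratios


# Invented placeholder values, not measurements from the paper.
cage = {"geneA": 20.0, "geneB": 90.0, "geneC": 50.0}
rnaseq = {"geneA": 100.0, "geneB": 100.0, "geneC": 100.0, "geneD": 0.0}

ratios = ct_ratios(cage, rnaseq)
all_median = median(ratios.values())

# Hypothetical "up-regulated" group; a lower median C/T ratio than the
# full set would be read as a smaller capped proportion for that group.
up_group = ["geneA"]
up_median = median(ratios[g] for g in up_group)
print(up_median < all_median)
```

 In a real analysis the comparison would be made across thousands of transcripts with an appropriate statistical test; the sketch only shows how the ratio itself is formed.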
 
 (2) Scd6 and Edc3 show a high level of functional redundancy, as demonstrated by the double mutant. As these proteins form complexes with other decapping factors/activators, I'm curious if depleting both proteins in the double mutant destabilizes any of these other factors. Have the authors ever assessed the levels of other key decapping factors in the double mutants (i.e. Dhh1, Pat1, Dcp2...etc)? I wonder if depleting both proteins leads to a general destabilization of key complexes. It would also be interesting to see if depleting Edc3 or Scd6 leads to a concomitant increase in the other protein as a compensatory mechanism.
 
 We thank the reviewer for this insight. Examining our Ribo-Seq and TMT-MS data revealed that Dhh1 expression and steady-state abundance are increased ~2-fold in the scd6∆edc3∆ strain, indicating that the up-regulation of many of the same mRNAs by scd6∆edc3∆ and dhh1∆ does not result indirectly from reduced levels of Dhh1 in the scd6∆edc3∆ mutant. The predicted increase in Dhh1 expression might signify a compensatory response to the absence of Scd6/Edc3. We also observed an ~40% reduction in Dcp2 translation (RPFs) and mRNA abundance in the scd6∆edc3∆ strain, which might contribute to the up-regulation of mRNAs dysregulated in this mutant. However, our new immunoblot analyses revealed no significant reduction in steady-state Dcp2 levels in scd6∆edc3∆ cells (Input lanes in Figs. 3F and S4C(i)-(ii)). Moreover, our previous finding that the majority of mRNAs subject to NMD, up-regulated by both upf1∆ and dcp2∆, are not up-regulated by scd6∆edc3∆ implies that Dcp2 abundance in scd6∆edc3∆ cells is adequate for normal levels of NMD and favors a direct role for Scd6/Edc3 in accelerating degradation of most transcripts up-regulated in this mutant. We have added these points to the DISCUSSION.
 
 (3) While not essential, it would be interesting if the authors carried out add-back experiments to determine which domain within Scd6/Edc3 plays a critical role in enforcing the regulation that they see. Their double mutant now puts them in a perfect position to carry out such experiments.
 
 We agree with the reviewer that our scd6∆edc3∆ strain provides an opportunity to dissect the Scd6 and Edc3 proteins to determine which domains and motifs of each protein are most critically required for their functions in activating mRNA decay. However, if conducted thoroughly, this would entail an extensive analysis requiring a combination of genetics, biochemistry and genomics. Considering the large amount of data already presented in 43 and 34 panels of main and supplementary figures, respectively, we feel that these additional experiments would be conducted more appropriately as a stand-alone follow-up study.
 
 Reviewer #2 (Public review):
 
 Weaknesses:
 
 The authors show very nicely in Figure S1A that growth phenotypes from scd6Δedc3∆ can be rescued by transformation of EDC3 (pLfz614-7) or SCD6 (pLfz615-5). The manuscript might benefit from using these rescue strategies in the analysis performed (e.g. RNA-seq, ribosome occupancies, and translational efficiencies). Also, these rescue assays could provide a good platform to further characterise the protein-protein interactions between Edc3, Scd6, and Dhh1.
 
 We responded to this point immediately above in responding to Rev. #1.
 
 Reviewer #3 (Public review):
 
 Weaknesses:
 
 The limitations of the study include the use of indirect evidence to support claims that Edc3 and Scd6 recruit Dhh1 to the Dcp2 complex, which is inferred from correlations in mRNA abundance and ribosome profiling data rather than direct biochemical evidence.
 
 While the reviewer makes a valid point, it is important to note that the greater correlations between effects of scd6∆edc3∆ with those conferred by dhh1∆ vs. pat1∆ also extended to changes in metabolites (Fig. 7A-C). To provide more direct evidence that Edc3 and Scd6 recruit Dhh1 to the Dcp2 complex, we have now conducted co-immunoprecipitation experiments (presented in new Figs. 3F and S5) demonstrating that association of Dhh1 with Dcp2 is diminished in the scd6∆edc3∆ double mutant but not in either scd6∆ or edc3∆ single mutant, thus providing biochemical support for our proposal.
 
 Also, there is limited exploration of other signals as the study is focused on glucose availability, and it is unclear whether the findings would apply broadly across different environmental stresses or metabolic pathways. Nonetheless, the study provides new insights into how mRNA decapping and degradation are tightly linked to metabolic regulation and nutrient responses in yeast. The RNA-seq and ribosome profiling datasets are valuable resources for the scientific community, providing quantitative information on the role of decapping activators in mRNA stability and translation control.
 
 While not disputing the facts of this comment, we think it is unjustified to label as a weakness that our study focused on glucose-grown cells considering the large amount of new data and insights made possible by our multi-omics approach, presented in >70 separate figure panels and nine supplementary datafiles, which the reviewer has characterized as being valuable to the scientific community. Parallel studies in non-preferred carbon or nitrogen sources are underway and represent large-scale investigations in their own right, for which the current dataset in glucose-replete cells provides the critical reference condition.
 
 Reviewer #1 (Recommendations for the authors):
 
 The authors made a note that a set of 37 mRNAs is repressed exclusively by Edc3 with little contribution by Scd6, a list that includes the RPS28B mRNA. Edc3 has been previously reported to promote the decay of this mRNA in a deadenylation-independent fashion by binding to an element in its 3'UTR (PMIDs 15225544, 24492965). Can the authors comment on whether Edc3 may be binding to similar elements in the 3'UTRs of these transcripts in their shortlist? This could be an interesting topic matter for discussion as well.
 
 While an interesting idea, this seems unlikely because the 3’UTR sequence in RPS28B mRNA was shown to bind Rps28 protein itself to confer heightened decapping and decay dependent on Edc3 in a negative autoregulatory loop that exerts tight control over Rps28 protein levels. It would be surprising if Edc3-mediated repression of the other 36 mRNAs would involve Rps28 as none of them encode cytoplasmic ribosomal proteins. Nevertheless, we searched for a conserved motif among the 3’UTRs of the 37 mRNAs using the MEME suite and found enrichment for motifs identified for RNA binding proteins Hrp1 and Nab2 and two novel motifs, but none of these motifs could be recognized within the Rps28 autoregulatory loop. We have chosen not to comment on these findings in the revised manuscript to avoid lengthening it unnecessarily with inconclusive observations.
 
 Reviewer #2 (Recommendations for the authors):
 
 The authors show very nicely in Figure S1A that growth phenotypes from scd6Δedc3∆ can be rescued by the transformation of EDC3 (pLfz614-7) or SCD6 (pLfz615-5). The manuscript might benefit from using these rescue strategies on the analysis performed (e.g. RNA-seq, ribosome occupancies, and translational efficiencies); or expressing truncated mutants of EDC3 (pLfz614-7) or SCD6 (pLfz615-5), to show that they can act as dominant negative competitors, either on the binding to Dhh1 and Dcp2.
 
 We addressed this comment above in our response to this Reviewer.
 
 Reviewer #3 (Recommendations for the authors):
 
 (1) Labels such as "mRNA_up_s6,e3" are not defined in figures or the text. I suggest clearer sample labeling throughout.
 
 The labels had been defined at first mention in the RESULTS but are now indicated there more explicitly, as well as in the legend to Fig. 1.
 
 (2) In Figure 1D it is surprising that the mRNA profile has a peak in the 5' UTR. I would expect to see such a peak in ribosome footprinting data. Is it possible these are incorrectly labeled?
 
 The figure is correctly labeled. Generally, one does not expect to see RPFs in the 5’UTR region unless there is an efficiently translated uORF, which appears not to be the case for MDH2.
 
 In general, the information in this panel and C is inadequate. None of the numbers are clearly explained in the figure legend or in the figure.
 
 We had cited the legend to Fig. S3C for details of all such gene browser images but have now inserted this information into the Fig. 1D legend, at the first occurrence of such data in the regular figures.
 
 (3) Figures 1C and 1D are in the wrong order.
 
 Corrected.
 
 (4) Figure 2D is a very complicated Venn Diagram. I suggest using UpSet plots as an alternative to Venn diagrams to more clearly convey overlaps between sets.
 
 We provided additional explanatory text in the Fig. 2D legend to facilitate understanding.
 
 (5) The use of the same color scheme to represent different sets in panels of the same figure is a source of confusion. E.g. the cyan in Figures 2A, 2D, and 2E indicates unrelated categories, but one would think they are related.
 
 The use of the same cyan color in these three figure panels actually does designate results for the same set of 591 mRNAs up-regulated in the three mutants. The application of the color schemes is now mentioned explicitly in Figs. 1, 2, and S3.
 
 (6) Reporting of p-values = 0 in figures is not useful.
 
 Corrected.
 
 (7) The whole manuscript is extremely long which reduces the overall impact. For example, the introduction is six pages long. I suggest reducing redundant text and being more concise to enhance readability.
 
 We tried to streamline the text wherever possible, in particular shortening the Introduction by two pages.
 
 (8) Many abbreviations are used throughout the text that are not introduced the first time they are used.
 
 Corrected throughout.
 
 (9) The ERCC normalization is unclear. Were the spike-ins added before cell lysis to allow estimation of per-cell RNA counts or to the extracted RNA? If added to extracted RNA rather than cells it is not clear to me how the claim can be made regarding increased mRNA abundance in the mutants.
 
 We thank the reviewer for this comment. As we explained in the Methods, 2.4 µl of 1:100 diluted ERCC RNA Spike-In Control Mix 1 was added to 1.2 µg of each total RNA sample prior to cDNA library preparation. Because the majority of total RNA is comprised of rRNA, this normalization yields the abundance of each mRNA relative to rRNA. Owing to repression of rESR mRNAs encoding ribosomal proteins and biogenesis factors in the scd6∆edc3∆ strain (Fig. S3D), the ribosome content per cell is expected to be reduced in this mutant vs. WT. We showed previously that the isogenic dcp2∆ mutant, which elicits an ESR response of similar magnitude, showed a 30% reduction in bulk ribosomal subunits per cell compared to the same WT strain examined here (Vijjamarri et al., 2023). Assuming a similar reduction in ribosome abundance in the scd6∆edc3∆ mutant, the changes in mRNA per cell conferred by the scd6∆edc3∆ mutation are expected to be 0.7-fold of the ERCC-normalized values given in Fig. 3E, yielding fold-changes of 2.00 and 0.62 for the mRNA_up and mRNA_dn groups, respectively, which still differ substantially from the corresponding changes in normalized Rpb1 occupancies of 1.2 and 0.93, respectively. We have added this new analysis to the text of RESULTS.
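 The rescaling described in this response is simple arithmetic and can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function name is hypothetical, and because the response does not quote the ERCC-normalized inputs from Fig. 3E, the input values below are back-calculated from the reported results (2.00 and 0.62) purely to show the direction of the correction.

```python
def per_cell_fold_change(ercc_normalized_fc, ribosome_ratio=0.7):
    """Scale an rRNA-relative fold change by the assumed mutant/WT
    ribosome content per cell (0.7 corresponds to a 30% reduction)."""
    return ercc_normalized_fc * ribosome_ratio


# Back-calculated illustrative inputs (assumption, not values from Fig. 3E).
ercc_fc = {"mRNA_up": 2.00 / 0.7, "mRNA_dn": 0.62 / 0.7}

# Rpb1 occupancy changes quoted in the response.
rpb1_fc = {"mRNA_up": 1.2, "mRNA_dn": 0.93}

for group, fc in ercc_fc.items():
    adjusted = per_cell_fold_change(fc)
    # The adjusted mRNA change is compared with the Pol II occupancy change.
    print(group, round(adjusted, 2), rpb1_fc[group])
```

 The point of the comparison is that even after the 0.7-fold correction, the mRNA changes remain larger in magnitude than the Rpb1 occupancy changes for the same groups.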
 
 (10) The use of the terms "up-regulated" and "derepressed" throughout is confusing. Both refer to observed increased abundance of mRNAs, but they imply different causes which are never clearly defined.
 
 We changed all occurrences of “derepressed” to “up-regulated”.
 
 AuthorResponse
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.08.28.610059v2
www.biorxiv.org www.biorxiv.org

Conduction pathway for potassium through the E. coli pump KdpFABC

4
1. Public_Reviews 09 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This manuscript revisits the well-studied KdpFABC potassium transport system from bacteria with a convincing set of new higher resolution structures, a protein expression strategy that permits purification of the active wildtype protein, and solid insight obtained from mutagenesis and activity assays. The thorough and thoughtful mechanistic analyses make this a valuable contribution to the membrane transport field.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The paper describes the high-resolution structure of KdpFABC, a bacterial pump regulating intracellular potassium concentrations. The pump consists of a subunit with an overall structure similar to that of a canonical potassium channel and a subunit with a structure similar to a canonical ATP-driven ion pump. The ions enter through the channel subunit and then traverse the subunit interface via a long channel that lies parallel to the membrane to enter the pump, followed by their release into the cytoplasm.
 
 The work builds on the previous structural and mechanistic studies from the authors' and other labs. While the overall architecture and mechanism have already been established, a detailed understanding was lacking. The study provides a 2.1 Å resolution structure of the E1-P state of the transport cycle, which precedes the transition to the E2 state, assumed to be the rate-limiting step. It clearly shows a single K+ ion in the selectivity filter of the channel and in the canonical ion binding site in the pump, resolving how ions bind to these key regions of the transporter. It also resolves the details of water molecules filling the tunnel that connects the subunits, suggesting that K+ ions move through the tunnel transiently without occupying well-defined binding sites. The authors further propose how the ions are released into the cytoplasm in the E2 state. The authors support the structural findings through mutagenesis and measurements of ATPase activity and ion transport by surface-supported membrane (SSM) electrophysiology.
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 By expressing protein in a strain that is unable to phosphorylate KdpFABC, the authors achieve structures of the active wildtype protein, capturing a new intermediate state, in which the terminal phosphoryl group of ATP has been transferred to a nearby Asp, and ADP remains covalently bound. The manuscript examines the coupling of potassium transport and ATP hydrolysis by a comprehensive set of mutants. The most interesting proposal revolves around the proposed binding site for K+ as it exits the channel near T75. Nearby mutations to charged residues cause interesting phenotypes, such as constitutive uncoupled ATPase activity, leading to a model in which lysine residues can occupy/compete with K+ for binding sites along the transport pathway.
 
 Strengths:
 
 The high resolution (2.1 Å) of the current structure is impressive, and allows many new densities in the potassium transport pathway to be resolved. The authors are judicious about assigning these as potassium ions or water molecules, and explain their structural interpretations clearly. In addition to the nice structural work, the mechanistic work is thorough. A series of thoughtful experiments involving ATP hydrolysis/transport coupling under various pH and potassium concentrations bolsters the structural interpretations and lends convincing support to the mechanistic proposal. The SSME experiments are generally rigorous.
 
 Weaknesses:
 
 The present SSME experiments do not support quantitative comparisons of different mutants, as in Figures 4D and 5E. Only qualitative inferences can be drawn among different mutant constructs.
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This study on potassium ion transport by the protein complex KdpFABC from E. coli reveals a 2.1 Å cryo-EM structure of the nanodisc-embedded transporter under turnover conditions. The results confirm that K+ ions pass through a previously identified tunnel that connects the channel-like subunit with the P-type ATPase-type subunit.
 
 Strengths:
 
 The excellent resolution of the structure and the thorough analysis of mutants using ATPase and ion transport measurements help to strengthen new and previous interpretations. The evidence supporting the conclusions is solid, including biochemical assays and analysis of mutants. The work will be of interest to the membrane transporter and channel communities and to microbiologists interested in osmoregulation and potassium homeostasis.
 
 Weaknesses:
 
 There is insufficient credit and citation of previous work.
 
 The manuscript has been thoroughly revised with special attention to acknowledging all past work relevant to the study.
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The paper describes the high-resolution structure of KdpFABC, a bacterial pump regulating intracellular potassium concentrations. The pump consists of a subunit with an overall structure similar to that of a canonical potassium channel and a subunit with a structure similar to a canonical ATP-driven ion pump. The ions enter through the channel subunit and then traverse the subunit interface via a long channel that lies parallel to the membrane to enter the pump, followed by their release into the cytoplasm.
 
 Strengths:
 
 The work builds on the previous structural and mechanistic studies from the authors' and other labs. While the overall architecture and mechanism have already been established, a detailed understanding was lacking. The study provides a 2.1 Å resolution structure of the E1-P state of the transport cycle, which precedes the transition to the E2 state, assumed to be the rate-limiting step. It clearly shows a single K+ ion in the selectivity filter of the channel and in the canonical ion binding site in the pump, resolving how ions bind to these key regions of the transporter. It also resolves the details of water molecules filling the tunnel that connects the subunits, suggesting that K+ ions move through the tunnel transiently without occupying well-defined binding sites. The authors further propose how the ions are released into the cytoplasm in the E2 state. The authors support the structural findings through mutagenesis and measurements of ATPase activity and ion transport by surface-supported membrane (SSM) electrophysiology.
 
 Weaknesses:
 
 While the results are overall compelling, several aspects of the work raised questions. First, the authors determined the structure of the pump in nanodiscs under turnover conditions and observed several structural classes, including E1-P, which is detailed in the paper. Two other structural classes were identified, including one corresponding to E2. It is unclear why they are not described in the paper. Notably, the paper considers in some detail what might occur during the E1-P to E2 state transition, but does not describe the 3.1 Å resolution map for the E2 state that has already been obtained. Does the map support the proposed structural changes?
 
 As was seen in previous work by Silberberg et al. (2022), imaging KdpFABC under turnover conditions can produce multiple enzymatic states. We focus on the E1~P state and associated biophysical analyses to provide a clear and concise story that is focused on the conduction pathway for K+ ions. We continue to work with the cryo-EM data as well as other supporting methodologies and datasets with the goal of producing an additional manuscript that will describe other conformations. The class of particles producing the 3.1 Å structure shown in Fig. 1 – figure suppl. 2 is heterogeneous and thus requires further classification to elucidate conformational changes, as is apparent from the downstream processing of the E1 classes also shown in that figure. We cannot therefore derive any conclusions about the configuration of side chains at the CBS based on this structure. Nevertheless, two previous structures of the E2.Pi state (7BGY and 7BH2, which were stabilized by MgF4 and BeFx, respectively) show the structural change that is described in the paragraph discussing D583A. Given the consistency and relatively high resolution (2.9 and 3.0 Å, respectively) of these two independent structures, we believe that they provide strong support for our proposal for Lys586 acting as a built-in counter ion.
 
 The paper relies on the quantitative activity comparisons between mutants measured using SSM electrophysiology. Such comparisons are notoriously tricky due to variability between SSM chips and reconstitution efficiencies. The authors should include raw traces for all experiments in the supplementary materials, explain how the replicates were performed, and describe the reproducibility of the results. Related to this point above, size exclusion chromatography profiles and reconstitution efficiencies for mutants should be shown to facilitate comparison between measured activities. For example, could it be that the inactive V496R mutant is misfolded and unstable?
 
 Similarly, are the reduced activities of V496W and V496H (and many other mutants) due to changes in the tunnel or poor biochemical properties of these variants? Without these data, the validity of the ion transport measurements is difficult to assess.
 
 To address this concern, we have generated a series of supplementary figures for Figs. 2, 4, 5, and 6, which show all of the raw traces underlying our SSME data (Figure 2 - figure supplements 2-4, Figure 4 - figure supplement 1, Figure 5 - figure supplement 3, Figure 6 - figure supplement 2). We have also included further detail about the experimental protocols, including number and type of replicates, in an expanded "Activity Assays" section of Methods.
 
 In addition, we have included SEC profiles for each of the V496 mutants, which show that they are all well behaved in detergent solution prior to reconstitution (Fig. 4 - figure supplement 1). We are not able to directly document reconstitution efficiencies as it is not practical to separate proteoliposomes from unincorporated protein prior to preparing the sensors used for SSME. Binding currents are seen for several of the inactive mutants (e.g., Q116R in Rb and NH4 in Fig. 2 - figure supplement 3 and V496R in Fig. 4 - figure supplement 1), which demonstrate that protein is indeed present in the corresponding proteoliposomes even though no sustained transport current is observed.
 
 The authors propose that the tunnel connecting the subunits is filled with water and lacks potassium ions. This is an important mechanistic point that has been debated in the field. It would be interesting to calculate the volume of the tunnel and estimate the number of ions that might be expected in it, given their concentration in bulk. It may also be helpful to provide additional discussion on whether some of the observed densities correspond to bound ions with low occupancy.
 
 As suggested, we calculated the internal volume of the tunnel within KdpA (from the S4 K+ site to the KdpA/KdpB subunit interface) based on the profile derived from Caver. Based on this volume (4.9 × 10⁻²⁵ L), a single K+ ion within this cavity would correspond to 3.4 M, which is near saturation for a solution of KCl. We added this information together with an acknowledgment of low-occupancy K+ to the fourth paragraph of the Discussion:
 
 " Fourth, based on the volume of the cavity in KdpA, a single K+ ion would correspond to a concentration of 3.4 M, suggesting that multiple ions would exceed the solubility limit especially in the absence of counterions. Finally, map densities within the tunnel were either of comparable strength or weaker than surrounding side chain atoms, unlike at S3 and canonical binding sites. Although it is possible that weaker density could represent low occupancy K+ ions, we favor a mechanism whereby individual K+ ions occupy the tunnel transiently as they transit between the selectivity filter and the canonical binding site."
 
 To make this analysis, we developed a Python script to calculate the volume of the tunnel as defined by the Caver software (this software is available via github.com/dls4n/tunnel). In turn, this enabled us to distinguish water molecules that were actually in the tunnel rather than bound more deeply within the structure of KdpA. As a result, we updated the water distribution plot in Fig. 4b. Notably, the 17 water molecules within this cavity would correspond to 57.8 M, which is reasonably near the expected 55 M for an aqueous solution.
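 The concentration estimates quoted in this response follow from dividing a particle count by Avogadro's number times the cavity volume. The sketch below only reproduces that conversion; it is not the authors' tunnel-volume script, and the function name is a hypothetical one chosen for illustration. The volume (4.9 × 10⁻²⁵ L) and the counts (1 K+ ion, 17 waters) are taken from the response.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def molar_concentration(n_particles, volume_litres):
    """Effective molar concentration of n particles confined to a volume."""
    return n_particles / (AVOGADRO * volume_litres)


TUNNEL_VOLUME_L = 4.9e-25  # cavity volume quoted in the response

k_conc = molar_concentration(1, TUNNEL_VOLUME_L)       # single K+ ion
water_conc = molar_concentration(17, TUNNEL_VOLUME_L)  # 17 water molecules

print(f"K+: {k_conc:.2f} M, water: {water_conc:.1f} M")
```

 With the unrounded single-ion value (about 3.39 M) the water estimate comes out near 57.6 M; the 57.8 M quoted in the response is what results from multiplying the rounded 3.4 M by 17. Either figure is close to the roughly 55 M expected for bulk water.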
 
 Reviewer #3 (Public review):
 
 Summary:
 
 By expressing protein in a strain that is unable to phosphorylate KdpFABC, the authors achieve structures of the active wild-type protein, capturing a new intermediate state, in which the terminal phosphoryl group of ATP has been transferred to a nearby Asp, and ADP remains covalently bound. The manuscript examines the coupling of potassium transport and ATP hydrolysis by a comprehensive set of mutants. The most interesting proposal revolves around the proposed binding site for K+ as it exits the channel near T75. Nearby mutations to charged residues cause interesting phenotypes, such as constitutive uncoupled ATPase activity, leading to a model in which lysine residues can occupy/compete with K+ for binding sites along the transport pathway.
 
 Strengths:
 
 Although this structure is not so different from previous structures, its high resolution (2.1 Å) is impressive and allows the resolution of many new densities in the potassium transport pathway. The authors are judicious about assigning these as potassium ions or water molecules, and explain their structural interpretations clearly. In addition to the nice structural work, the mechanistic work is thorough. A series of thoughtful experiments involving ATP hydrolysis/transport coupling under various pH and potassium concentrations bolsters the structural interpretations and lends convincing support to the mechanistic proposal.
 
 Weaknesses:
 
 The structures are supported by solid membrane electrophysiology. These data exhibit some weaknesses, including a lack of information to assess the rigor and reproducibility (i.e., the number of replicates, the number of sensors used, controls to assess proteoliposome reconstitution efficiency, and the stability of proteoliposome adsorption to the sensor).
 
 To address this concern, we have generated a series of supplementary figures for Figs. 2, 4, 5, and 6, which show all of the raw traces underlying our SSME data (Figure 2 - figure supplements 2-4, Figure 4 - figure supplement 1, Figure 5 - figure supplement 3, Figure 6 - figure supplement 2). We have also included further detail about the experimental protocols, including number and type of replicates, in the "Activity Assays" section of Methods.
 
 Reviewing Editor Comments
 
 After discussing the evaluations, the Reviewers and Reviewing Editor have identified the following essential revisions that would need to be addressed to improve the eLife assessment:
 
 (1) Work from others in the field should be adequately described and acknowledged:
 
 (a) Page 2: " A series of X-ray and cryo-EM structures of KdpFABC from E. coli have led to proposals of a novel transport mechanism befitting the unprecedented partnership of these two superfamilies within a single protein complex."
 
 The authors must give credit where credit is due (namely, the Haenelt/Paulino groups having discovered the transport pathway). Why don't they cite Stock et al., where this pathway was described first? The Stokes group proposed an entirely different pathway initially.
 
 Explicit reference to this work has been added as follows:
 
 “A series of X-ray and cryo-EM structures of KdpFABC from E. coli (Huang et al., 2017; Silberberg et al., 2022, 2021; Stock et al., 2018; Sweet et al., 2021) indicate a novel transport mechanism befitting the unprecedented partnership of these two superfamilies within a single protein complex. As first proposed by Stock et al. (Stock et al., 2018), there is now a consensus that K+ enters the complex from the extracellular side of the membrane through the selectivity filter of KdpA, but is blocked from crossing the membrane.”
 
 (b) Page 4 " As a result, many previous structures (Huang et al., 2017; Silberberg et al., 2021; Stock et al., 2018; Sweet et al., 2021) feature the S162A mutation to avoid inhibition rather than the fully WT protein used for the current work."
 
 This is not correct. At least the work by Huang et al 2017 and Stock et al 2021 was done without the mutation. This is why the structures also captured the off-cycle state when no E2 inhibitor was used. But in Silberberg et al 2022 the mutant was used, but this is not mentioned
 
 The Q116R mutant was used by Huang et al., but indeed not used for the Stock et al paper. We have replaced the sentence in the manuscript with the following:
 
 “Use of the KdpD knockout strain allowed us to produce WT and mutant protein free from Ser162 phosphorylation.”
 
 (c) Page 4: " In the paper, we report on the most highly populated state (44% of particles)". Exactly the same was also seen in detergent solution, which should be mentioned.
 
 Reference to the Silberberg 2022 paper, where E1~P was the most highly populated state, has been added. The percentage of particles was removed as we are still processing data from the other states, which we hope will be described in a future manuscript.
 
 (d) Page 7 "Asp583 and Lys586 are two conserved residues on M5 that have previously been shown......indicating that this particular mutation interfered with energy coupling." The lack of discussion of the Haenelt/Paulino 2021 paper, where they have analyzed the coupling in detail and described a proximal binding site where K+ is coordinated by D583 and the neighbouring Phe is very concerning.
 
 To correct this oversight, we made the following changes to the text:
 
 On pg. 7 in the Results section, we refer to the 2005 paper from Bramkamp & Altendorf:
 
 “Consistent with earlier work on this mutant (Bramkamp and Altendorf, 2005), the D583A mutant displayed substantial ATPase activity (30% of WT) but no transport, indicating that this particular mutation interfered with energy coupling.”
 
 At the end of pg. 10 in the Discussion, we revised the paragraph discussing D583 and Lys586 to explicitly refer to the mechanism of transport described in the 2021 paper from Silberberg et al, including proximal and distal binding sites as well as uncoupling due to the D583A mutation.
 
 “Similar to the Glu370/Arg493 charge pair in KdpA, Asp583 and Lys586 are the only charged residues in the membrane core of KdpB. Although they are not seen to interact directly in our structure, they coordinate accessory waters associated with the canonical binding site. Previous molecular dynamics simulations (Silberberg et al., 2021) indicate that Asp583 couples with Phe232 to form a “proximal binding site” for K+ ions. Based on these simulations, these authors proposed a mechanism whereby neutralization of this site either by ion binding or by D583A substitution served to stimulate ATPase activity. Indeed, earlier work on D583A (Bramkamp and Altendorf, 2005) as well as current data demonstrate uncoupling, in which K+ independent ATPase activity was observed even though transport was abolished. A plausible explanation for this stimulation is seen in the behavior of Lys586 in previous structures of the E2·Pi state (7BGY and 7BH2) (Sweet et al., 2021). In these structures, M5 undergoes a conformational change that pushes the side chain of Lys586 into the CBS. As a consequence of the D583A mutation, this Lys could be freed to act as a built-in counter ion as in related P-type ATPases ZntA (Wang et al., 2014) and AHA2 (Pedersen et al., 2007). In regard to the proximal binding site and the partnering “distal binding site” on the KdpA-side of the subunit interface, our structure does not show densities at either site and thus does not provide any support for the related mechanism. In any case, in the WT complex it seems likely that Asp583 exerts allosteric control over Lys586 and ensures that its movement into the binding site is coordinated with the transition from E1~P to E2·Pi, thus leading to displacement of K+ from the CBS and release to the cytoplasm.”
 
 (e) Page 8 " The intersubunit tunnel is arguably one of the most intriguing elements of the KdpFABC complex. Although it has been postulated to conduct K+, experimental evidence has been lacking. "
 
 Incorrect, see Silberberg 2021.
 
 On this point, we beg to differ. Although this 2021 paper shows densities in experimental cryo-EM maps and effects of mutations to residues at the KdpA and KdpB interface, the intra-tunnel transport mechanism is based on computational analysis (MD simulations) and not experimental evidence. We softened the statement to read as follows:
 
 “Although it has been postulated to conduct K+, direct experimental evidence has been hard to come by.”
 
 (f) In this context, F232 is also not mentioned anywhere in the text, although it is depicted in almost all figures.
 
 Phe232 is shown as a point of reference for the KdpA/KdpB subunit interface. We added a reference to Phe232 in the Results section labeled “Intersubunit tunnel” as well as the paragraph in the Discussion addressed in point d) above.
 
 " These densities, which we have modeled as water, are most prevalent near the vestibule, which is the wider part of the tunnel, but then disappear completely at the subunit interface near Phe232, which is the narrowest part of the tunnel and also distinctly hydrophobic (Fig. 4)."
 
 " Previous molecular dynamics simulations (Silberberg et al., 2021) indicate that Asp583 couples with Phe232 to form a “proximal binding site” for K+ ions."
 
 (g) Page 2 "Later, it was recognized that KdpA belongs to the Superfamily of K+ Transporters (SKT superfamily), which also includes bona fide K+ channels such as KcsA, TrkH and KtrB (Durell et al., 2000). "
 
 KcsA is not a member of the SKT superfamily.
 
 Thanks. This is correct, although the SKT superfamily is believed to have evolved from KcsA. KcsA has been removed from the sentence and a reference added to a review of the SKT superfamily:
 
 “which also includes bona fide K+ channels such as TrkH and KtrB (Diskowski et al., 2015; Durell et al., 2000).”
 
 (2) Two other structural classes were identified, including one corresponding to E2. It is unclear why they are not described in the paper. Notably, the paper considers in some detail what might occur during the E1-P to E2 state transition, but does not describe the 3.1 Å resolution map for the E2 state that has already been obtained. Does the map support the proposed structural changes?
 
 As was seen in previous work by Silberberg et al. (2022), imaging KdpFABC under turnover conditions can produce multiple enzymatic states. We focus on the E1~P state and associated biophysical analyses to provide a clear and concise story. We continue to work with the cryo-EM data as well as other supporting methodologies and datasets with the goal of producing an additional manuscript that will describe other conformations. The class of particles producing the 3.1 Å structure shown in Fig. 1 – figure suppl. 2 is heterogeneous and thus requires further classification to elucidate conformational changes, as is apparent from the downstream processing of the E1 classes also shown in that figure. We therefore cannot derive any conclusions about the configuration of side chains at the CBS based on this structure. Nevertheless, two previous structures of the E2·Pi state (7BGY and 7BH2, which were stabilized by MgF4 and BeFx, respectively) show the structural change that is described in the paragraph discussing D583A. Given the consistency and relatively high resolution (2.9 and 3.0 Å, respectively) of these two independent structures, we believe that they provide strong support for our proposal for Lys586 acting as a built-in counter ion.
 
 (3) The paper relies on the quantitative activity comparisons between mutants measured using SSM electrophysiology. Such comparisons are notoriously tricky due to variability between SSM chips and reconstitution efficiencies. The authors should include raw traces for all experiments in the supplementary materials, explain how the replicates were performed, and describe the reproducibility of the results.
 
 To address this concern, we have generated supplementary figures for Figs. 2, 4, 5, and 6, which show all of the raw traces underlying our SSME data (Figure 2 - figure supplements 2-4, Figure 4 - figure supplement 1, Figure 5 - figure supplement 3, Figure 6 - figure supplement 2). We have also added a detailed description of replicates, sensor stability and the experimental protocols in the "Activity Assays" section of Methods. In addition, we have highlighted observations of pre-steady state binding currents that were seen for some mutants (e.g., Q116R assayed with Rb+, NH4+ and Na+), in which an initial, transient current response was observed without an ensuing transport current. The depiction of these raw data has allowed us to explain our use of the current response at 1.25 s, after decay of this binding current, as a measure of transport rate. This approach is consistent with recommendations by the manufacturer, as documented in their 2023 publication (Bazzone et al. https://doi.org/10.3389/fphys.2023.1058583).
 
 (4) Related to this point above, size exclusion chromatography profiles and reconstitution efficiencies for mutants should be shown to facilitate comparison between measured activities. For example, could it be that the inactive V496R mutant is misfolded and unstable? Similarly, are the reduced activities of V496W and V496H (and many other mutants) due to changes in the tunnel or poor biochemical properties of these variants? Without these data, the validity of the ion transport measurements is difficult to assess.
 
 We have included SEC profiles for each of the V496 mutants, which show that they are all well behaved in detergent solution prior to reconstitution (Fig. 4 - figure supplement 1). We are not able to directly document reconstitution efficiencies as it is not practical to separate proteoliposomes from unincorporated protein prior to preparing the sensors used for SSME. Binding currents are seen for several of the inactive mutants (e.g., Q116R in Rb and NH4 in Fig. 2 - figure supplement 3 and V496R in Fig. 4 - figure supplement 1), which demonstrate that protein is indeed present in the corresponding proteoliposomes even though no sustained transport current is observed.
 
 (5) What are the different lines in Figure 1 - Supplement 1, panel G?
 
 This panel depicted a series of SSME traces as an example of the raw data, but has been removed from the revised version given the inclusion of all the raw traces. These new figures include a legend explaining the conditions for each trace.
 
 (6) How was the 44 % population of the single-occupancy E1 state estimated (it does not correspond to the number of particles in Figure 1 - Supplement 2)?
 
 The calculation of 44% for the E1~P state was premature, given that we are still analyzing the data from the turnover conditions. The revised manuscript simply states that E1~P represented the largest population of particles, which is consistent with this state preceding the rate limiting step of the Post-Albers cycle. Reference is made to the Silberberg 2022 paper, which made a similar observation in a detergent-solubilized sample.
 
 (7) The text states that Km for Q116E is "<10 uM". However, the fitted value is 90 µM in Figure 2e.
 
 This was a typographical error. The text now states that Km for Q116E is <100 µM.
 
 (8) The Km values for Rb, NH4, and Na in Figures 2g and h, and Na in Figure 2i do not make sense. They should be removed.
 
 The values for Km were determined by fitting the Michaelis-Menten equation to the data as detailed in the Methods section. Although the curves visually appear rather flat relative to other ions, the fits generated respectable confidence limits and are therefore defensible in a statistical context. Furthermore, the curves that are shown are based on those values of Km and it would be inappropriate not to cite them.
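 For readers unfamiliar with this kind of fit, the following is a minimal sketch of a least-squares fit of the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]). It is not the authors' procedure, which is described in their Methods and not reproduced here. The function names, the log-spaced grid search over Km, and the synthetic noise-free data are all assumptions made for illustration; the value of 90 µM is borrowed from the Q116E fit mentioned above only as a convenient test case. Confidence limits are not computed.

```python
# Minimal sketch: least-squares Michaelis-Menten fit (hypothetical helper names).
# For any fixed Km the model is linear in Vmax, so Vmax has a closed form and
# only Km needs a one-dimensional search (done here on a refined log grid).

def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)

def _best_vmax(km, conc, rates):
    """Least-squares Vmax for a fixed Km."""
    f = [s / (km + s) for s in conc]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)

def _sse(km, conc, rates):
    """Sum of squared residuals at a fixed Km with its optimal Vmax."""
    vmax = _best_vmax(km, conc, rates)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(conc, rates))

def fit_michaelis_menten(conc, rates, log_km_lo=-3.0, log_km_hi=5.0,
                         rounds=4, points=200):
    """Return (vmax, km). The search bounds are log10(Km) in the units of conc."""
    lo, hi = log_km_lo, log_km_hi
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + step * i for i in range(points)]
        best = min(grid, key=lambda g: _sse(10 ** g, conc, rates))
        lo, hi = best - step, best + step   # zoom in around the best grid point
    km = 10 ** best
    return _best_vmax(km, conc, rates), km

if __name__ == "__main__":
    # Synthetic, noise-free data (illustrative only; not measured currents).
    conc = [1, 3, 10, 30, 100, 300, 1000]             # e.g. µM
    rates = [mm_rate(s, 1.0, 90.0) for s in conc]     # Vmax = 1, Km = 90
    vmax, km = fit_michaelis_menten(conc, rates)
    print(f"Vmax = {vmax:.3f}, Km = {km:.1f}")
```

 A nearly flat curve over the sampled range means the residuals change little as Km varies, which is why reporting confidence limits alongside such fits matters.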
 
 (9) Figure 3 would benefit from a slice through the protein to orient the viewer.
 
 Thanks for the suggestion. We have added panels to Figs. 3, 5 and 6 in an effort to orient the reader to the site that is depicted.
 
 (10) The differences between R493E, Q, and M do not appear to be significant.
 
 The y-axis is logarithmic, which makes a visual comparison difficult. To alleviate this, P values were calculated by one-way ANOVA and the results are indicated in Fig. 3c and 3d. They show that all of the Arg493 mutations have Km significantly higher than WT. Differences between R493E and R493Q and between R493Q and R493M are not significant at the p<0.01 level, while the difference between R493E and R493M is highly significant (p<0.001). The associated text on pg. 6 has been slightly modified as follows:
 
 “Changes to Arg493 generally increase Km (lower apparent affinity) without affecting Vmax, with Met substitution having greater effect than charge reversal (R493E).”
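 As background for the one-way ANOVA mentioned in this response, the sketch below computes the F statistic from first principles. It is an illustration only: the function name and the toy numbers are hypothetical and are not the Km replicates from Fig. 3. It does not compute P values, which require the F distribution, and it does not perform the pairwise comparisons, whose method is not stated in this response.

```python
# Minimal sketch: one-way ANOVA F statistic (hypothetical function name).

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

if __name__ == "__main__":
    # Toy data for illustration only (not measurements from the manuscript).
    toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(toy))
```

 A large F indicates that the group means differ by more than the within-group scatter would suggest; identifying which pairs differ needs a separate post hoc comparison.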
 
 (11) Page 5, paragraph 2. Q116R and G232D don't seem like the world's most intuitive mutations. It appears there is a historical reason for looking at these. Could the rationale be explained in the text? (Why R and D specifically?)
 
 These mutations have historical significance, having been generated by random mutagenesis during early characterization of the Kdp system by Epstein and colleagues. A sentence containing relevant references has been added to this paragraph to provide this context:
 
 “Specifically, Q116R and G232D substitutions were initially discovered by random mutagenesis during early characterization of the Kdp system (Buurman et al., 1995; Epstein et al., 1978) and have featured in many follow-up studies (Dorus et al., 2001; Schrader et al., 2000; Silberberg et al., 2021; Sweet et al., 2020; van der Laan et al., 2002).”
 
 Below are the recommendations from each of the reviewers, some of which were not included as essential revisions, but that can also be helpful to further strengthen the manuscript.
 
 Reviewer #1 (Recommendations for the authors):
 
 It is essential that the authors correct their selective, incomplete, and in places inappropriate references to work from others in the field.
 
 Specific points:
 
 (1) Page 2: " A series of X-ray and cryo-EM structures of KdpFABC from E. coli have led to proposals of a novel transport mechanism befitting the unprecedented partnership of these two superfamilies within a single protein complex."
 
 The authors must give credit where credit is due (namely, the Haenelt/Paulino groups having discovered the transport pathway). Why don't they cite Stock et al., where this pathway was described first? The Stokes group proposed an entirely different pathway initially.
 
 (2) Page 4 " As a result, many previous structures (Huang et al., 2017; Silberberg et al., 2021; Stock et al., 2018; Sweet et al., 2021) feature the S162A mutation to avoid inhibition rather than the fully WT protein used for the current work."
 
 This is not correct. At least the work by Huang et al 2017 and Stock et al 2021 was done without the mutation. This is why the structures also captured the off-cycle state when no E2 inhibitor was used. But in Silberberg et al 2022 the mutant was used, but this is not mentioned
 
 (3) Page 4: " In the paper, we report on the most highly populated state (44% of particles)". Exactly the same was also seen in detergent solution, which should be mentioned.
 
 (4) Page 7 "Asp583 and Lys586 are two conserved residues on M5 that have previously been shown......indicating that this particular mutation interfered with energy coupling." The lack of discussion of the Haenelt/Paulino 2021 paper, where they have analyzed the coupling in detail and described a proximal binding site where K+ is coordinated by D583 and the neighbouring Phe is very concerning.
 
 (5) Page 8 " The intersubunit tunnel is arguably one of the most intriguing elements of the KdpFABC complex. Although it has been postulated to conduct K+, experimental evidence has been lacking. "
 
 Incorrect, see Silberberg 2021.
 
 (6) In this context, F232 is also not mentioned anywhere in the text, although it is depicted in almost all figures.
 
 References have been added to address all of these points. See item 1) under Reviewing Editor’s Comments above.
 
 Other points:
 
 (7) Page 2 "Later, it was recognized that KdpA belongs to the Superfamily of K+ Transporters (SKT superfamily), which also includes bona fide K+ channels such as KcsA, TrkH and KtrB (Durell et al., 2000). "
 
 KcsA is not a member of the SKT superfamily.
 
 KcsA has been removed from the sentence and a reference added to a review of the SKT family:
 
 “which also includes bona fide K+ channels such as TrkH and KtrB (Diskowski et al., 2015; Durell et al., 2000).”
 
 (8) Page 9 " Our demonstration of coupled transport of NH4+ and Rb+ G232D not only confirms that the selectivity filter governs ion selection, but that the pump subunit, KdpB, is relatively promiscuous." Check grammar.
 
 This sentence has been updated as follows:
 
 “Our observation that G232D is capable of coupled transport for NH4+ confirms not only that the selectivity filter governs ion selection, but that the pump subunit, KdpB, is relatively promiscuous.”
 
 Reviewer #2 (Recommendations for the authors):
 
 (1) From an editorial point of view, I suggest a few changes to enhance readability and clarity for non-specialists. A description of the overall transport cycle at the start of the paper (perhaps as a supplementary figure) could help put the work into perspective for general readers who may not be familiar with P-type ATPase mechanisms. It is unclear what "single" and "double" occupancy refer to in the structural classes description. Why is only one structural class described in detail? I would suggest moving the discussion of what is going on with the N-terminus of KdpB to the Results section, where it is described, and shortening the corresponding paragraph in the Discussion. I would furthermore suggest adding a figure that illustrates the proposed regulatory role of the terminus and how phosphorylation might affect it. Otherwise, this section of the results reads very hollow.
 
 A diagram showing the Post-Albers cycle is shown as part of Fig. 1 and is described at the end of the second paragraph. This sentence only mentioned KdpB, which may have caused confusion. We therefore changed the sentence to read as follows:
 
 “Like other P-type ATPases, KdpFABC employs the Post-Albers reaction cycle (Fig. 1) involving two main conformations (E1 and E2) and their phosphorylated states (E1~P and E2-P) to drive transport (Albers, 1967; Post et al., 1969).”
 
 Single and double occupancy was meant to refer to the number of KdpFABC complexes residing in a nanodisc. This can be seen in the class averages in Fig. 1 - figure supplement 2. The legends to Fig. 1 figure supplements 1 and 2 have been revised to explain this observation more explicitly:
 
 "Slight asymmetry of the main peak is consistent with a subpopulation of nanodiscs containing two KdpFABC complexes (Fig. 1 - figure supplement 2)."
 
 and
 
 "A subset of these particles were further classified to generate four main classes representing nanodiscs with a single copy of KdpFABC in either E1 or E2 conformations, nanodiscs with two copies of KdpFABC which were mainly E1 conformation, and junk."
 
 As stated above, the class of particles producing the 3.1 Å structure shown in Fig. 1 – figure suppl. 2 is heterogeneous and requires further classification to elucidate conformational changes, as is apparent from the downstream processing of the E1 classes also shown in that figure. We continue to analyze the cryo-EM data and aim to produce a second manuscript that will include descriptions of other conformations together with the additional biophysical analysis related to their function.
 
 With regard to the N-terminus, we have gone on to generate a truncation of residues 2-9 in KdpB. After expression and purification, this construct remained coupled with ATPase and transport activities similar to WT, which makes proposals of a regulatory effect less compelling. Because of the novelty of observing the N-terminus and the possibility that it plays a subtle role in the kinetics of the cycle not revealed under the current assay conditions, we have retained a brief discussion of this structural observation, but moved it into the Results section as suggested.
 
 "Given the regulatory roles played by N- and C-termini of a variety of other P-type ATPases (Bitter et al., 2022; Cali et al., 2017; Lev et al., 2023; Timcenko et al., 2019; Zhao et al., 2021), we generated a construct in which residues 2-9 of the N-terminus of KdpB were truncated. However, ATPase and transport activities remained coupled at levels similar to WT, indicating that any functional role of the N-terminus is relatively subtle and not manifested under current assay conditions."
 
 (2) The wording "exceedingly strong densities" seems ambiguous.
 
 We have changed this to “strong” in the Abstract and "exceptionally strong" in the Discussion. The precise values for these densities are shown in density histograms in Fig. 2 – figure supplement 1 and Fig. 5 – figure supplement 2. In the text, the densities are described as follows:
 
 Results sections describing the selectivity filter:
 
 "In fact, this S3 site contains the strongest densities in the entire map, measuring 7.9x higher than the threshold used for Fig. 2a (Fig. 2 – figure suppl. 1a)."
 
 Results section describing the CBS:
 
 "Given that this is the strongest density in KdpB, measuring 5.6x higher than the map densities shown in Fig. 5 (Fig. 5 – figure suppl 2b), we have modeled it as K+."
 
 (3) What are the different lines in Figure 1 - Supplement 1, panel G?
 
 This panel depicted a series of SSME traces as an example of the raw data, but has been removed from the revised version given the inclusion of all the raw traces. These new figures include a legend explaining the conditions for each trace.
 
 (4) How was the 44 % population of the single-occupancy E1 state estimated (it does not correspond to the number of particles in Figure 1 - Supplement 2)?
 
 The calculation of 44% for the E1~P state was premature, given that we are still analyzing the data from the turnover conditions. We will consider citing an updated value in a future publication once this analysis is complete. The revised manuscript simply states that E1~P represented the largest population of particles, which is consistent with this state preceding the rate limiting step of the Post-Albers cycle. Reference was made to the Silberberg 2022 paper, where a similar observation was made.
 
 (5) Panel 1d is called out of order after panel 1e. Please label Ser 162 in the panel.
 
 The order of these panels has been switched and Ser162 has been labelled as suggested.
 
 (6) Several panels in Figure 1- Supplement 1 are neither referenced nor described.
 
 This figure supplement is referred to multiple times in the Results and the Methods sections of the text as well as in the figure legends. Although each panel is not individually referenced, all of this information is relevant at different points in the manuscript and is explained in the legend.
 
 (7) Is the coordinating geometry for the S3 site consistent with what was previously observed for KcsA and relatives?
 
 The general arrangement of carbonyl atoms in the S3 site is the same in KcsA and KdpA, described by the MacKinnon group as a square antiprism. However, KcsA has strict four-fold symmetry and KdpA does not. As a result, there are small discrepancies between the coordinating geometries in the two structures. This point was made graphically in our original report on the X-ray structure of KdpFABC (Huang et al. 2017, Extended Data Fig. 3), though the positions of the carbonyls are more accurately determined in the current structure due to increased resolution. We added a sentence to the Selectivity Filter section of the Results stating the following:
 
 "This coordination geometry is also consistent with that seen in the K+ channel KcsA, though the strict four-fold symmetry of that homo-tetramer produces a more regular structure, as indicated by the smaller variance in liganding distance (2.77 Å with s.d. 0.075 Å in 1K4C) and as depicted by Huang et al. in Extended Data Fig. 3 (Huang et al., 2017)."
 
 (8) Label G232D in Figure 2a.
 
 G232 is out of the plane shown in Fig. 2a. However, we have added a label for Cys344 to help identify the selectivity filter strands that are shown. Note, however, that G232 is visible and labeled in Fig. 2 - figure suppl. 1. This has now been noted in the legend for Fig. 2.
 
 (9) The text states that Km for Q116E is "<10 uM". However, the fitted value is 90 µM in Figure 2e.
 
 This was a typographical error. The text now states that Km for Q116E is <100 µM.
 
 (10) The Km values for Rb, NH4, and Na in Figures 2g and h, and Na in Figure 2i do not make sense. They should be removed.
 
 The values for Km were determined by fitting the Michaelis-Menten equation to the data as detailed in the Methods section. Although the curves visually appear rather flat relative to other ions, the fits generated respectable confidence limits and are therefore defensible in a statistical context. Furthermore, the curves that are shown are based on those values of Km and it would be inappropriate not to cite them.
 
 (11) Figure 3 would benefit from a slice through the protein to orient the viewer.
 
 Thank you for the suggestion. We have added panels to Figs. 3, 5 and 6 in an effort to orient the reader to the site that is depicted.
 
 (12) The differences between R493E, Q, and M do not appear to be significant.
 
 The y-axis is logarithmic, which makes a visual comparison difficult. To alleviate this, P values were calculated by one-way ANOVA and the results are indicated in Fig. 3c and 3d. They show that all of the Arg493 mutations have Km significantly higher than WT. Differences between R493E and R493Q and between R493Q and R493M are not significant at the p<0.01 level, while the difference between R493E and R493M is highly significant (p<0.001). The associated text on pg. 6 has been slightly modified as follows:
 
 “Changes to Arg493 generally increase Km (lower apparent affinity) without affecting Vmax, with Met substitution having greater effect than charge reversal (R493E).”
 
 Reviewer #3 (Recommendations for the authors):
 
 Overall, the text was very clear, experiments were rationalized well, and conclusions were justified. A few small comments:
 
 (1) Page 5, paragraph 2. Q116R and G232D don't seem like the world's most intuitive mutations. It appears there is a historical reason for looking at these. Could the rationale be explained in the text? (Why R and D specifically?)
 
 These mutations are of historical importance, having been generated by random mutagenesis during early characterization of the Kdp system. A sentence containing relevant references has been added to this paragraph to provide this information as context:
 
 “Specifically, Q116R and G232D substitutions were initially discovered by random mutagenesis during early characterization of the Kdp system (Buurman et al., 1995; Epstein et al., 1978) and have featured in many follow-up studies (Dorus et al., 2001; Schrader et al., 2000; Silberberg et al., 2021; Sweet et al., 2020; van der Laan et al., 2002).”
 
 (2) Typo: page 14, "diluted"
 
 This typo has been corrected.
 
 (3) The Methods section for SSM electrophysiology could use some additional description of how the data/statistics were collected. How many replicates? Were all replicates from a single sensor, or were multiple sensors examined? Were controls done to test whether the same number of liposomes remain adsorbed to the sensor over the length of the experiment?
 
 We have extended our description of experimental protocols in the "Activity Assays" section of Methods. This includes the number and type of replicates as well as a discussion of binding currents that were seen for some mutants. Furthermore, a new series of supplementary figures for Figs. 2, 4, 5, and 6 show all of the raw traces for the SSME measurements (Figure 2 - figure supplements 2-4, Figure 4 - figure supplement 1, Figure 5 - figure supplement 3, Figure 6 - figure supplement 2).
 
 We have included SEC profiles for each of the V496 mutants, which show that they are all well behaved in detergent solution prior to reconstitution (Fig. 4 - figure supplement 1). We are not able to directly document reconstitution efficiencies as it is not practical to separate proteoliposomes from unincorporated protein prior to preparing the sensors used for SSME. Binding currents are seen for several of the inactive mutants (e.g., Q116R in Rb and NH4 in Fig. 2 - figure supplement 3 and V496R in Fig. 4 - figure supplement 1), which demonstrate that protein is indeed present in the corresponding proteoliposomes even though no sustained transport current is observed.
 
 AuthorResponse
Tags

Summary

Review 1

AuthorResponse

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.05.05.652293v2
www.biorxiv.org www.biorxiv.org

Center-surround inhibition by expectation: a neuro-computational account

4
1. Public_Reviews 09 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This is a methodologically rich manuscript that is important for revealing the center-surround inhibition profile of expectation in orientation space. The analyses are compelling in validating the critical role of predictive coding feedback. The findings provide novel insights into how expectation optimizes perception via enhancement and suppression.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The authors tested two competing mechanisms of expectation: (1) a sharpening model that suppresses unexpected information via center-surround inhibition, and (2) a cancellation model that predicts a monotonic gradient response profile. Using two psychophysical experiments manipulating feature space distance between expected and unexpected stimuli, the results consistently supported the sharpening model. Computational modeling further showed that expectation effects were explained by either sharpened tuning curves or tuning shifts. Finally, convolutional neural network simulations revealed that feedback connections critically mediate the observed center-surround inhibition.
 
 Strengths:
 
 The manuscript provides compelling and convergent evidence from both psychophysical experiments and computational modeling to robustly support the sharpening model of expectation, demonstrating clear center-surround inhibition of unexpected information.
 
 Comments on revisions:
 
 I appreciate the authors' thoughtful revisions. I have no further comments.
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 This is a compelling and methodologically rich manuscript. The authors used a variety of methods, including psychophysics, computational modeling, and artificial neural networks, to reveal a non-monotonic, center-surround "Mexican-hat" profile of expectation in orientation space. Their data convincingly extend analogous findings in attention and working memory, and the modeling nicely teases apart sharpening vs. shift mechanisms.
 
 Strengths:
 
 The findings are novel and important in elucidating the potential neural mechanisms by which expectation shapes perception. The authors conducted a series of well-designed psychophysical experiments to carefully examine the profile of expectation's modulation. Computational modeling also provides further insights, linking the neural mechanisms of expectation to behavioral results.
 
 Comments on revisions:
 
 I think the authors did a great job in addressing my previous comments. I have no further comments.
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the original reviews.
 
 Reviewer #2 (Public review):
 
 (1) The sharpening model of expectation can predict surround suppression. The authors could further clarify how the cancellation model predicts a monotonic profile of expectation (Figure 1C) with the highest response at the expected orientation, while the cancellation model suggests a suppression of neurons tuned toward the expected stimulus.
 
 We thank the reviewer for the comment. We would like to emphasize that as the expected signal is suppressed, the relative weight or salience of unexpected inputs increases. We have clarified this interpretation in the manuscript as follows:
 
 “Here, given these two mechanisms making opposite predictions about how expectation changes the neural responses of unexpected stimuli, thereby displaying different profiles of expectation, we speculated that if expectation operates by the sharpening model with suppressing unexpected information, we should observe an inhibitory zone surrounding the focus of expectation, and its profile then should display as a center-surround inhibition (Fig. 1c, left). If, however, expectation operates as suggested by the cancelation model with highlighting unexpected information, the inhibitory zone surrounding the focus of expectation should be eliminated, and the profile should instead display a monotonic gradient (Fig. 1c, right).”
 
 (2) I'm a bit concerned about whether the profile solely arises from modulation of expectation. The two auditory cues are each associated with a fixed orientation, which may be confounded by other cognitive processes like visual working memory or attention (which I think the authors also discussed). Although the authors tried to use the SFD task to render orientation task-irrelevant, luminance edges (i.e., orientation) and spatial frequency in gratings are highly intertwined and orientation of the gratings may help recall the first grating's SF (fixed at 0.9 c/°), especially given the first and second grating's orientations are not very different (4.8°).
 
 We agree that dissociating expectation from attention and other top-down processes remains a key challenge in visual expectation research (see Summerfield & Egner, 2009; Summerfield & de Lange, 2014; de Lange et al., 2018). As is generally acknowledged, expectation reflects the probability of a sensory event, while selective attention relates to its behavioral relevance. To minimize attentional influences, our task design ensured that grating orientation was not task-relevant: on each trial, participants discriminated either orientation or spatial frequency difference, such that orientation itself did not require attentional allocation, a point already discussed in the manuscript.
 
 Regarding visual working memory, we argue that even if participants recalled the first grating’s spatial frequency in the SFD task, they were not required to retain its precise spatial frequency (or orientation), as their task was simply to judge whether the second grating appeared denser or sparser. In other words, orientation (or spatial frequency) itself was not task-relevant. Moreover, although not included in the manuscript, we conducted a post-experiment debriefing in which participants were asked whether they noticed any association between the auditory tone and the grating orientation. None of the participants reported this relationship correctly, suggesting that the tone-orientation mapping remained implicit and was unlikely to be driven by strategic attention or memory.
 
 However, we acknowledge that certain confounding processes such as statistical learning or implicit mapping acquisition cannot be fully ruled out given the current paradigm. Future studies using methods with higher temporal resolution (e.g., EEG/MEG) may help to dissociate these mechanisms more precisely.
 
 (3) For each of the expected orientations (20° or 70°), the unexpected ones are linearly separable (i.e., all unexpected ones lie on one side of the expected angle). This might further encourage people to shift their attended or expected orientation, according to the optimal tuning hypothesis. Would this provide an alternative explanation to the tuning shift that the authors found?
 
 We thank the reviewer for pointing out the relevance of the optimal tuning hypothesis. We acknowledge that the optimal tuning theory (Navalpakkam & Itti, 2007) is an important framework, particularly in visual search paradigms, where attentional templates may shift away from non-target features to enhance discriminability.
 
 In our task, this hypothesis would predict a shift of expectation toward <20° in E20° trials and >70° in E70° trials, given that all unexpected orientations lie on one side of the expected angle. Importantly, the optimal tuning hypothesis predicts such shifts not only in Δ20°, Δ25°, and Δ30° trials but also in the Δ0° trials. In this regard, the observed shift in Δ20° and Δ30° (Experiment 2) and Δ25° (Experiment 3) trials is broadly consistent with the predictions of the optimal tuning account. However, we did not observe a corresponding shift away from nontarget features in the Δ0° condition, suggesting limited behavioral evidence for optimal tuning effects under our current task settings.
 
 It is important to note that most previous studies supporting optimal tuning (e.g., Navalpakkam & Itti, 2007; Scolari & Serences, 2009; Geng, DiQuattro, & Helm, 2017; Yu & Geng, 2019) have used visual search paradigms that differ from our design in several critical ways, including the number of stimuli presented, their spatial arrangement (eccentricity), task demands, and so on. Therefore, it is difficult to determine whether the optimal tuning hypothesis could serve as an alternative explanation within the context of our current study. We agree that future studies could further examine how such task parameters influence the presence or absence of optimal tuning.
 
 (4) It is great that the authors conducted computational modeling to elucidate the potential neuronal mechanisms of expectation. But I think the sharpening hypothesis (e.g., reviewed in de Lange, Heilbron & Kok, 2018) focuses on the neural population level, i.e., narrowing of population tuning profile, while the authors conducted the sharpening at the neuronal tuning level. However, the sharpening of population does not necessarily rely on the sharpening of individual neuronal tuning. For example, neuronal gain modulation can also account for such population sharpening. I think similar logic applies to the orientation adjustment experiment. The behavioral level shift does not necessarily suggest a similar shift at the neuronal level. I would recommend that the authors comment on this.
 
 We thank the reviewer for this to-the-point comment. As de Lange et al. (2018) noted, “there is not always a direct correspondence between neural-level and voxel-level selectivity patterns.” That is, neuronal tuning, population-level tuning, voxel-level selectivity, and behavioral adaptive outcomes may reflect different underlying mechanisms and do not necessarily align in a one-to-one fashion. We fully acknowledge that population-level tuning effects may also result from various neuronal mechanisms such as gain modulation (for review, see Salinas & Thier, 2000), shifts in preferred orientation (Ringach et al., 1997; Jeyabalaratnam et al., 2013), asymmetric broadening of tuning curves (Schumacher et al., 2022), or tuning curve sharpening (Ringach et al., 1997; Schoups et al., 2001).
 
 In our modeling, we implemented sharpening and shifts of neuronal tuning curves as a conceptual model simplification, intended to explore potential mechanisms underlying expectation-related center-surround suppression effects. While sharpening-based accounts (e.g., Kok et al. 2012) have often been emphasized, we stress that other mechanisms, such as gain modulation or tuning shifts, may also contribute. Our goal is not to provide a definitive account, but to highlight such plausible mechanisms and encourage future investigation. We have revised the Discussion to emphasize that multiple mechanisms may underlie the observed effects.
 
 “We note that our implementation of sharpening and shifts at the neuronal level serves as a conceptual model simplification, as population-level tuning, voxel-level selectivity, and behavioral adaptive outcomes may reflect different underlying neuronal mechanisms and do not necessarily align in a one-to-one fashion. Here, we stress that other potential mechanisms beyond sharpening, such as tuning shifts, may also contribute to visual expectation.”
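As an illustration of the two neuronal-level manipulations discussed above, the following is a minimal sketch, not the authors' model. It assumes a Gaussian tuning function over orientation and ignores the circularity of orientation space; the function name `tuning` and all parameter values (preferred orientation, amplitude, width) are hypothetical and chosen only for illustration.

```python
import math

def tuning(theta, pref, amplitude=1.0, width=15.0):
    """Gaussian tuning curve over orientation in degrees (assumed form)."""
    return amplitude * math.exp(-0.5 * ((theta - pref) / width) ** 2)

def baseline(theta):
    # Hypothetical channel preferring 20 degrees.
    return tuning(theta, pref=20.0)

def sharpened(theta):
    # Sharpening: higher amplitude and narrower width, same preferred orientation.
    return tuning(theta, pref=20.0, amplitude=1.2, width=10.0)

def shifted(theta):
    # Shift: same shape, preferred orientation moved by 5 degrees.
    return tuning(theta, pref=25.0)

for theta in (20.0, 25.0, 40.0):
    print(theta, round(baseline(theta), 3), round(sharpened(theta), 3), round(shifted(theta), 3))
```

Under these assumed values, the sharpened channel responds more strongly than baseline at its preferred orientation and less strongly on the flank, which is the kind of enhancement-plus-suppression pattern a center-surround profile requires; the shifted channel keeps its shape but peaks at a different orientation.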
 
 (5) If the orientation adjustment experiment suggests that both sharpening and shifting are present at the same time, have the authors tried combining both in their computational model?
 
 We agree with the reviewer that it is necessary to consider the combined model. Accordingly, we implemented a computational model incorporating sharpening of the expected orientation channel together with shifting of the unexpected orientation channels. This model successfully captured the sharpening of the expected-orientation channel and the shift of the unexpected-orientation channels (Supplementary Fig. 3). For the expected orientation (Δ0°), results showed that the amplitude change was significantly higher than zero on both OD (t(23) = 2.582, p = 0.017, Cohen’s d = 0.527) and SFD (t(23) = 2.078, p = 0.049, Cohen’s d = 0.424) tasks (Supplementary Fig. 3e, vertical stripes); the width change was significantly lower than zero on both OD (t(23) = -2.438, p = 0.023, Cohen’s d = 0.498) and SFD (t(23) = -2.578, p = 0.017, Cohen’s d = 0.526) tasks (Supplementary Fig. 3e, diagonal stripes). For unexpected orientations (Δ10°-Δ40°), however, the amplitude and width changes were not significantly different from zero on either OD (amplitude change: t(23) = 0.443, p = 0.662, Cohen’s d = 0.091; width change: t(23) = -1.819, p = 0.082, Cohen’s d = 0.371) or SFD (amplitude change: t(23) = 1.130, p = 0.270, Cohen’s d = 0.231; width change: t(23) = -1.710, p = 0.101, Cohen’s d = 0.349) tasks (Supplementary Fig. 3f). In the meantime, the location shift was significantly different from zero for unexpected orientations (Δ10°-Δ40°, OD task: t(23) = 3.611, p = 0.001, Cohen’s d = 0.737; SFD task: t(23) = 2.418, p = 0.024, Cohen’s d = 0.493; Supplementary Fig. 3g). These results provided further evidence that tuning sharpening and tuning shift jointly contribute to center-surround inhibition in expectation.
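The reported effect sizes can be cross-checked against the reported t statistics. The sketch below assumes, without confirmation from the response, that each test is a one-sample t-test against zero with 24 participants (df = 23), in which case Cohen's d equals |t| divided by the square root of n. The helper name `cohens_d_one_sample` is hypothetical.

```python
import math

def cohens_d_one_sample(t_stat: float, n: int) -> float:
    """Cohen's d for a one-sample t-test: |t| / sqrt(n)."""
    return abs(t_stat) / math.sqrt(n)

N = 24  # assumption: df = 23 corresponds to 24 participants

# (t statistic, Cohen's d) pairs as quoted in the response above.
reported = [
    (2.582, 0.527), (2.078, 0.424), (-2.438, 0.498), (-2.578, 0.526),
    (0.443, 0.091), (-1.819, 0.371), (1.130, 0.231), (-1.710, 0.349),
    (3.611, 0.737), (2.418, 0.493),
]

# Largest gap between recomputed and reported d across all ten tests.
max_abs_diff = max(abs(cohens_d_one_sample(t, N) - d) for t, d in reported)

for t, d in reported:
    print(f"t = {t:7.3f}  recomputed d = {cohens_d_one_sample(t, N):.3f}  reported d = {d:.3f}")
```

Because the t values are themselves rounded to three decimals, the recomputed values can differ from the reported ones in the last digit; under the stated assumption they otherwise agree.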
 
 Reviewer #1 (Recommendations for the authors):
 
 (1) A direct comparison between tasks (baseline vs. expectation conditions) would have strengthened the findings. Specifically, contrasting performance in the orientation discrimination task with the spatial frequency discrimination task could have provided clearer evidence that participants actually used the auditory cues to attend to the expected orientation. This comparison would be particularly important for validating cue manipulation in the orientation discrimination task.
 
 We agree that a direct comparison between the orientation discrimination (OD) and spatial frequency discrimination (SFD) tasks could further clarify how expectation (auditory cues) differentially modulates orientation relevance. However, the primary goal of the current study was to examine expectation effects within each task separately and to demonstrate that such effects are independent of attentional modulation driven by the task-relevance of orientation.
 
 In addition, the OD and SFD tasks differ not only in the relevant task features (orientation vs. spatial frequency discrimination), but also in stimulus properties and difficulty (for example, the arbitrary use of 20–70° as the orientation range and ~0.9 cycles/° as the spatial frequency setting), so a direct comparison could introduce confounding factors unrelated to expectation.
 
 Importantly, previous studies (e.g., Kok et al., 2012, 2017; Aitken et al., 2020) and our current results show that participants performed significantly better when the auditory cue matched the expected orientation, supporting the validity of our expectation manipulation.
 
 (2) An interesting consideration is why the center-surround inhibition profile of expectation was independent of the task-relevance of orientation. Previous studies (e.g., Kok et al., 2012) have found that orientation discrimination patterns differ depending on whether orientation is task-relevant or irrelevant. It could be useful to discuss the possible discrepancies.
 
 We thank the reviewer for this inspiring comment. Kok et al. (2012) showed that both orientation and contrast tasks elicited similar fMRI decoding results, regardless of task relevance, suggesting neural mechanisms of expectation operate independently of whether orientation is task-relevant. Behaviorally, they reported better performance for expected versus unexpected trials in the orientation task (3.4° vs. 3.8°, t(17) = 2.8, p = 0.013), and a marginal trend (although not significant) in the contrast task (4.3% vs. 5.0%, t(17) = 1.9, p = 0.075). If any differences between the two tasks exist, they may lie in the correlation between behavioral and fMRI effects, a question that goes beyond the scope of the current study. Therefore, it is hard to strongly conclude that orientation discrimination patterns differ depending on whether orientation is task-relevant or irrelevant in their paper.
 
 Our study differs from theirs in at least two important ways, which may account for the clearer expectation facilitatory effect we observed in the expectation (Δ0°) condition. First, in our study, the orientation-irrelevant task involved spatial frequency discrimination (SFD) rather than contrast discrimination. Compared to contrast, spatial frequency has been shown to exhibit a clear cueing effect, as reported in Fang & Liu (2019). Second, our design included a baseline condition, which was absent in their study. We computed discrimination sensitivity (DS) to quantify how much the discrimination threshold (DT) changed relative to baseline. By using this baseline-referenced approach, we observed a significant facilitatory expectation effect in the Δ0° condition, an effect that shifted from marginal significance in their orientation-irrelevant task to clear significance in our study.
 
 (3) The authors might consider briefly explaining how the orientation adjustment paradigm used in this study is particularly effective for examining the potential co-existence of tuning sharpening and tuning shift computations, and how this approach complements traditional orientation discrimination tasks in characterizing expectation-related mechanisms.
 
 We thank the reviewer for this valuable suggestion. We agree that further clarification is needed to better connect the two experiments. To explain this, we have elaborated further in the manuscript.
 
 “To further explore the co-existence of both Tuning sharpening and Tuning shift computations in center-surround inhibition profile of expectation, participants were asked to perform a classic orientation adjustment experiment. Unlike profile experiment (discrimination tasks), the adjustment experiment provides a direct, trial-by-trial measure of participants’ perceived orientation, capturing the full distribution of responses. This enables the construction of orientation-specific tuning curves, allowing us to detect both tuning sharpening and tuning shifts, thereby offering a more nuanced understanding of the computational mechanisms underlying expectation.”
 
 (4) These interesting findings raise important questions about their relationship to existing hybrid models of attentional modulation. Could the authors discuss how their results might align with or extend previous work demonstrating combined feature-similarity gain and surround suppression effects for orientation (e.g., Fang & Liu, 2019)? Could a hybrid model potentially provide a better account of these data than the pure surround suppression model?
 
 We thank the reviewer for this valuable comment. We agree that the hybrid model should be mentioned in the manuscript, and we have elaborated further in the Discussion.
 
 “For example, within the orientation space, the inhibitory zone was about 20°, 45°, and 54° for expectation evident here, feature-based attention[21], and visual perceptual learning[35], respectively; within the feature-based attention, it was about 30° and 45° in color [77] and motion direction [53] spaces, respectively. These variations hint at the exciting possibility that the width of the inhibitory surround may flexibly adapt to stimulus context and task demands, ultimately facilitating our perception and behavior in a changing environment. This principle is consistent with the hybrid model of feature-based attention [53,54,75], where attention is deployed adaptively to prioritize task-relevant information through feature-similarity gain which filters out the most distinctive distractors, and surround suppression which inhibits similar and confusable ones, thereby jointly shaping the attentional tuning profile.”
 
 (5) On page 19, there appears to be a missing symbol in the description of the Tuning Sharpening model. The text states: 'the tuning width of each channel's tuning function is parameterized by ??', where the question marks seem to indicate a missing parameter symbol.
 
 We appreciate the reviewer’s careful attention. Yes, the "σ" is missing, which was likely caused by a formatting issue. We have corrected it.
 
 AuthorResponse
URL

biorxiv.org/content/10.1101/2024.08.26.609781v2
rori.figshare.com rori.figshare.com

The Matthew effect and early-career setbacks in research funding - a replication study (RoRI Working Paper No. 16)

4
1. Public_Reviews 09 Oct 2025
  
  in eLife (unscoped)
  
  eLife Assessment
  
  This important study reports the results of efforts to replicate two phenomena of significant interest to early-career scientists and scientific policymakers: the Matthew effect and the early-career setback effect. Several previous studies of these effects have focused on early-career researchers with grant proposals that fell just below or just above a funding threshold. Those just above the threshold were more likely to be successful when they applied for funding later in the career (an example of the well-known Matthew effect), while those just below were more likely to go on to have stronger publication records (the early-career setback effect). In this study the Matthew effect was found to be robust across funders, and to generalize from those close to the funding threshold to the whole population. The early-career setback effect was not robust across funders and did not generalize to the whole population. The evidence reported is convincing.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife (unscoped)
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The authors performed a multi-funder study to determine if the Matthew effect and early-career setback effect were reproducible across funding programs and processes. The authors extended the analysis of these effects to all applicants and compared the results to the prior studies that only looked at near-hit/near-miss applicants to determine if the effects were generalizable to the whole applicant pool. Further, the authors included new models that also account for researcher behavior and their overall likelihood to reapply for later funding and how this behavior may resolve what appears to be a paradox between the Matthew effect and the early-career setback effect.
  
  Strengths:
  
  Figure 4 shows that the "Post (late) MFCR" is the same for the funded and unfunded groups, indicating that the impact of early-career funding (at least in terms of citation metrics) is transient over researchers' overall careers. This finding should encourage researchers to persevere when needed and suggests that long-term success is attainable.
  
  The inclusion of the collider bias in the models to account for researcher behavioral responses is a key strength of the paper and enhances the analysis and nuanced discussion of the results.
  
  Weaknesses:
  
  The discussion of limitations is thorough and points to the need for additional studies. One limitation that is acknowledged is that the authors only looked at applicants who reapplied for funding at the same funder. Given that the authors had the names and affiliations of the applicants from all of the funders, it would be helpful to understand why they were not able to look at applicants across their full data set. Was the limitation technical or a result of the study design? What would have to change to enable this broader analysis?
  
  In Section 4.1, the authors make a statement that the "between MFCR" difference was seen at 5 years, but not at 10 years, and so the authors chose to use the 5-year period for the presentation of their results. It would be helpful to also see the 10-year analysis and have further justification from the authors on why they selected to look at the 5-year period and how their conclusions might or might not change if they consider the longer time period.
  
  The discussion could also include that many funders require novel research directions as a condition of receiving an early-career award. For those who receive these awards, they must establish the new research program, begin publishing, and they may initially see a lower citation rate until the impact of the research is more broadly recognized. Are there ways to explore how these time lags impact the "Between MFCR" on those who were funded more so than those who were not funded?
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife (unscoped)
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The manuscript evaluates the generalizability of two phenomena of great interest to early-career scientists and scientific policymakers. These phenomena describe how early funding success can promote future funding success (the Matthew Effect) and how initially unsuccessful applicants may later succeed (the early-career setback effect). Given the often-normative aspirations of science-of-science studies, the manuscript represents a much-needed and highly significant effort, as it allows a broader audience to assess whether they should reconsider their behavior or policies.
  
  Strengths:
  
  The evidence provided by the authors for the generalizability of the Matthew Effect is very strong and convincing. The manuscript addresses an important topic of practical concern to early-career scientists and scientific policymakers.
  
  Weaknesses:
  
  If I am correctly interpreting S11 and S12, the statements on the early-career setback effect could be stronger and more direct. The argument in the main text relies on assumptions and simulations to suggest that observations of the early-career setback effect may depend on reapplications. In contrast, S11 and S12 appear to provide more direct evidence against its generalizability, showing that the effect seems to exist in, and be driven by, only one of the six funding agencies considered (FWF). This narrow replication may not be obvious to readers ("the early-career setback effect also replicates, but is not robust across funders").
  
  I would also suggest that the authors provide a more nuanced discussion of the limitations of their Bayesian model. While the model seems appropriate for accounting for major factors, it appears to exclude others, such as the emergence of new scientific fields or the strategic reorientation of funders toward such fields.
  
  Review 2
4. Public_Reviews 09 Oct 2025
  
  in eLife (unscoped)
  
  Reviewer #3 (Public review):
  
  Summary:
  
  This paper investigates the Matthew effect, where early success in funding peer review can translate into potentially unwarranted later success. It also investigated the previously found "setback" effect for those who narrowly miss out on funding.
  
  Strengths:
  
  The study used data from six funding agencies, which increases the generalisability, and was able to link bibliographic data for around 95% of applicants. The authors nicely illustrate how the previously found "setback" effect for near-miss applicants could be a collider bias due to those who chose to apply sometime later. This is a good explanation for the counter-intuitive effect and is nicely shown in Figure 5.
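The collider-bias explanation can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' Bayesian model: funding is assigned at random (mimicking applicants near the threshold), funding has no causal effect on later outcomes, funded applicants always reapply, and unfunded applicants reapply only when a noisy signal of their own quality is high. All names, distributions, and thresholds are hypothetical.

```python
import random
from statistics import mean

def simulate(n=20000, seed=0):
    """Return (funded, reapplies, later_outcome) tuples for n toy applicants."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        quality = rng.gauss(0.0, 1.0)           # latent quality (assumed normal)
        funded = rng.random() < 0.5             # as-if random near the threshold
        # Assumption: funded applicants always reapply; unfunded ones reapply
        # only if a noisy self-assessment of quality clears a bar.
        reapplies = funded or (quality + rng.gauss(0.0, 1.0) > 0.5)
        later = quality + rng.gauss(0.0, 1.0)   # no causal effect of funding
        rows.append((funded, reapplies, later))
    return rows

def gap(rows):
    """Mean later outcome of unfunded minus funded applicants."""
    unfunded = [y for f, _, y in rows if not f]
    funded = [y for f, _, y in rows if f]
    return mean(unfunded) - mean(funded)

rows = simulate()
gap_all = gap(rows)                                  # whole population
gap_reapplicants = gap([r for r in rows if r[1]])    # conditioning on the collider

print(f"unfunded minus funded, all applicants: {gap_all:.2f}")
print(f"unfunded minus funded, reapplicants:   {gap_reapplicants:.2f}")
```

In this toy setup the two groups look alike across the whole population, but restricting the comparison to reapplicants makes the unfunded group appear stronger, because only its higher-quality members are selected into the sample. An apparent "setback" advantage can therefore arise without any causal benefit of rejection.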
  
  Weaknesses:
  
  Most of the methods were clearly presented, but I have a few questions and comments, as outlined below.
  
  In Figure 4(a) why are the "post" means much lower than the "pre"? This contradicts the expected research trajectory of researchers. Or is this simply due to less follow-up time? But doesn't the field citation ratio control for follow-up time?
  
  The choice of the log-normal distribution for latent quality was not entirely clear to me. This would create some skew, rather than a symmetric distribution, which may be reasonable but log-normal distributions can have a very long tail which might not mimic reality, as I would not expect a small number of researchers to be extremely above the crowd. However, then the skew was potentially dampened by using percentile scores. Some further reasoning and plots of the priors would help.
  
  Can the authors confirm the results of Figure S9 which show no visible effect of altering the standard deviation for the review parameter or the mean citations? Is this just because the prior for quality is dominated by the data? Could it be that the width of the distribution for quality does not matter, as it's the relative difference/ranking that counts? So the beta in equation 6 changes to adjust to the different quality scale?
  
  The contrary result for the FWF is not explained (Table S3). Does this funder have different rules around re-applicants or many other competing funders?
  
  The outlined qualitative research sounds worthwhile. Another potential mechanism (based on anecdote) is that some researchers react irrationally to rejection or acceptance, tending to think that the whole agency likes or hates their work based on one experience. Many researchers do not appreciate that it was a somewhat random selection of reviewers who viewed their work, and it will unlikely be the same reviewers next time.
  
  "A key implication is the importance of encouraging promising, but initially unsuccessful applicants to reapply." Yes. A policy implication is to give people multiple chances to be lucky, perhaps by giving fewer grants to more people, which could be achieved by shortening the funding period (e.g., 4-year fellowships instead of 5 years). However, this will have some costs, as applicants would need to spend more time on applications and suffer the increased stress of shorter-term contracts. Bridge grants are potentially an ideal half-way house between many short-term and few long-term awards. Giving more grants to fewer people is supported by this analysis showing diminishing returns in research outputs with more funding, DOI: 10.1371/journal.pone.0065263.
  
  Making more room for re-applicants also made me wonder if there should be an upper cap on funding, potentially for people who have been incredibly successful. Of course, funders generally want to award successful researchers, but people who've won over some limit, for example $50 million, could likely be expected to win funding from other sources such as philanthropy and business. Graded caps could occur by career stage.
  
  Review 3
URL

rori.figshare.com/articles/preprint/The_i_Matthew_i_effect_and_early-career_setbacks_in_research_funding_-_a_replication_study_RoRI_Working_Paper_No_16_/29302004
www.biorxiv.org www.biorxiv.org

Sense of control buffers against stress

5
1. Public_Reviews 09 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This important research addresses the effects of subjective control and task difficulty on experienced stress using a novel behavioral task administered on the same day in two large online samples. Convincing evidence is provided, establishing the internal and external validity of the task, as well as a relationship between the sense of control and task difficulty, with individual differences in relevant mental health constructs. Evidence for the specificity of the link between control and stress would be more substantial if the design had not conflated control and reward rate. This work will be of interest to psychologists and clinicians studying the concepts of controllability, stress, and psychopathology.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This work investigated how the sense of control influences perceptions of stress. In a novel "Wheel Stopping" task, the authors used task variations in difficulty and controllability to measure and manipulate perceived control in two large cohorts of online participants. The authors first demonstrate that their behavioral task exhibits good internal consistency and external validity, indicating that perceived control during the task is linked to relevant measures of anxiety, depression, and locus of control. Most importantly, manipulating controllability in the task resulted in reduced subjective stress, demonstrating a direct impact of control on stress perception. However, this work has some minor limitations due to the design of the stressor manipulations/measurements and the necessary logistics associated with online versus in-person stress studies. Nevertheless, this research adds to our understanding of when and how control can influence the effects of stress and has particular relevance for mental health interventions.
 
 Strengths:
 
 The primary strength of this research is the development of a unique and clever task design that can reliably and validly elicit variations in beliefs about control. Impressively, higher subjective control in the task was associated with decreased psychopathology measures such as anxiety and depression in a non-clinical sample of participants. In addition, the authors found that lower control and higher task difficulty led to higher perceived stress, suggesting that the task can reliably manipulate perceptions of stress. Prior tasks have not included both controllability and difficulty in this manner and have not directly tested the direct influence of these factors on incidental stress, making this work both novel and important for the field.
 
 Weaknesses:
 
 One minor weakness of this research is the validity of the online stress measurements and manipulations. In this study, the authors measure subjective stress via self-report both during the task and after either a Trier Social Stress Test (high-stress condition) or a memory test (low-stress condition). One concern is that these stress manipulations were really "threats" of stress, where participants never had to complete the stress tasks (i.e., recording a speech for judgment). While this is not unusual for an in-lab study and can reliably elicit substantial stress/anxiety, in an online study, there is a possibility for communication between participants (via online forums dedicated to such communication), which could weaken the stress effects. That said, the authors did find sensible increases and decreases in perceived stress between relevant time points; however, future work could improve upon this design by including more comprehensive stress manipulations and by measuring implicit physiological signs of stress.
 
 Comments on revisions:
 
 I appreciate the authors' responses to my comments and concerns. I have decided not to make changes to my public review, as I believe it remains relevant and fair after revisions.
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The authors have developed a behavioral paradigm to experimentally manipulate the sense of control experienced by participants by varying the level of difficulty in a wheel-stopping task. In the first study, this manipulation is tested by administering the task in a factorial design with two levels of controllability and two levels of stressor intensity to a large number of participants online, while simultaneously recording subjective ratings of perceived control, anxiety, and stress. In a second study, the authors employed the wheel stopping task to induce a high sense of controllability and investigate whether this manipulation buffers the response to a subsequent stress induction when compared to a neutral task, such as watching pleasant videos.
 
 Strengths:
 
 (1) The authors validate a method to manipulate stress.
 
 (2) The authors use an experimental manipulation to induce an enhanced sense of controllability to test its impact on the response to stress induction.
 
 (3) The studies involved big sample sizes.
 
 Weaknesses:
 
 (1) The study was not preregistered.
 
 (2) The control manipulation is conflated with task difficulty and, therefore, the reward rate. In the revised version of the manuscript, the authors perform statistical analysis to demonstrate that the relationship between perceived level of control and subjective stress remains robust after the inclusion of win rate in the model. This analysis strengthens the authors' claims, but the evidence would be more substantial if the design did not conflate reward rate and control. The authors properly discuss this issue in the revised manuscript.
 
 This study will be of interest to psychologists and cognitive scientists who are interested in understanding how controllability and its subjective perception influence how people respond to stress exposure. The demonstration that an increased sense of control buffers/protects against subsequent stress is important and may trigger further studies to characterize this phenomenon better. However, beyond the highlighted weaknesses, the current study only studied the effect of stress induction consequent to the performance of the WS task on the same day, and its generalizability is not warranted.
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 This is an interesting investigation on the benefits of perceiving control and its impact on the subjective experience of stress. To assess the subjective sense of control, the authors introduce a novel wheel stopping (WS) task where control is manipulated via size and speed to induce conditions of low and high control. The authors demonstrate that the subjective sense of control is associated with experienced subjective stress and individual differences related to mental health measures. In a second experiment, they further demonstrate that an increased sense of control buffers subjective stress induced by a Trier social stress manipulation, more so than a typical stress-buffering mechanism of watching neutral/calming videos.
 
 Strengths:
 
 Several strengths of the manuscript can be highlighted. For instance, the paper introduces a new paradigm and a clever manipulation to test a significant and important question. Additionally, it is a well-powered investigation that allows for confidence in replicability and demonstrates both high internal consistency and high external validity, along with an interesting set of individual difference analyses. Finally, the results are quite interesting and support prior literature, while also making a significant contribution to the field in understanding the benefits of perceiving control.
 
 Weaknesses:
 
 The authors have addressed all my queries, and I believe the revised paper has been improved and will make an important contribution to the literature.
 
 Review 3
5. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the previous reviews.
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This work investigated how the sense of control influences perceptions of stress. In a novel "Wheel Stopping" task, the authors used task variations in difficulty and controllability to measure and manipulate perceived control in two large cohorts of online participants. The authors first show that their behavioral task has good internal consistency and external validity, showing that perceived control during the task was linked to relevant measures of anxiety, depression, and locus of control. Most importantly, manipulating controllability in the task led to reduced subjective stress, showing a direct impact of control on stress perception. However, this work has minor limitations due to the design of the stressor manipulations/measurements and the necessary logistics associated with online versus in-person stress studies.
 
 Nevertheless, this research adds to our understanding of when and how control can influence the effects of stress and is particularly relevant to mental health interventions.
 
 We thank the reviewer for their clear and accurate summary of the findings.
 
 Strengths:
 
 The primary strength of this research is the development of a unique and clever task design that can reliably and validly elicit variations in beliefs about control. Impressively, higher subjective control in the task was associated with decreased psychopathology measures such as anxiety and depression in a non-clinical sample of participants. In addition, the authors found that lower control and higher difficulty in the task led to higher perceived stress, suggesting that the task can reliably manipulate perceptions of stress. Prior tasks have not included both controllability and difficulty in this manner and have not directly tested the direct influence of these factors on incidental stress, making this work both novel and important for the field.
 
 We thank the reviewer for their positive comments.
 
 Weaknesses:
 
 One minor weakness of this research is the validity of the online stress measurements and manipulations. In this study, the authors measure subjective stress via self-report both during the task and also after either a Trier Social Stress Test (high-stress condition) or a memory test (low-stress condition). One concern is that these stress manipulations were really "threats" of stress, where participants never had to complete the stress tasks (i.e., recording a speech for judgment). While this is not unusual for an in-lab study and can reliably elicit substantial stress/anxiety, in an online study, there is a possibility for communication between participants (via online forums dedicated to such communication), which could weaken the stress effects. That said, the authors did find sensible increases and decreases of perceived stress between relevant time points, but future work could improve upon this design by including more complete stress manipulations and measuring implicit physiological signs of stress.
 
 We thank the reviewer for urging us to expand on this point. The reviewer is right that stress was merely anticipatory and is in that sense different to the canonical TSST. However, there are ample demonstrations that such anticipatory stress inductions are effective at reliably eliciting physiological and psychological stress responses (e.g. Nasso et al., 2019; Schlatter et al., 2021; Steinbeis et al., 2015). Further, there is evidence that online versions of the TSST are also effective (DuPont et al., 2022; Meier et al., 2022), including evidence that the speech preparation phase conducted online was related to increases in heart rate and blood pressure (DuPont et al., 2022). Importantly, and as the reviewer notes in relation to our study specifically, the anticipatory TSST had a significant impact on subjective stress in the expected direction demonstrating that it was effective at eliciting subjective stress. We have elaborated further on this in our manuscript (pages 8 and 9) as follows:
 
 “Prior research has found TSST anticipation to elicit both psychological and physiological stress responses [37-39], suggesting that the task anticipation would be a valid stress induction despite participants not performing the speech task. Moreover, prior research has validated the use of remote TSST in online settings [40, 41], including evidence that the speech preparation phase (online) was related to increased heart rate and blood pressure compared to controls [40].”
 
 Reviewer #2 (Public review):
 
 Summary:
 
 The authors have developed a behavioral paradigm to experimentally manipulate the sense of control experienced by the participants by changing the level of difficulty of a wheel-stopping task. In the first study, this manipulation is tested by administering the task in a factorial design with two levels of controllability and two levels of stressor intensity to a large number of participants online while simultaneously recording subjective ratings on perceived control, anxiety, and stress. In the second study, the authors used the wheel-stopping task to induce a high sense of controllability and test whether this manipulation buffers the response to a subsequent stress induction when compared to a neutral task, like looking at pleasant videos.
 
 We thank the reviewer for their accurate summary.
 
 Strengths:
 
 (1) The authors validate a method to manipulate stress.
 
 (2) The authors use an experimental manipulation to induce an enhanced sense of controllability to test its impact on the response to stress induction.
 
 (3) The studies involved big sample sizes.
 
 We thank the reviewer for noting these positive aspects of our study.
 
 Weaknesses:
 
 (1) The study was not preregistered.
 
 This is correct.
 
 (2) The control manipulation is conflated with task difficulty and, therefore, the reward rate. Although the authors acknowledge this limitation at the end of the discussion, it is a very important limitation, and its implications are not properly discussed. The discussion states that this is a common limitation with previous studies of control but omits that many studies have controlled for it using yoking.
 
 We agree that these are very important issues to consider in the interpretation of our findings. It is important to note that, while our task design does not separate these constructs, we are able to do so in our statistical analyses. For example, our measure of perceived difficulty was included in analyses assessing the fluctuations in stress and control, in which subjective control still had a unique effect on the experience of stress over and above perceived difficulty, suggesting that subjective control explains variance in stress beyond what is accounted for by perceived difficulty. Similarly, we have also included additional analyses in which we include the win rate (i.e. percentage of trials won) as a covariate when assessing the relationship between subjective control, perceived difficulty and subjective stress, in which subjective control and perceived difficulty still uniquely predict subjective stress when controlling for win rate. This suggests that there is unique variance in subjective control, separate from perceived task difficulty and win rate, that is relevant to stress. We have included these analyses (page 16 of manuscript) as follows:
 
 “To further isolate the relationship between subjective control and stress separate from perceived task difficulty or objective task performance, we also included the overall win rate (percentage of trials won during the WS task) in the models. In Study 1, lower feelings of control were related to higher levels of subjective stress (β= -0.12, p<.001) even when controlling for both win rate (β= -0.06, p=.220) and perceived task difficulty (β= 0.37, p<.001, Table S10). This also replicated in Study 2, where lower subjective control was associated with higher feelings of stress (β= -0.32, p<.001) when controlling for perceived task difficulty (β= 0.31, p<.001) and win rate (β= -0.11, p=.428, Table S11). This suggests that there is unique variance in subjective feelings of control, separate from task performance, relevant to subjective stress.”
 
 As well as expanding on this in the Discussion (pages 27 and 28) as follows:
 
 “While our task design does not separate control from obtained reward, we are able to do so in the statistical analyses. Like with perceived difficulty, we statistically accounted for reward rate and showed that the relationship between subjective control and stress was not accounted for by reward rate, for example. Similarly, participants received feedback after every trial, and thus feedback valence may contribute to stress perception. However, given that overall win rate (which captures the feedback received during the task) did not predict stress over and above perceived difficulty or subjective control, it suggests that feedback is unlikely to relate to stress over and above difficulty. Future work will need to disentangle this further to rule out such potential confounds.”
 
 Further, in terms of the wider literature on these issues, we have added more to this point in our discussion, especially in relation to previous literature that also varies control by reward rate (e.g. Dorfman & Gershman, 2019, who use a reward rate of 80% in high control conditions and 50% in low control conditions). This can be found in the manuscript on page 27 as follows:
 
 “Previous research typically accounts for different outcomes (e.g. punishment) by yoking controllable and uncontrollable conditions [3], though other work has manipulated the controllability of rewards by changing the reward rate [for example 30], where a decoy stimulus is rewarded 50% of the time in the low control condition but 80% in the high control condition.”
 
 (3) The methods are not always clear enough, and it is difficult to know whether all the manipulations are done within-subjects or some key manipulations are done between subjects.
 
 We have added more information in the methods section (page 8) clarifying within-subject manipulations (WS task parameters) and between-subject manipulations (stressor intensity task, WS task version in Study 1, and WS task/video task in Study 2). Additionally, as recommended by Reviewer 1, we have provided more information in the methods section and Table S3 regarding the details of on-screen written feedback provided to participants after each trial of the WS Task.
 
 (4) The analysis of internal consistency is based on splitting the data into odd/even sliders. This choice of data parcellation may cause missed drifts in task performance due to learning, practice effects, or tiredness, thus potentially inflating internal consistency.
 
 We agree that this can indeed be an issue, though drift is likely to be present in any task, including even in mood during resting state (Jangraw et al., 2023). To respond to this specific point, we parcellated the timepoints into a 1st/2nd half split and report the ICC in the supplementary information. While values are lower, indeed likely due to systematic drifts in task performance as participants learn to perform the task (especially for Study 2, since the order of parameters was designed to get easier throughout the experiment), the ICC values are still high. Control sliders: Study 1 = 0.82, Study 2 = 0.68; Difficulty sliders: Study 1 = 0.84, Study 2 = 0.57; Stress sliders: Study 1 = 0.45, Study 2 = 0.71. As seen, the lowest ICC is for stress sliders in Study 1. This may be because the first 3 sliders (included in the 1st half split) were all related to the stress task (initial, post-stress task, post-debrief) and the final 4 sliders (in the 2nd half split) were the three sliders during the WS task and shortly afterwards.
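 The reviewer's concern about the choice of split can be illustrated with a small simulation. The sketch below is not the authors' analysis: it uses simulated ratings, a Spearman-Brown corrected split-half correlation instead of the ICC reported above, and hypothetical parameter values (eight sliders, participant-specific linear drift). Under those assumptions, an odd/even split averages drift out of both halves, while a first/second half split exposes it.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_brown(r):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r / (1 + r)


def simulate_ratings(n_participants=500, n_sliders=8, seed=0):
    """Hypothetical data: stable trait + participant-specific drift + noise."""
    rng = random.Random(seed)
    centre = (n_sliders - 1) / 2
    data = []
    for _ in range(n_participants):
        trait = rng.gauss(0, 1)    # stable individual difference
        slope = rng.gauss(0, 0.2)  # drift per slider (learning, tiredness)
        data.append([trait + slope * (t - centre) + rng.gauss(0, 0.5)
                     for t in range(n_sliders)])
    return data


def split_half(data, idx_a, idx_b):
    """Spearman-Brown corrected correlation between two sets of sliders."""
    half_a = [sum(row[i] for i in idx_a) / len(idx_a) for row in data]
    half_b = [sum(row[i] for i in idx_b) / len(idx_b) for row in data]
    return spearman_brown(pearson(half_a, half_b))


ratings = simulate_ratings()
n = len(ratings[0])
odd_even = split_half(ratings, range(0, n, 2), range(1, n, 2))
first_second = split_half(ratings, range(0, n // 2), range(n // 2, n))
print(f"odd/even: {odd_even:.2f}, first/second half: {first_second:.2f}")
```

 In this toy setting the first/second half estimate comes out lower than the odd/even estimate, which matches the direction of the change the authors describe, although the size of the gap depends entirely on the assumed drift.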
 
 (5) Study 2 manipulates the effect of domain (win versus loss WS task), but the interaction of this factor with stressor intensity is not included in the analysis.
 
 We agree that this would be a valuable analysis to include. We have run additional analyses (section Sensitivity and Exploratory Analyses, pages 24 and 25), testing the interaction of Domain (win or loss) with stressor intensity (and time) when predicting the stress buffering and stress relief effects. This revealed no significant main effects of domain or interactions including domain, suggesting that domain did not impact the stress induction or relief differently depending on whether it was followed by the high or low stressor intensity condition. While the control by time interaction (our main effect of interest) still held for stress induction in this more complex model, the control by time interaction did not hold for the stress relief. However, this more complex model did not provide a better fit for the data, motivating us to continue to draw conclusions from the original model specification with domain as a covariate (rather than an interaction).
 
 We outline these analyses on page 24 of the manuscript, as follows:
 
 “Third, we included the interaction of domain with stressor intensity and with time, to test whether the win or loss domain in the WS task significantly impacted stress induction or stress relief differently depending on stressor intensity. There were no significant effects or interactions of domain (Table S14) for stress induction or stress relief, and the main effect of interest (the interaction between time and control) still held for the stress induction (β= 10.20, SE=4.99, p=.041, Table S14), though was no longer significant for the stress relief (β= 6.72, SE=4.28, p=.117, Table S14). This more complex model did not significantly improve model fit (χ²(3)= 1.46, p=.691) compared to our original specification (with domain as a covariate rather than an interaction) and had slightly worse fit (higher AIC and BIC) than the original model (AIC = 5477.2 versus 5472.7, BIC = 5538.5 versus 5520.8).”
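 The model-comparison figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the reported values and assumes that the χ² statistic is a likelihood-ratio test between nested models fitted by maximum likelihood to the same data, with three additional parameters; the helper name `chi2_sf_df3` is introduced here for illustration.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form of the regularized upper incomplete gamma function Q(3/2, x/2).
    """
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


lrt_stat = 1.46   # reported chi-square statistic
extra_params = 3  # reported degrees of freedom

p_value = chi2_sf_df3(lrt_stat)

# Differences in the reported information criteria (complex minus original).
delta_aic = round(5477.2 - 5472.7, 1)
delta_bic = round(5538.5 - 5520.8, 1)

# For nested maximum-likelihood fits, delta AIC = 2 * extra parameters - LRT statistic.
implied_delta_aic = 2 * extra_params - lrt_stat

print(f"p = {p_value:.3f}")
print(f"delta AIC = {delta_aic}, implied by the LRT = {implied_delta_aic:.2f}")
print(f"delta BIC = {delta_bic}")
```

 Under these assumptions the recomputed p-value agrees with the reported p=.691, and the AIC difference implied by the test statistic (4.54) agrees with the reported difference of 4.5, so the three quantities are mutually consistent.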
 
 This study will be of interest to psychologists and cognitive scientists interested in understanding how controllability and its subjective perception impact how people respond to stress exposure. Demonstrating that an increased sense of control buffers/protects against subsequent stress is important and may trigger further studies to characterize this phenomenon better. However, beyond the highlighted weaknesses, the current study only studied the effect of stress induction consecutive to the performance of the WS task on the same day and its generalizability is not warranted.
 
 We thank the reviewer for this assessment and agree that we cannot assume these findings would generalise to more prolonged effects on stress responses.
 
 Reviewer #3 (Public review):
 
 Summary:
 
 This is an interesting investigation of the benefits of perceiving control and its impact on the subjective experience of stress. To assess a subjective sense of control, the authors introduce a novel wheel-stopping (WS) task where control is manipulated via size and speed to induce low and high control conditions. The authors demonstrate that the subjective sense of control is associated with experienced subjective stress and individual differences related to mental health measures. In a second experiment, they further show that an increased sense of control buffers subjective stress induced by a Trier social stress manipulation, more so than a more typical stress buffering mechanism of watching neutral/calming videos.
 
 We agree with this accurate summary of our study.
 
 Strengths:
 
 There are several strengths to the manuscript that can be highlighted. For instance, the paper introduces a new paradigm and a clever manipulation to test an important and significant question. Additionally, it is a well-powered investigation that allows for confidence in replicability and the ability to show both high internal consistency and high external validity with an interesting set of individual difference analyses. Finally, the results are quite interesting and support prior literature while also providing a significant contribution to the field with respect to understanding the benefits of perceiving control.
 
 We thank the reviewer for this positive assessment.
 
 Weaknesses:
 
 There are also some questions that, if addressed, could help our readership.
 
 (1) A key manipulation was the high-intensity stressor (Anticipatory TSST signal), which was measured via subjective ratings recorded on a sliding scale at different intervals during testing. Typically, the TSST conducted in the lab is associated with increases in cortisol assessments and physiological responses (e.g., skin conductance and heart rate). The current study is limited to subjective measures of stress, given the online nature of the study. Since TSST online may also yield psychologically different results than in the lab (i.e., presumably in a comfortable environment, not facing a panel of judges), it would be helpful for the authors to briefly discuss how the subjective results compare with other examples from the literature (either online or in the lab). The question is whether the experienced stress was sufficiently stressful given that it was online and measured via subjective reports. The control condition (low intensity via reading recipes) is helpful, but the low-intensity stress does not seem to differ from baseline readings at the beginning of the experiment.
 
 We agree that it would be helpful to expand on this further. Similar to the comment made by Reviewer 1, we wish to point out that there are ample demonstrations that such anticipatory stress inductions are effective at reliably eliciting physiological and psychological stress responses (e.g. Nasso et al., 2019; Schlatter et al., 2021; Steinbeis et al., 2015). Further, there is evidence that online versions of the TSST are also effective (DuPont et al., 2022; Meier et al., 2022), including evidence that the speech preparation phase conducted online was related to increases in heart rate and blood pressure (DuPont et al., 2022). We have elaborated further on this in our manuscript on pages 8 and 9 as follows:
 
 “Prior research has found TSST anticipation to elicit both psychological and physiological stress responses [37-39], suggesting that the task anticipation would be a valid stress induction despite participants not performing the speech task. Moreover, prior research has validated the use of remote TSST in online settings [40, 41], including evidence that the speech preparation phase (online) was related to increased heart rate and blood pressure compared to controls [40].”
 
 (2) The neutral videos represent an important condition to contrast with WS, but it raises two questions. First, the conditions are quite different in terms of experience, and it is interesting to consider what another more active (but not controlled per se) condition would be in comparison to the WS performance. That is, there is no instrumental action during the neutral video viewing (even passive ratings about the video), and the active demands could be an important component of the ability to mitigate stress. Second, the subjective ratings of the stress of the neutral video appear equivalent to the win condition. Would it have been useful to have a high arousal video (akin to the loss condition) to test the idea that experience of control will buffer against stress? That way, the subjective experience of stress would start at equivalent points after WS3.
 
 We agree with the reviewer that this is an important issue to clarify. In our deliberations when designing this study, we considered that any task with action-outcome contingencies would have a degree of controllability. To better distinguish experiences of control (WS task) from an experience of no/neutral control (i.e., neither high nor low controllability), we decided to use a task in which no actions were required during the task itself. Importantly, however, there was an active demand and concentration was still required in order to perform the attention checks regarding the content of the videos and ratings of the videos.
 
 Thank you for the suggestion of having a high arousal video condition. This would indeed be interesting to test how experiencing ‘neutral’ control and high(er) stress levels preceding the stressor task influences stress buffering and stress relief, and we have included this suggestion for future research in the discussion section (page 28) as below:
 
 “Another avenue for future research would be to test how control buffers against stress when compared to a neutral control scenario of higher stress levels, akin to the loss domain in the WS Task, given that participants found the video condition generally relaxing. However, given that we found no differences dependent on domain for the stress induction in the WS Task conditions, it is possible that different versions of a neutral control condition would not impact the stress induction.”
 
 (3) For the stress relief analysis, the authors included time points 2 and 3 (after the stressor and debrief) but not a baseline reading before stress. Given the potential baseline differences across conditions, can this decision be justified in the manuscript?
 
 We thank the reviewer for raising this. Regarding the stress relief analyses (timepoints 2 and 3) and not including timepoint 1 (after the WS/video task) stress in the model, we have added to the manuscript that there was no significant difference in stress ratings between the high control and neutral control (collapsed across stress and domain) at timepoint 1 (which is why we do not think it is necessary to include it in the stress relief model). Nevertheless, we have now included a sensitivity analysis to test the Timepoint*Control interaction of stress relief when including timepoint 1 stress as a covariate. The timepoint by control interaction still holds, suggesting that the initial stress level prior to the stress induction does not impact our results of interest. The details of this analysis are included in the Sensitivity and Exploratory Analyses section on page 24:
 
 “Although there were no significant differences between control groups in subjective stress immediately after the WS/video task (t(175.6)=1.17, p=.244), we included participants’ stress level after the WS/video task as a covariate in the stress relief analyses (Table S12). The results revealed a main effect of initial stress (β= 0.643, SE=0.040, p<.001, Table S12) on the stress relief after the stressor debrief. Compared to excluding initial stress as in the original analyses (Table 4), there was now no longer a main effect of domain (β= 0.236, SE=2.60, p=.093, Table S12), but the inference of all other effects remained the same. Importantly, there was still a significant time by control interaction (β= 9.65, SE=3.74, p=.010, Table S12) showing that the decrease in stress after the debrief was greater in the highly controllable WS condition than the neutral control video condition, even when accounting for the initial stress level.”
 
 (4) Is the increased control experience during the losses condition more valuable in mitigating experienced stress than the win condition?
 
 We agree that this would be helpful to clarify. To test whether the loss domain was more valuable at mitigating experiences of stress than the win condition, we ran additional analyses with just the high control condition (WS task) to test for a Domain*Time interaction. This revealed no significant Domain*Time interaction, suggesting that the stress buffering or stress relief effect was not dependent on domain in the high control conditions. These analyses are outlined in the Sensitivity and Exploratory Analyses section on page 25:
 
 “Finally, to test whether the loss domain was more valuable at mitigating experiences of stress than the win condition, we ran additional analyses with just the high control condition (WS task) for the stress induction and stress relief to test for an interaction of domain and time. For the stress induction, there was no significant two-way interaction of domain and time (β= -1.45, SE=4.80, p=.763), nor a significant three-way interaction of domain by time by stressor intensity (β= -3.96, SE=6.74, p=.557, Table S15), suggesting that there were no differences in the stress induction dependent on domain. Similarly for the stress relief, there was no significant two-way interaction of domain and time (β= -5.92, SE=4.42, p=.182), nor a significant three-way interaction of domain by time by stressor intensity (β= 8.86, SE=6.21, p=.154, Table S15), suggesting that there were no differences in the stress relief dependent on the WS Task domain.”
 
 (5) The subjective measure of control ("how in control do you feel right now") tends to follow a successful or failed attempt at the WS task. How much is the experience of control mediated by the degree of experienced success/schedule of reinforcement? Is it an assessment of control, or an evaluation of how well they are doing and/or resolution of uncertainty? An interesting paper by Cockburn et al. 2014 highlights the potential for positive prediction errors to enhance the desire for control.
 
 We thank the reviewer for this comment. Similar to comments regarding reward rate, our task does not allow us to fully separate control from success/reinforcement because of the manipulation of difficulty. However, we did undertake sensitivity analyses and the inclusion of overall win rate accounted for limited variance when predicting stress over and above subjective control and difficulty (page 16).
 
 “To further isolate the relationship between subjective control and stress separate from perceived task difficulty or objective task performance, we also included the overall win rate (percentage of trials won during the WS task) in the models. In Study 1, lower feelings of control were related to higher levels of subjective stress (β= -0.12, p<.001) even when controlling for both win rate (β= -0.06, p=.220) and perceived task difficulty (β= 0.37, p<.001, Table S10). This also replicated in Study 2, where lower subjective control was associated with higher feelings of stress (β= -0.32, p<.001) when controlling for perceived task difficulty (β= 0.31, p<.001) and win rate (β= -0.11, p=.428, Table S11). This suggests that there is unique variance in subjective feelings of control, separate from task performance, relevant to subjective stress.”
 
 (6) While the authors do a very good job in their inclusion and synthesis of the relevant literature, they could also amplify some discussion in specific areas. For example, operationalizing task controllability via task difficulty is an interesting approach. It would be useful to discuss their approach (along with any others in the literature that have used it) and compare it to other typically used paradigms measuring control via presence or absence of choice, as mentioned by the authors briefly in the introduction.
 
 We are delighted to expand on this particular point and have done so in the Discussion on page 27:
 
 “Previous research typically accounts for different outcomes (e.g. punishment) by yoking controllable and uncontrollable conditions [3], though other work has manipulated the controllability of rewards by changing the reward rate (for example [30], where a decoy stimulus is rewarded 50% of the time in the low control condition but 80% in the high control condition). While our task design does not separate control from obtained reward, we are able to do so in the statistical analyses.”
 
 (7) The paper is well-written. However, it would be useful to expand on Figure 1 to include a) separate figures for study 1 (currently not included) and 2, and b) a timeline that includes the measurements of subjective stress (incorporated in Figure 1). It would also be helpful to include Figure S4 in the manuscript.
 
 We have expanded Figure 1 to include both Studies 1 and 2 and a timeline of when subjective stress was assessed throughout the experiment as well as adding Figure S4 to the main manuscript (now top panel within Figure 4).
 
 Reviewer #1 (Recommendations for the authors):
 
 (1) Study 2 shows a greater decrease in subjective stress after the high-control task manipulation than after the pleasant video. One possible confound is whether the amount of time to complete the WS task and the video differ. It could be helpful to look at the average completion time for the WS task and compare that to the length of the videos. Alternatively, in future studies, control for this by dynamically adjusting the video play length to each participant based on how long they took to complete the WS task.
 
 This is an interesting suggestion. As a result, we have included the time taken as a covariate in the stress induction and stress relief analyses to ensure that any differences in time between the WS task and video task were not accounting for any of the stress induction or relief analyses. Controlling for the total time taken did not impact the stress induction or relief results. This is included in the Sensitivity and Exploratory Analyses section on page 24:
 
 “Our second sensitivity analysis was conducted because the experiment took longer to complete for the video condition (mean = 54.3 minutes, SD = 12.4 minutes) than the WS task condition (mean = 39.7 minutes, SD = 12.8 minutes, t(186.19)=-9.32, p<.001). We therefore included the total time (in ms) as a covariate in the stress induction and stress relief analyses for Study 2. This showed that accounting for total time did not change the results of interest (Table S13), further highlighting that the time by control interactions were robust.”
 
 (2) Because participants received feedback about their success/failure in the WS task, a confounding factor could be that they received positive feedback on highly controllable trials and negative feedback on low control trials (and/or highly difficult trials). This would suggest that it is not controllability per se that contributes to stress perception but rather feedback valence. The authors show that this is a likely factor in their results in Study 2, which shows significant effects of the loss domain on perceived control and stress. Was a similar analysis done in Study 1? Do participants receive feedback in Study 1? It would be helpful to include this information somewhere in the manuscript. I would be curious to know whether *any* feedback at all influences controllability/stress perceptions.
 
 We thank the reviewer for this interesting suggestion. It is an interesting question as to whether feedback valence is related to stress in Study 1, and we have added this point to the Discussion on pages 27 and 28. To speak to this point, when we include the overall win rate (which captures the subsequent feedback received) when predicting subjective stress, win rate is not a significant predictor of stress over and above perceived difficulty and subjective control, suggesting that overall feedback valence may not be related to stress in Study 1. We take this as evidence that feedback may not be as important in terms of accounting for the relationship between stress and control. However, we unfortunately do not have any data in which there was no feedback provided to speak to this conclusively. This would be an interesting future study. The excerpt below is added to pages 27 and 28 of the discussion section:
 
 “Like with perceived difficulty, we statistically accounted for reward rate and showed that the relationship between subjective control and stress was not accounted for by reward rate, for example. Similarly, participants received feedback after every trial, and thus feedback valence may contribute to stress perception. However, given that overall win rate (which captures the feedback received during the task) did not predict stress over and above perceived difficulty or subjective control, it suggests that feedback is unlikely to relate to stress over and above difficulty. Future work will need to disentangle this further to rule out such potential confounds.”
 
 To respond specifically to the reviewer’s question about the feedback given to participants, written feedback was provided on screen to participants on a trial-by-trial basis in Study 1 as well (i.e. in both studies). We have clarified this in the manuscript on page 8 and provided additional details in Table S3:
 
 “After each trial, participants were shown written feedback on screen as to whether the segment had successfully stopped on the red zone (or not), and the associated reward (or lack of). See Table S3 for details.”
 
 (3) I'm not sure how to interpret the fact that in Figure S1, the BICs are all essentially the same. Does this mean that you don't really need all of these varying aspects of the task to achieve the same effects? Could the task be made simpler?
 
 The similarity of BIC values suggests that a simpler WS task would have produced a worse account of the data approximately in keeping with the extent to which it is a simpler model. Here, the BIC scores for the models are similar, suggesting that adding these parameters adds explanatory power in keeping with what would have been expected from adding a parameter, but not more. We do note that the BIC is a relatively strict and conservative comparison. The fact that the most complex model narrowly improves parsimony overall, combined with the interpretable parameter values and the prior expectations given the task setup, led us to focus on this most complex model.
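 The trade-off described above can be illustrated with a minimal sketch of the standard BIC formula, BIC = k ln(n) - 2 ln(L). The log-likelihoods, parameter counts, and number of observations below are invented purely for illustration and are not taken from the manuscript or Figure S1.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical values (assumptions, not the authors' data).
n_obs = 1000
bic_simple = bic(log_likelihood=-520.0, n_params=3, n_obs=n_obs)
bic_complex = bic(log_likelihood=-516.0, n_params=4, n_obs=n_obs)

# Adding one parameter costs ln(n_obs) BIC units; the fit gain is 2 * (delta log-likelihood).
penalty_per_param = math.log(n_obs)
fit_gain = 2.0 * (-516.0 - (-520.0))

print(f"BIC simple:  {bic_simple:.2f}")
print(f"BIC complex: {bic_complex:.2f}")
print(f"fit gain {fit_gain:.2f} vs penalty {penalty_per_param:.2f}")
```

 In this toy case the extra parameter improves the fit by only slightly more than its penalty, so the two BIC values end up close together, which is the pattern the response describes.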
 
 (4) A minor point, but the authors refer to their sample as "neurotypical." Were they assessed for prior/current psychopathology/medications? If not, I might use a different term here (perhaps "non-clinical sample"), since some prior work has shown that online samples actually have higher instances of psychopathology compared to community samples.
 
 We have changed the phrasing of ‘neurotypical’ to a ‘non-clinical sample’ as recommended.
 
 Reviewer #2 (Recommendations for the authors):
 
 Figure S4 is very informative and could be presented in the main text.
 
 We have expanded Figure 1 to include both Studies 1 and 2 and a timeline of when subjective stress was assessed throughout the experiment as well as adding Figure S4 to the main manuscript (top panel of Figure 4).
 
 References:
 
 Dorfman, H. M., & Gershman, S. J. (2019). Controllability governs the balance between Pavlovian and instrumental action selection. Nature Communications, 10(1), 5826. https://doi.org/10.1038/s41467-019-13737-7
 
 DuPont, C. M., Pressman, S. D., Reed, R. G., Manuck, S. B., Marsland, A. L., & Gianaros, P. J. (2022). An online Trier social stress paradigm to evoke affective and cardiovascular responses. Psychophysiology, 59(10), e14067. https://doi.org/10.1111/psyp.14067
 
 Jangraw, D. C., Keren, H., Sun, H., Bedder, R. L., Rutledge, R. B., Pereira, F., Thomas, A. G., Pine, D. S., Zheng, C., Nielson, D. M., & Stringaris, A. (2023). A highly replicable decline in mood during rest and simple tasks. Nature Human Behaviour, 7(4), 596–610. https://doi.org/10.1038/s41562-023-01519-7
 
 Meier, M., Haub, K., Schramm, M.-L., Hamma, M., Bentele, U. U., Dimitroff, S. J., Gärtner, R., Denk, B. F., Benz, A. B. E., Unternaehrer, E., & Pruessner, J. C. (2022). Validation of an online version of the trier social stress test in adult men and women. Psychoneuroendocrinology, 142, 105818. https://doi.org/10.1016/j.psyneuen.2022.105818
 
 Nasso, S., Vanderhasselt, M.-A., Demeyer, I., & De Raedt, R. (2019). Autonomic regulation in response to stress: The influence of anticipatory emotion regulation strategies and trait rumination. Emotion, 19(3), 443–454. https://doi.org/10.1037/emo0000448
 
 Schlatter, S., Schmidt, L., Lilot, M., Guillot, A., & Debarnot, U. (2021). Implementing biofeedback as a proactive coping strategy: Psychological and physiological effects on anticipatory stress. Behaviour Research and Therapy, 140, 103834. https://doi.org/10.1016/j.brat.2021.103834
 
 Steinbeis, N., Engert, V., Linz, R., & Singer, T. (2015). The effects of stress and affiliation on social decision-making: Investigating the tend-and-befriend pattern. Psychoneuroendocrinology, 62, 138–148. https://doi.org/10.1016/j.psyneuen.2015.08.003
 
 AuthorResponse
URL

biorxiv.org/content/10.1101/2024.12.05.626945v2
www.biorxiv.org www.biorxiv.org

Evolutionary Adaptations of IRG1 Refines Itaconate Synthesis and Mitigates Innate Immunometabolism Trade-offs

5
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important study addresses the timely and interesting question of how itaconate generation emerged in evolution, using taxonomic analysis of the gene and enzyme cis-aconitate decarboxylase (CAD). The authors provide solid evidence identifying three CAD branches in metazoans and showing that the early metazoan paleo-form indeed generates aconitate and is already linked to innate immunity. They further provide limited evidence suggesting that taxonomic differences in subcellular localisation of this enzyme may allow for innate immune signalling without compromising cellular energetics. The implications of the study will be of high interest to the field of innate host defence and immunometabolism.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The taxonomic analysis of IRG1 evolution is compelling and fills an important gap in the literature. However, the experimental evidence for IRG1 localization requires greater detail and confirmation.
  
  Strengths:
  
  The phylogenetic analysis of IRG1 evolution fills an important gap in the literature. The identification of independent acquisition of metazoan and fungal IRG1 from prokaryotic sources is novel, and the observation that human IRG1 lost mitochondrial matrix localization is particularly interesting, with potentially significant implications for the study of itaconate biology.
  
  Weaknesses:
  
  The protease protection assay was conducted with MTS-IRG1 but not with wild-type IRG1, which should also be tested. Moreover, no complementary methods, such as microscopy, were employed to validate localization. Beyond humans, the structure and localization of mouse IRG1, highly relevant given the widespread use of the mouse as a model for IRG1 functional studies, are not addressed. Finally, if itaconate is indeed synthesized outside the mitochondrial matrix to safeguard metabolic activity, it is not discussed how this reconciles with its reported inhibitory effect on SDH.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors are trying to explain how the metabolite itaconate evolved, since although it's involved in host defense, it can also limit mitochondrial function. They are trying to probe the trade-off between these two functions.
  
  Strengths:
  
  The evolutionary aspect is novel; this is the first time to my knowledge that the evolution of IRG1 has been analysed, and there are interesting findings here. The key finding appears to be that subcellular localisation is an important aspect, allowing host defense in some organisms without compromising bioenergetics. This is an interesting finding in the context of immunometabolism, although it needs extra analysis.
  
  Weaknesses:
  
  The work concerning sub-mitochondrial localisation is confusing and needs better analysis.
  
  Review 2
4. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  IRG1 is highly expressed in activated human and mouse myeloid cells. It encodes the mitochondrial enzyme cis-aconitate decarboxylase 1 (ACOD1) that generates itaconate. Itaconate has anti-microbial activity and is immunoregulatory, interfering with cellular metabolism, signaling to cytokine production, and multiple other processes.
  
  The authors perform a phylogenetic analysis of IRG1 to obtain insight into the evolution of itaconate biosynthesis. Combining BLAST with human IRG1 and a MmgE/Ptrp domain search, they find CAD in all domains of life, but the presence of IRG1 homologs is patchy in eukaryotes, indicating that itaconate biosynthesis is not essential. The phylogenetic analysis showed a more distant relationship of fungal and metazoan CAD/IRG1 to many prokaryotic sequences, suggesting independent acquisition of these metazoan and fungal CAD genes. In metazoans, three subbranches of paleo-IRG1 (in mollusks/early chordates) and two paralogous vertebrate forms (IRG1 and IRG1-like) were identified, with the latter derived from paleo-IRG1, and by genome duplication. While most jawed vertebrates have both IRG1 and IRG1L, metatherian and eutherian mammals have lost IRG1L and contain only IRG1.
  
  Interestingly, sequence analysis of both paralogues showed that many IRG1L genes contain an N-terminal mitochondrial targeting sequence (MTS) that is absent from most IRG1 sequences. Limited proteolysis of submitochondrial localization confirmed that zebrafish IRG1L is only sensitive to proteases in the presence of high Triton X-100, indicative of association with mitochondrial matrix. In contrast, a recent paper from the Galan lab (Lian 2023 Nature Microbiology) reported that human IRG1 is not localized to the mitochondrial matrix, although enriched in mitochondria. Here, the authors generated a matrix-targeted human IRG1 by adding the N-terminal MTS and found that it localizes to the matrix based on a limited proteolysis assay. The loss of MTS-containing IRG1L from most mammals appears, therefore, to indicate that itaconate generation is directed to the cytoplasm, potentially reducing inhibition of TCA cycle activity in the mitochondria.
  
  Next, the authors confirmed that the recombinant IRG1L protein has CAD activity in vitro. The last part of the manuscript addresses the expression of paleo-IRG1 in oysters and amphioxus, where they found high mRNA levels in oyster hemocytes which was further increased by poly(I:C), which was also the case in amphioxus tissues after feeding of LPS or poly(I:C), indicating a role for paleo-IRG1/itaconate in early metazoan innate immunity.
  
  Strengths
  
  (1) Phylogenetic perspective largely lacking so far in the IRG1/itaconate field.
  
  (2) Manuscript clearly written and understandable across disciplines.
  
  (3) Phylogenetic analyses complemented by biochemical and gene expression analyses to link to function.
  
  (4) Lack of MTS in IRG1 and change in localization from mitochondria, highly relevant antimicrobial and cellular effects of itaconate.
  
  Weaknesses:
  
  (1) Biochemical and functional analysis of different CAD mRNA and proteins lacks depth.
  
  (2) The submitochondrial localization assay lacks a native human IRG1 control.
  
  (3) CAD activity shown for IRG1L but not paleo-IRG1.
  
  (4) Itaconate production by early metazoans after PAMP stimulation?
  
  (5) No measurement of energy metabolism (trade-offs?).
  
  I acknowledge that some of these limitations are inevitable because the range of detailed experimental analysis is necessarily limited. However, some of these data would be important to support central claims of the manuscript (further discussed below).
  
  Review 3
5. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Author response:
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The taxonomic analysis of IRG1 evolution is compelling and fills an important gap in the literature. However, the experimental evidence for IRG1 localization requires greater detail and confirmation.
  
  Strengths:
  
  The phylogenetic analysis of IRG1 evolution fills an important gap in the literature. The identification of independent acquisition of metazoan and fungal IRG1 from prokaryotic sources is novel, and the observation that human IRG1 lost mitochondrial matrix localization is particularly interesting, with potentially significant implications for the study of itaconate biology.
  
  We thank the reviewer for appreciating the novelty of our study in exploring IRG1 evolution.
  
  Weaknesses:
  
  The protease protection assay was conducted with MTS-IRG1 but not with wild-type IRG1, which should also be tested. Moreover, no complementary methods, such as microscopy, were employed to validate localization. Beyond humans, the structure and localization of mouse IRG1, highly relevant given the widespread use of the mouse as a model for IRG1 functional studies, are not addressed.
  
  Regarding submitochondrial localization of IRG1, we want to draw attention to the published data that a protease protection assay for wild-type mammalian IRG1 has been performed by Lian et al. 2023 (Extended Data Fig. 4), which convincingly demonstrated an outer-mitochondrial membrane localization of endogenous mouse IRG1 in mouse DC2.4 cells upon LPS stimulation that induces IRG1 expression.
  
  Regarding complementary microscopy evidence, the same paper performed two-color DNA-PAINT super-resolution imaging to demonstrate an enrichment of IRG1 at mitochondria with a lack of co-localization with the inner membrane/matrix marker Cox IV.
  
  Given the direct visualization of sub-mitochondrial localization, we consider applying super-resolution microscopy to revisit the sub-mitochondrial localization of different IRG1 constructs in the study.
  
  Reference:
  
  Lian H, Park D, Chen M, Schueder F, Lara-Tejero M, Liu J, Galán JE. Parkinson's disease kinase LRRK2 coordinates a cell-intrinsic itaconate-dependent defence pathway against intracellular Salmonella. Nat Microbiol. 2023 Oct;8(10):1880-1895. doi: 10.1038/s41564-023-01459-y. Epub 2023 Aug 28. PMID: 37640963; PMCID: PMC10962312.
  
  Finally, if itaconate is indeed synthesized outside the mitochondrial matrix to safeguard metabolic activity, it is not discussed how this reconciles with its reported inhibitory effect on SDH.
  
  We thank the reviewer for raising this excellent point. Indeed, itaconate has been proposed to inhibit matrix SDH, thereby exerting an anti-inflammatory function (Lampropoulou, Cell Metab 2016). While the mitochondrial transport of itaconate has not been fully characterized in vivo or in cells, a specific itaconate transport activity has been shown for the mitochondrial 2-oxoglutarate transporter OGC using an in vitro proteoliposome system (Mills et al. Nature 2018).
  
  We plan to discuss this important point on mitochondrial itaconate transport in the revision.
  
  Reference:
  
  Lampropoulou V, Sergushichev A, Bambouskova M, Nair S, Vincent EE, Loginicheva E, Cervantes-Barragan L, Ma X, Huang SC, Griss T, Weinheimer CJ, Khader S, Randolph GJ, Pearce EJ, Jones RG, Diwan A, Diamond MS, Artyomov MN. Itaconate Links Inhibition of Succinate Dehydrogenase with Macrophage Metabolic Remodeling and Regulation of Inflammation. Cell Metab. 2016 Jul 12;24(1):158-66. doi: 10.1016/j.cmet.2016.06.004. Epub 2016 Jun 30. PMID: 27374498; PMCID: PMC5108454.
  
  Mills EL, Ryan DG, Prag HA, Dikovskaya D, Menon D, Zaslona Z, Jedrychowski MP, Costa ASH, Higgins M, Hams E, Szpyt J, Runtsch MC, King MS, McGouran JF, Fischer R, Kessler BM, McGettrick AF, Hughes MM, Carroll RG, Booty LM, Knatko EV, Meakin PJ, Ashford MLJ, Modis LK, Brunori G, Sévin DC, Fallon PG, Caldwell ST, Kunji ERS, Chouchani ET, Frezza C, Dinkova-Kostova AT, Hartley RC, Murphy MP, O'Neill LA. Itaconate is an anti-inflammatory metabolite that activates Nrf2 via alkylation of KEAP1. Nature. 2018 Apr 5;556(7699):113-117. doi: 10.1038/nature25986. Epub 2018 Mar 28. PMID: 29590092; PMCID: PMC6047741.
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors are trying to explain how the metabolite itaconate evolved, since although it's involved in host defense, it can also limit mitochondrial function. They are trying to probe the trade-off between these two functions.
  
  Strengths:
  
  The evolutionary aspect is novel; this is the first time to my knowledge that the evolution of IRG1 has been analysed, and there are interesting findings here. The key finding appears to be that subcellular localisation is an important aspect, allowing host defense in some organisms without compromising bioenergetics. This is an interesting finding in the context of immunometabolism, although it needs extra analysis.
  
  Weaknesses:
  
  The work concerning sub-mitochondrial localisation is confusing and needs better analysis.
  
  We thank the reviewer for the constructive feedback. As in our response to reviewer 1, we want to draw attention to the published data in which the outer mitochondrial membrane localization of IRG1 has been demonstrated by protease protection assay and explored using super-resolution imaging by Lian et al. 2023 (Extended Data Fig. 4). Given the direct visualization of sub-mitochondrial localization by super-resolution imaging, we plan to revisit and to apply the method to different IRG1 constructs used in the paper.
  
  Reviewer #3 (Public review):
  
  Summary:
  
  IRG1 is highly expressed in activated human and mouse myeloid cells. It encodes the mitochondrial enzyme cis-aconitate decarboxylase 1 (ACOD1) that generates itaconate. Itaconate has anti-microbial activity and is immunoregulatory, interfering with cellular metabolism, signaling to cytokine production, and multiple other processes.
  
  The authors perform a phylogenetic analysis of IRG1 to obtain insight into the evolution of itaconate biosynthesis. Combining BLAST with human IRG1 and a MmgE/Ptrp domain search, they find CAD in all domains of life, but the presence of IRG1 homologs is patchy in eukaryotes, indicating that itaconate biosynthesis is not essential. The phylogenetic analysis showed a more distant relationship of fungal and metazoan CAD/IRG1 to many prokaryotic sequences, suggesting independent acquisition of these metazoan and fungal CAD genes. In metazoans, three subbranches of paleo-IRG1 (in mollusks/early chordates) and two paralogous vertebrate forms (IRG1 and IRG1-like) were identified, with the latter derived from paleo-IRG1, and by genome duplication. While most jawed vertebrates have both IRG1 and IRG1L, metatherian and eutherian mammals have lost IRG1L and contain only IRG1.
  
  Interestingly, sequence analysis of both paralogues showed that many IRG1L genes contain an N-terminal mitochondrial targeting sequence (MTS) that is absent from most IRG1 sequences. Limited proteolysis of submitochondrial localization confirmed that zebrafish IRG1L is only sensitive to proteases in the presence of high Triton X-100, indicative of association with mitochondrial matrix. In contrast, a recent paper from the Galan lab (Lian 2023 Nature Microbiology) reported that human IRG1 is not localized to the mitochondrial matrix, although enriched in mitochondria. Here, the authors generated a matrix-targeted human IRG1 by adding the N-terminal MTS and found that it localizes to the matrix based on a limited proteolysis assay. The loss of MTS-containing IRG1L from most mammals appears, therefore, to indicate that itaconate generation is directed to the cytoplasm, potentially reducing inhibition of TCA cycle activity in the mitochondria.
  
  Next, the authors confirmed that the recombinant IRG1L protein has CAD activity in vitro. The last part of the manuscript addresses the expression of paleo-IRG1 in oysters and amphioxus, where they found high mRNA levels in oyster hemocytes which was further increased by poly(I:C), which was also the case in amphioxus tissues after feeding of LPS or poly(I:C), indicating a role for paleo-IRG1/itaconate in early metazoan innate immunity.
  
  Strengths
  
  (1) Phylogenetic perspective largely lacking so far in the IRG1/itaconate field.
  
  (2) Manuscript clearly written and understandable across disciplines.
  
  (3) Phylogenetic analyses complemented by biochemical and gene expression analyses to link to function.
  
  (4) Lack of MTS in IRG1 and change in localization from mitochondria, highly relevant antimicrobial and cellular effects of itaconate.
  
  We thank the reviewer for the positive comments on the strengths.
  
  Weaknesses:
  
  (1) Biochemical and functional analysis of different CAD mRNA and proteins lacks depth.
  
  We plan to explore two types of experiments:
  
  First, we plan to purify different CAD recombinant proteins; if successful, we will test their in vitro enzymatic activity in synthesizing itaconate. Positive data will also answer question (3) below.
  
  Second, we plan to measure itaconate levels in oyster hemocytes after PAMP stimulation, to demonstrate an in vivo itaconate production activity by paleo-IRG1. The data will also address question (4) below.
  
  (2) The submitochondrial localization assay lacks a native human IRG1 control.
  
  As in our response to reviewer 1, we believe Lian et al. 2023 provided strong evidence supporting an outer mitochondrial membrane localization of wild-type, endogenous mouse IRG1. Given the direct visualization using super-resolution imaging, we plan to revisit submitochondrial localization of different IRG1 constructs using super-resolution imaging.
  
  (3) CAD activity shown for IRG1L but not paleo-IRG1.
  
  We plan to purify different CAD recombinant proteins; if successful, we will test their in vitro enzymatic activity in producing itaconate.
  
  (4) Itaconate production by early metazoans after PAMP stimulation?
  
  We plan to measure itaconate level in oyster hemocytes after PAMP stimulation, to demonstrate an in vivo itaconate production activity by paleo-IRG1.
  
  (5) No measurement of energy metabolism (trade-offs?).
  
  Because PAMP signaling might trigger other downstream effects that also impair mitochondrial function, for instance nitric oxide that inhibits complex IV, we plan to avoid PAMP conditions and directly test the effect of itaconate production. We plan to compare the impact on mitochondrial bioenergetics when the same CAD enzyme (thus with the same activity) is expressed at the same level intra-mitochondrially and extra-mitochondrially, for instance in the case of MTS-hACOD1 and hACOD1.
  
  AuthorResponse
URL

biorxiv.org/content/10.1101/2022.06.17.496652v3
www.biorxiv.org www.biorxiv.org

When word order matters: human brains represent sentence meaning differently from large language models

5
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This work provides a valuable comparison of sentence structure representations in the human brain and state-of-the-art Large Language Models (LLMs). Based on solid analysis of 7T fMRI data, it systematically identifies sentences in which LLMs underperform relative to models that explicitly code for syntactic structure. The study will be of significant interest to both cognitive neuroscientists and artificial intelligence researchers.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  This paper investigates whether transformer-based models can represent sentence-level semantics in a human-like way. The authors designed a set of 108 sentences specifically to dissociate lexical semantics from sentence-level information and collected 7T fMRI data from 30 participants reading these sentences. They conducted representational similarity analysis (RSA) comparing brain data and model representations, as well as the human behavioral ratings. It is found that transformer-based models match brain representation better than a static word embedding baseline, which ignores word order, but fall short of models that encode the structural relations between words. The main contributions of this paper are:
  
  (1) The construction of a sentence set that disentangles sentence structure from word meaning.
  
  (2) A comprehensive comparison of neural sentence representations (via fMRI), human behavior, and multiple computational models at the sentence level.
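  For readers unfamiliar with RSA, the comparison summarised above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the array sizes other than the 108 sentences, the correlation-distance RDM, the Spearman comparison, and all function names are assumptions chosen for the example.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows (items)."""
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Return the off-diagonal upper triangle of a square matrix as a vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def spearman(x, y):
    """Spearman correlation computed as Pearson r of ranks (ties are not averaged)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

def rsa_score(patterns_a, patterns_b):
    """Correlate the pairwise dissimilarity structure of two representations."""
    return spearman(upper_triangle(rdm(patterns_a)), upper_triangle(rdm(patterns_b)))

# Synthetic stand-ins: 108 sentences x features (voxels or embedding dimensions).
rng = np.random.default_rng(0)
n_sentences = 108
brain = rng.normal(size=(n_sentences, 500))
model = rng.normal(size=(n_sentences, 300))

score_self = rsa_score(brain, brain)    # identical representations
score_random = rsa_score(brain, model)  # unrelated random representations
print(score_self, score_random)
```

  Because only the pairwise dissimilarity structure is compared, the two representations do not need to share dimensionality, which is what allows voxel patterns, behavioral ratings, and model embeddings to be placed on a common footing.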
  
  Strengths:
  
  (1) The paper evaluates a wide variety of models, including layer-wise analysis for transformers and region-wise analysis in the human brain.
  
  (2) The stimulus design allows precise dissociation between lexical and sentence-level semantics. The RSA-based approach is empirically sound and intuitive.
  
  (3) The constructed sentences, along with the fMRI and behavioral data, represent a valuable resource for studying sentence representation.
  
  Weaknesses:
  
  (1) The rationale behind averaging sentence embeddings across multiple transformer models (with different architectures and training objectives) is unclear. These transformer-based models have different training paradigms and model architectures, which may result in misaligned semantic spaces. The averaging operation may dilute the distinct sentence representations learned by each model, potentially weakening the overall semantic encoding for sentences. Please clarify this choice or cite supporting methodology.
  
  (2) All structure-sensitive models discussed incorporate semantics to some extent. Including a purely syntactic baseline, such as a model based on context-free grammar, would help confirm the importance of syntactic structures.
  
  (3) In Figure 2, human behavioral judgments show weak correlations with neural data, and even fall below those of computational models, suggesting the behavioral judgments may not reflect the sentence structures in a brain-like way. This discrepancy between behavioral and neural data should be clarified, as it affects the interpretation of the results.
  
  (4) To better contextualize model and neural performance, sentence similarity should be anchored to a notion of semantic "ground truth", such as the matrix shown in Figure 1a. Comparing this reference with human judgments, brain responses, and model similarities would help establish an upper bound.
  
  (5) The structure of this paper is confusing. For instance, Figure 5 is cited early but appears much later. Reordering sections and figures would enhance readability.
  
  (6) While the analysis is broad and comprehensive, it lacks depth in some respects. For instance, it remains unclear what specific insights are gained from comparing across brain regions (e.g., whole brain, language network, and other subregions). Similarly, the results of simple-average and group-average RSA appear quite similar and may not advance the interpretation.
  
  (7) While explaining the grid-like pattern due to sentence length is important, this part feels somewhat disconnected from the central question of this paper (word order). It might be better placed in supplementary material.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The paper used fMRI data collected while participants read a set of sentences. The sentences are designed to disentangle syntax from meaning. RSA was performed using voxel activations and a variety of language models. The results show that transformers are inferior to models with explicit syntactic representation in terms of matching brain representations.
  
  Strengths:
  
  (1) The study controls for some variables that allow for an investigation of sentence structure in the brain. This controlled setting has an advantage over naturalistic stimuli in targeting more specific linguistic phenomena.
  
  (2) The study combines fMRI data with behavioral similarity ratings and a variety of language models (static, transformers, graph-based models).
  
  Weaknesses:
  
  (1) The stimuli are not fully controlled for lexical content across conditions. Residual lexical differences between sentences could still influence both brain and model similarity patterns. To more cleanly isolate syntactic effects, it would be useful to systematically vary only a single structural element while keeping all other lexical content constant (e.g., the boy kicked the ball / the ball kicked the boy). It would be better to engage more with the minimal pair paradigm, which is widely used in large language model probing research.
  
  (2) The comparisons are done across fundamentally different model types, including static embeddings, graph-based parsers, and transformers. The inherent differences in dimensionality and training objectives might make the conclusion drawn from RSA inconclusive. Transformer embeddings typically occupy much higher-dimensional, anisotropic representational spaces, and their similarity structure may reflect richer, more heterogeneous information than models explicitly encoding semantic roles. A lower RSA correlation in this study does not necessarily imply that transformers fail to encode syntactic information; rather, they may represent additional aspects of meaning or context that diverge from the narrow structural contrasts probed here.
  
  (3) The interpretation of the RSA correlation largely depends on the understanding of models. The authors suggest that because hybrid models correlate better than transformers, this implies that transformers are inferior at representing syntax. However, this is not a direct test of syntactic ability. Transformers may encode syntactic information, but it may not be expressed in a way that aligns with the RSA paradigm or the chosen stimuli. RSA does not reveal what the model encodes, and the models might achieve a good correlation for non-syntactic reasons (e.g., length of sentence, orthographic similarity, lexical features).
  
  Review 2
4. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  Large Language Models have revolutionized Artificial Intelligence and can now match or surpass human language abilities on many tasks. This has fueled interest in cognitive neuroscience in exposing representational similarities between Language Models and brain recordings of language comprehension. The current study breaks from this mold by: (1) Systematically identifying sentence structures for which brain and Large Language Model representations diverge. (2) Demonstrating that brain representations for these sentences can be better accounted for by a model structured by the semantic roles of words in the sentence. As such, the study may now fuel interest in characterizing how Large Language Models and brain representations differ, which may prompt new, more brain-like language models.
  
  Strengths:
  
  (1) This study presents a bold and solid challenge to a literature trend that has touted similarities between Transformer models and human cognition based on representational correlations with brain activity. This challenge is substantiated by identifying sentences for which brain and model representations of sentences diverge and explaining those divergences using models structured by semantic roles/syntax.
  
  (2) This study conducts a rigorous pre-registered analysis of a comprehensive selection of the state-of-the-art Large Language Models, on a controlled sentence comprehension fMRI dataset. The analysis is conducted within a Representation Similarity framework to support similarity comparisons between graph structures and brain activity without needing to vectorize graphs. Transformer models are predicted and shown to diverge from brain representations on subsets of sentences with similar word-level content but different sentence structures.
  
  (3) The study introduces a 7T fMRI sentence comprehension dataset and accompanying human sentence similarity ratings, which may be a fruitful resource for developing more human-like language models. Unlike other model-based sentence datasets, the relation between grammatical structure and word-level content is controlled, and subsets of sentences for which models and brains diverge are identified.
  
  Weaknesses:
  
  (1) The interpretation of findings is nuanced. Although Transformers underperform as brain models on the critical subsets of controlled sentences, a Transformer outperforms all other models when evaluated on the union of all sentences when both word-level content and structure vary. Transformers also yield equivalent or better models of human behavioral data. Thus, although Transformers have demonstrable flaws as human models, which are pinpointed here, in the general case, (some) Transformers are more human-like than the other models considered.
  
  (2) There may be confounds between the critical sentence structure manipulations and visual representations of sentence stimuli. This is inconvenient because activation in brain regions that process semantics tends to partially correlate with visual cortex representations, and computational models tend to reflect the number of words/tokens/elements in sentences. Although the study commendably controls for confounds associated with sentence length, residual effects could remain. For instance, the Graph model correlates most strongly with the visual cortex despite these sentence length controls.
  
  (3) Sentence similarity computations are emphasized as the basis for unifying comparative analyses of graph structures and vector data. A strength of this approach is that correlation is not always the ideal similarity metric. However, a weakness is that similarity computations are not unified across models. This has practical consequences here because different similarity metrics applied to the same model produce positive or negative correlations with brain data.
  
  Review 3
5. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Author response:
  
  We thank the reviewers for their insightful comments on our manuscript. Here we briefly highlight our responses to several issues raised by reviewers, and also provide a summary of planned changes to be made with the next draft.
  
  Reviewer 1:
  
  (1) The reviewer questions the rationale for averaging sentence embeddings across different models. However, our method involves computing correlations separately for each model, then averaging the correlations. We also report model correlations for each model separately in Fig S2. We will clarify this in our revised manuscript.
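
  The distinction in point (1), averaging correlations rather than averaging embeddings, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the helper names (`rdm`, `upper`, `spearman`, `mean_model_brain_correlation`), the correlation-distance dissimilarity, and the use of Spearman correlation are all assumptions made for illustration.

```python
import numpy as np

def rdm(embeddings):
    """Dissimilarity matrix (1 - Pearson r) between sentence vectors (one row per sentence)."""
    return 1.0 - np.corrcoef(embeddings)

def upper(mat):
    """Vectorise the strict upper triangle of a square matrix."""
    i, j = np.triu_indices(mat.shape[0], k=1)
    return mat[i, j]

def rank(x):
    """Ordinal ranks; adequate for continuous data without ties."""
    return np.argsort(np.argsort(x)).astype(float)

def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return np.corrcoef(rank(a), rank(b))[0, 1]

def mean_model_brain_correlation(model_embeddings, brain_rdm):
    """Correlate each model's RDM with the brain RDM, then average the correlations.

    The embeddings themselves are never averaged, so models may have
    different dimensionalities and differently oriented semantic spaces.
    """
    brain_vec = upper(brain_rdm)
    per_model = [spearman(upper(rdm(e)), brain_vec) for e in model_embeddings]
    return float(np.mean(per_model)), per_model
```

  Because each model is compared with the brain data in its own similarity space, the misalignment between embedding spaces raised by the reviewer does not enter the average; only the resulting correlation coefficients are combined.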
  
  (2) We agree with the reviewer that including a context-free grammar model as a comparison would be informative. We will incorporate this in the revised manuscript.
  
  (3) The reviewer raises questions about the low correlation between behavioural and brain similarities. While the behavioural judgements are made by different participants and involve a different task than the neuroimaging results, nonetheless we agree the difference is surprising and warrants more detailed consideration. We will provide additional discussion of the relationship between behavioural judgements and brain data in the revised manuscript.
  
  (4) The reviewer suggests contrasting our models with a ‘semantic ground truth’, as in our design matrix shown in Fig 1. While our design matrix served as the basis for constructing a set of stimuli with systematic modifications, we respectfully suggest that it should not be regarded as a ‘semantic ground truth’. In particular, sentence pairs within each category will not have the same degrees of semantic similarity since the words and context differ across sentences in a graded manner. Furthermore, while we anticipated ‘different’ sentence pairs would be less similar than ‘swapped’ sentence pairs, and that within each of the six block diagonals the ‘modified’ or ‘substituted’ sentence pairs would be the most similar, we did not have any prediction about the magnitude of these differences. Our goal was to construct a set of sentence pairs which spanned a range of semantic similarities, and allowed for dissociation between lexical similarity and overall similarity in meaning. The design matrix is not intended to represent a ‘ground truth’ that human judgements or brain representations would be expected to conform with.
  
  (5) In the revised draft we will modify the location of Fig. 5 so that it flows better with the text.
  
  (6) We agree that the discussion of the differences between brain regions could be expanded. We will include this in the revised version of our manuscript. The reviewer questions our inclusion of the simple-average and group-average RSA analysis as they show similar results. We included both analyses in line with our preregistration, and also because we believe the fact that two distinct approaches to analyzing the data yield similar results strengthens our conclusions.
  
  (7) We believe that the grid-like pattern in the RSA results is an important unexpected finding that warrants discussion in the main manuscript.
  
  Reviewer 2:
  
  (1) The reviewer argues that our stimuli do not fully control for lexical content across conditions, and that a more appropriate paradigm may be to utilise minimal pairs in which only a single variable of interest (such as sentence structure) is modified. We agree that most of our sentence pairs do not constitute minimal pairs; however, this was not our objective. Our study design aimed to synthesise traditional minimal pair approaches with more recent research paradigms using naturalistic stimuli. As such, we selected stimuli which are more complex and contain more variable features than traditional minimal pair studies, but which also are tailored to highlight differences which are of particular theoretical interest. Because we are interested in comparing the effects of multiple sentence elements and semantic roles, a systematic pairwise comparison of minimal pairs is not necessarily optimal. Instead, we designed our stimuli to leverage the advantage of fMRI in that we can measure the brain representations corresponding to each sentence, and hence can conduct a full series of pairwise comparisons of sentence representations. Most of these comparisons will not be between minimal pairs, but we selected sentences so as to provide a range of semantic similarities (low to high), while also providing for semantic contrasts of theoretical interest (such as the ‘swapped’ and ‘substituted’ sentence pairs). We do not claim this approach to be universally superior to a minimal pair approach, but we do believe our novel approach provides additional insights and a new perspective on semantic representation relative to minimal pair studies. In the revised manuscript, we will explain in more detail how stimuli were chosen and contrast this with minimal pair approaches.
  
  (2) The reviewer notes that low RSA correlations do not imply that transformers fail to encode syntactic information. We acknowledge this in our discussion (page 10), where we also highlight that our focus is not on whether transformers encode such information, but rather what transformer representations can tell us about how sentence structure is represented in the brain. Our results indicate that transformer embeddings do not have the same geometric properties as brain representations of sentence meaning, at least for certain types of sentences where lexical information is insufficient to determine overall meaning. The reviewer also notes that transformer embeddings are highly anisotropic; however, we adjust for this by normalising each feature, as discussed on page 14. Finally, the reviewer notes that the transformers we examine differ in architecture and training objectives. This is not critical for our study because we are not seeking to determine which architecture or training objectives are best. Our goal is simply to compare a range of approaches and see which, if any, have similar sentence representations to those formed by the brain. In fact, our results indicate that architecture and training regime make relatively little difference for our stimuli.
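
  To illustrate why per-feature normalisation matters for anisotropic embeddings, the following Python sketch z-scores each feature across sentences before computing cosine similarities. The exact normalisation described on page 14 of the manuscript is not reproduced here; z-scoring, and the helper names `zscore_features` and `cosine_similarity_matrix`, are assumptions chosen for illustration.

```python
import numpy as np

def zscore_features(embeddings):
    """Centre and scale each feature (column) across sentences (rows)."""
    mu = embeddings.mean(axis=0, keepdims=True)
    sd = embeddings.std(axis=0, keepdims=True)
    sd[sd == 0] = 1.0  # constant features become zero instead of dividing by zero
    return (embeddings - mu) / sd

def cosine_similarity_matrix(embeddings):
    """Cosine similarity between every pair of row vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T

# Toy anisotropic embeddings: one feature carries a large shared offset,
# so every raw cosine similarity is close to 1 regardless of content.
rng = np.random.default_rng(1)
raw = rng.normal(size=(5, 16))
raw[:, 0] += 50.0

raw_sim = cosine_similarity_matrix(raw)
norm_sim = cosine_similarity_matrix(zscore_features(raw))
```

  In this toy case the raw similarities are dominated by the single offset feature, whereas the normalised similarities reflect variation across all features. Whether this fully addresses the reviewer's concern about anisotropy in real transformer embeddings is a separate empirical question.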
  
  (3) The reviewer argues that RSA correlations do not measure the extent to which a model encodes syntactic information. This is very similar to the previous point. We do not claim that our results show that transformers do not encode syntactic information. Rather, our claim is that sentence embeddings derived from transformers have different geometric properties to brain representations, and that brain representations are better described by models explicitly representing key semantic roles. From this we conclude that, at least for the sentences we present, the brain is highly sensitive to semantic roles in a way that transformer representations are not (at least to the same extent). We also respectfully disagree with the reviewer’s suggestions that sentence length and orthographic or lexical similarities may drive model correlations with brain activity. As we discuss on page 19, we explicitly control for differences in sentence length when computing correlations. Our process for constructing our sentence set also controls for lexical similarity by generating pairs of sentences with all or mostly the same words but different orderings. We did not explicitly address orthographic similarity, but this will be strongly correlated with lexical similarity.
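
  One common way to control for a nuisance variable such as sentence length when correlating similarity vectors is a partial correlation. The Python sketch below residualises both vectors on a length covariate and correlates the residuals. This is an assumed stand-in for the procedure described on page 19 of the manuscript, which may differ; the names `residualise` and `partial_correlation` and the pairwise length-difference covariate are hypothetical.

```python
import numpy as np

def residualise(y, covariate):
    """Remove the least-squares linear effect of the covariate (plus intercept) from y."""
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def partial_correlation(x, y, covariate):
    """Pearson correlation between x and y after regressing the covariate out of both."""
    return np.corrcoef(residualise(x, covariate), residualise(y, covariate))[0, 1]

# Toy example: model and brain dissimilarities that share only a length effect.
rng = np.random.default_rng(2)
length_diff = rng.integers(0, 10, size=500).astype(float)  # hypothetical per-pair covariate
model_dis = 2.0 * length_diff + rng.normal(size=500)
brain_dis = 3.0 * length_diff + rng.normal(size=500)

raw_r = np.corrcoef(model_dis, brain_dis)[0, 1]
partial_r = partial_correlation(model_dis, brain_dis, length_diff)
```

  Here the raw correlation is high purely because both vectors track length, while the partial correlation is near zero. A linear adjustment of this kind removes only linear length effects, which is one reason residual confounds remain possible.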
  
  Reviewer 3:
  
  (1) The reviewer emphasises the need for nuance in our conclusions, given that some of the transformers achieve higher correlations when assessed over the full set of sentences. We agree with this comment, and will modify the discussion section in the revised manuscript to address this point. Having said that, we would like to note one of the disadvantages of transformers as a model of mind or brain representations is that they are largely a ‘black box’ whose workings are poorly understood. One advantage of hybrid models like our simple semantic role model is that they can be much easier to interpret, thereby enabling them to be used to determine which features are most important for brain representations of sentence meaning, and what mechanisms are used to combine individual words into a full sentence. Given their relative simplicity and interpretability, we believe hybrid models have considerable value as scientific tools, even in cases where they achieve comparable correlations to transformers. We will highlight this issue more clearly in our revised manuscript.
  
  (2) The reviewer notes that despite our existing controls, residual confounds of sentence length may remain. We agree that this is a potential issue, and will add discussion to the revised manuscript. We also will present further supplementary analyses which we believe indicate that sentence length effects do not drive our main results. At the same time, we believe the fact that our results are robust to simultaneously controlling for sentence length and the ‘minimum length effect’ (Fig. S5) indicates they are not primarily driven by sentence length effects.
  
  (3) The reviewer notes that the method for computing similarities differs between the vector-based (mean and transformer) models, and the hybrid and syntax-based models, thereby potentially adding an additional confound to our results. We agree that this is a potential limitation, and our correlations should always be understood as applying to a model paired with a similarity metric. However, we believe that this is mostly unavoidable when comparing different formalisms. An alternative approach of first embedding a graph into a vector and then training an encoding model on the graph embeddings has a similar limitation of being dependent not just on the graph representation, but also on the way it was embedded into a vector and the way the encoding model was trained. Arguably this process is more opaque than similarity methods, since it is unclear to what extent the graph embeddings preserve the logic and properties of a graph-based representation. Further, it is not clear whether there is any single method which can overcome the difficulty of comparing distinct formalisms for representing semantics. The reviewer also highlights how the correlations measured for the syntax model differ greatly depending on whether the Smatch or WWLK similarity metrics are used. We believe this highlights the need for careful examination of commonly used graph similarity metrics, as has been noted in previous research. We will include additional discussion of this issue in our revised manuscript.
  
  AuthorResponse
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.07.19.665701v1
www.biorxiv.org

TrueProbes: Quantitative Single-Molecule RNA-FISH Probe Design Improves RNA Detection

5
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This useful study introduces a computational pipeline for designing RNA in situ fluorescence hybridization probes that could improve the sensitivity and specificity of RNA detection in cells. While the approach is novel and the preliminary data suggestive, the evidence supporting a clear advantage over existing probe design strategies is incomplete. The work will be of interest to researchers developing or using molecular tools for imaging RNA in cells.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  The authors describe a new computational pipeline designed to identify smFISH probes with improved RNA detection compared to preexisting approaches. smFISH is a powerful and relatively straightforward technique to detect single RNAs in cells at subcellular resolution, which is critical for understanding gene expression regulation at the RNA level. However, existing methods for designing smFISH oligos suffer from several limitations, including off-target binding that produces high background signals, as well as a restricted number of probes that are sufficiently specific to target shorter-than-average mRNAs. To address these challenges, the authors developed TrueProbes, a computational method that aims to minimize off-target-mediated background fluorescence.
  
  Overall, the study addresses a technically relevant problem. If improved, this would allow researchers to study gene expression regulation more effectively using single-molecule FISH. However, based on the current presentation of data, it is not yet clear that TrueProbes offers significant advantages over preexisting pipelines. In the following section, I describe some concerns, which should be adequately addressed.
  
  Major Comments:
  
  (1) The manuscript currently presents only one example in which different pipelines were tested to generate probes (targeting ARF4). While the images suggest that both TrueProbes and Stellaris outperform the other pipelines, the comparison is potentially misleading because the number of probes used differs substantially. I recommend that the authors include at least three independent examples in which an equal number of probes are designed across pipelines, so that signal-to-noise can be assessed in a controlled and comparable way. This would allow the probe number to be held constant while directly evaluating performance.
  
  (2) It is also unclear how many biological replicates were performed for the ARF4 experiments. If only a single replicate was included, it is difficult to conclude that TrueProbes consistently outperforms other pipelines in a robust and reproducible manner. I suggest the authors include data from at least three biological replicates with appropriate statistical analysis, and ideally extend this to additional smFISH targets as outlined in Comment 1.
  
  (3) No controls are presented to demonstrate that the TrueProbes-designed smFISH spots are specifically detecting ARF4. The current experiment primarily measures signal-to-noise, but it remains possible that some detected spots do not correspond to ARF4 mRNAs. Since one of the major criteria used by TrueProbes is to limit cross-hybridization, the authors should perform ARF4 knockdown experiments and demonstrate that nearly all ARF4 smFISH signal is lost. A similar approach should be applied to the additional examples recommended in Comment 1.
  
  (4) In the limitations of the study, the authors note that "RNA secondary and tertiary structures are not included, which may lead to inaccuracies if binding sites are structurally occluded." However, I am not convinced that this is a true limitation, since formamide in the smFISH protocol should denature secondary structures and allow oligo access to the RNA. I recommend that the authors comment on this point and clarify whether secondary structure poses a practical limitation in smFISH probe design.
  
  (5) The authors also correctly acknowledge in their limitations that "RNA-protein interactions, which can modulate accessibility of the transcript, are not modeled." I suggest referencing relevant studies on this issue, particularly Buxbaum et al. (2014, Science), which would provide important context.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  Hughes et al. present a new single-molecule RNA fluorescence in situ hybridization (smFISH) probe design software, termed "TrueProbes" in this manuscript. They claim that all existing smFISH (and variants) probe design software packages have limitations that ultimately impact experimental performance. The authors claim to address the majority of these limitations in TrueProbes by introducing multiple computational steps to ensure high-quality probe design. The manuscript's goal is clear, and the authors provide some evidence by designing and targeting one gene. Overall, the manuscript lacks rigorous evidence to support the claims, does not demonstrate its suitability for a variety of smFISH-type experiments, and some of the provided quantification data are unclear. While TrueProbes clearly has potential, more data is required, or the authors should tone down the claims.
  
  Strengths:
  
  (1) The problem is well-articulated in the abstract and the introduction.
  
  (2) Figures 3 and 4 follow a consistent color scheme where each probe design method has its own color, which helps the reader visually compare methods.
  
  (3) The authors compared multiple probe design software packages both computationally and experimentally.
  
  (4) TrueProbes does produce visually and quantitatively better results when compared to 2 of the 4 existing smFISH probe design packages (Paintshop and MERFISH panel designer).
  
  (5) The authors introduce a comprehensive steady-state thermodynamic model to help optimally guide probe design.
  
  Weaknesses:
  
  (1) The abstract describes the problem well and introduces the solution (the TrueProbes software), but fails to provide specific ways in which the TrueProbes software performs better. The authors state that "...[TrueProbes] consistently outperformed alternatives across multiple computational metrics and experimental validation assays", but specific, quantitative evidence of improved performance would strengthen the statement.
  
  (2) The text claims that TrueProbes outperforms all other probe design software, but Figure 3 indicates that TrueProbes has neither the greatest number of on-target binding nor the lowest number of off-target binding. The data in Figure 3 does not support the claims made in the text. Specifically, the authors claim that "RNA FISH Experimental Results Demonstrate that Off Target and Binding Affinity Inclusive Probe Design Improve RNA FISH Signal Discrimination" (lines 217-218). However, despite their claim that Stellaris and Oligostan-HT produce more off-target probes when evaluated with the TrueProbes framework, the experiment results are nearly identical. The authors should consider modifying their claims or performing new experiments that more clearly demonstrate their claims.
  
  (3) The bar graphs in Figure 3 do not seem to agree with the probability graphs in Figure 4. For example, Figure 3 indicates that Stellaris probes have higher off-target binding than TrueProbes; however, in Figure 4, their probability graphs lie almost on top of each other.
  
  (4) The authors performed validation for only one gene (ARF4), because "...it had the highest gene expression (in TPM units) and the fewest isoforms among all candidate genes for the Jurkat cell line" (lines 176-177). While the results do look good, this is a minimal use case and does not really showcase the power of their method. One experiment that could be helpful would be two-color (or more) smFISH in tissue, where the chances for off-target binding contributing to higher errors are much greater than in an adherent cell line.
  
  (5) A common strategy for both smFISH and highly multiplexed methods is to use secondary DNA oligos with dye molecules instead of direct conjugation. Given that this is a primary design goal of PaintSHOP and the Zhuang lab's MERFISH probe design code, it would be helpful to demonstrate that TrueProbes can design a two-layer probe strategy for high-quality RNA-FISH labeling.
  
  (6) The authors claim, "For every probe set, TrueProbes can simulate expected smRNA FISH outcomes including optimal probe, RNA, and salt concentrations and optionally account for probe secondary structure, hybridization temperature, multiple targets, fluorophore choice, DNA, nascent RNA, and photon count statistics (Figures S2A, S2B). The model can be used to generate predictions for temperature and cell line sensitivity, multi-target discrimination, multiple fluorophore colocalization; when provided transcript expression levels and probe/background intensity, it can start to generate predictions for spot intensity, background, signal to noise ratio, and false negative rates (Figure S2C)." (lines 156-163). Figure S2 is a flow chart and does not provide evidence for any of these items. The authors should provide evidence for these claims, either as a figure or an example script in their software repository. If that is not possible, then it should be removed.
  
  (7) All thermodynamic equations are evaluated at steady state. The authors do not justify this assumption, and there is no discussion of the potential impacts of either low molecule numbers or violations of the well-mixed assumption. Can the authors please include a discussion of the potential impacts of non-steady-state dynamics?
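
  For context on what a steady-state assumption implies in point (7), the textbook two-state equilibrium for a single probe and a single binding site can be sketched in Python as follows. This is not the TrueProbes model: the function names, the 37 °C default temperature, and the assumption that probe is in large excess (so free probe is approximately total probe) are illustrative only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def association_constant(delta_g_kcal, temp_k=310.15):
    """Equilibrium association constant K = exp(-dG / RT), in 1/M for a 1 M standard state."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))

def fraction_bound(delta_g_kcal, probe_molar, temp_k=310.15):
    """Equilibrium fraction of sites occupied, assuming free probe ~ total probe."""
    k = association_constant(delta_g_kcal, temp_k)
    return k * probe_molar / (1.0 + k * probe_molar)

# Compare a strongly binding on-target site with a weaker off-target site
# at the same (hypothetical) 10 nM probe concentration.
on_target = fraction_bound(-14.0, 1e-8)
off_target = fraction_bound(-10.0, 1e-8)
```

  An expression of this kind describes only the long-time, well-mixed limit. It says nothing about how quickly equilibrium is approached or about fluctuations when only a few target molecules are present, which are the regimes the reviewer asks the authors to discuss.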
  
  Review 2
4. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  Summary:
  
  This manuscript introduces a new platform termed "TrueProbes" for designing mRNA FISH probes. In comparison to existing design strategies, the authors incorporate a comprehensive thermodynamic and kinetic model to account for probe states that may contribute to nonspecific background. The authors validate their design pipeline using Jurkat cells and provide evidence of improved probe performance.
  
  Strengths:
  
  A notable strength of TrueProbes is the consideration of genome-wide binding affinities, which aims to minimize off-target signals. The work will be of interest to researchers employing mRNA FISH in certain human cell lines.
  
  Weaknesses:
  
  However, in my view, the experimental validation is not sufficient to justify the broad claims of the platform. Given the number of assumptions in the model, additional experimental comparisons across probe design methods, ideally targeting transcripts with different expression levels, would be necessary to establish the general superiority of this approach.
  
  Review 3
5. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Author response:
  
  Reviewer #1 (Public Review):
  
  The authors describe a new computational pipeline designed to identify smFISH probes with improved RNA detection compared to preexisting approaches. smFISH is a powerful and relatively straightforward technique to detect single RNAs in cells at subcellular resolution, which is critical for understanding gene expression regulation at the RNA level. However, existing methods for designing smFISH oligos suffer from several limitations, including off-target binding that produces high background signals, as well as a restricted number of probes that are sufficiently specific to target shorter-than-average mRNAs. To address these challenges, the authors developed TrueProbes, a computational method that aims to minimize off-target-mediated background fluorescence.
  
  Overall, the study addresses a technically relevant problem. If improved, this would allow researchers to study gene expression regulation more effectively using single-molecule FISH. However, based on the current presentation of data, it is not yet clear that TrueProbes offers significant advantages over preexisting pipelines. In the following section, I describe some concerns, which should be adequately addressed.
  
  Major Comments:
  
  (1) The manuscript currently presents only one example in which different pipelines were tested to generate probes (targeting ARF4). While the images suggest that both TrueProbes and Stellaris outperform the other pipelines, the comparison is potentially misleading because the number of probes used differs substantially. I recommend that the authors include at least three independent examples in which an equal number of probes are designed across pipelines, so that signal-to-noise can be assessed in a controlled and comparable way. This would allow the probe number to be held constant while directly evaluating performance.
  
  This is an important observation. We have already addressed this issue in Figures 3E-G and Supplementary Figure 4E-G, where we plotted the number of OFF-targets for each ON-target probe. If we select longer genes to ensure an equal number of designed probes with strong signals, we will still end up with the same number of ON-target probes. Consequently, Figures 3B-D and 3E-G would show similar trends, albeit with different values on the y-axis. Additionally, we will conduct an analysis using Stellaris at its highest probe design stringency setting to compare the software under its strictest design conditions. Additional experiments are outside the scope of the current manuscript.
  
  (2) It is also unclear how many biological replicates were performed for the ARF4 experiments. If only a single replicate was included, it is difficult to conclude that TrueProbes consistently outperforms other pipelines in a robust and reproducible manner. I suggest the authors include data from at least three biological replicates with appropriate statistical analysis, and ideally extend this to additional smFISH targets as outlined in Comment 1.
  
  Three biological replicates were utilized for the ARF4 experiments. As stated in the original submission, the average data from all three replicates is presented in Figure 4, while the data for each individual replicate can be found in Figure S5. Statistical analyses were conducted for both the pooled data in Figure 4 and the individual data in Figure S5. The results of all statistical calculations are detailed in Supplemental Table 1. We will update the text to clearly indicate the number of biological replicates and the outcomes of the statistical analysis.
  
  (3) No controls are presented to demonstrate that the TrueProbes-designed smFISH spots are specifically detecting ARF4. The current experiment primarily measures signal-to-noise, but it remains possible that some detected spots do not correspond to ARF4 mRNAs. Since one of the major criteria used by TrueProbes is to limit cross-hybridization, the authors should perform ARF4 knockdown experiments and demonstrate that nearly all ARF4 smFISH signal is lost. A similar approach should be applied to the additional examples recommended in Comment 1.
  
  Thank you for your suggestion. Currently, we lack the expertise in our lab to conduct such experiments, so they are beyond the scope of this manuscript. However, we will create additional supplementary figures to demonstrate that the likelihood of false positives is low, based on the assumption that current publicly available BLAST algorithms, genome annotations, and reference transcription expression data are accurate.
  
  We will include a comparison in our supplementary materials showing the off-target RNA that can bind the highest number of probes simultaneously for each software. Additionally, we will perform a correlation analysis to illustrate the relationship between spot intensity for different software and the number of probes they design. This will help us estimate how the number of probes bound to RNA correlates with expected spot intensity ranges.
  
  Using this information, along with autofluorescence background intensity measurements from no-probe controls, we will estimate the minimum number of probes that need to bind to targets to be detected as single spots. If this minimum is higher than the maximum number of simultaneous off-target probe bindings, we anticipate that the detected spot signal will primarily reflect ARF4 rather than other transcripts.
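  
  The detection argument above can be sketched as follows. This is a minimal illustration, not the analysis the authors will run: it assumes that spot intensity grows linearly with the number of bound probes and that a spot is called when its signal exceeds the no-probe background by `z` standard deviations. The function names and every number are hypothetical.

```python
import math


def min_probes_for_detection(background_sd, per_probe_intensity, z=3.0):
    """Smallest number of bound probes whose summed signal clears z * background_sd.

    Assumes (hypothetically) that each bound probe adds the same intensity
    and that intensities add linearly.
    """
    if background_sd < 0 or per_probe_intensity <= 0:
        raise ValueError("background_sd must be >= 0 and per_probe_intensity > 0")
    return max(1, math.ceil(z * background_sd / per_probe_intensity))


def off_targets_below_detection(n_min, max_simultaneous_off_target):
    """True when no off-target RNA can bind enough probes to be called a spot."""
    return n_min > max_simultaneous_off_target


# Illustrative values only (arbitrary intensity units, not taken from the manuscript).
n_min = min_probes_for_detection(background_sd=50.0, per_probe_intensity=20.0)
print(n_min, off_targets_below_detection(n_min, max_simultaneous_off_target=5))
```

  Under these assumptions, detected spots would be expected to reflect the intended target whenever the second function returns `True`; a non-linear intensity relationship would change the threshold.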
  
  (4) In the limitations of the study, the authors note that "RNA secondary and tertiary structures are not included, which may lead to inaccuracies if binding sites are structurally occluded." However, I am not convinced that this is a true limitation, since formamide in the smFISH protocol should denature secondary structures and allow oligo access to the RNA. I recommend that the authors comment on this point and clarify whether secondary structure poses a practical limitation in smFISH probe design.
  
  Thank you for pointing this out. We will revise the manuscript to clarify: "We did not include RNA secondary and tertiary structures in the model because the use of formamide in RNA-FISH experiments denatures these structures, allowing oligonucleotides to access the RNA."
  
  (5) The authors also correctly acknowledge in their limitations that "RNA-protein interactions, which can modulate accessibility of the transcript, are not modeled." I suggest referencing relevant studies on this issue, particularly Buxbaum et al. (2014, Science), which would provide important context.
  
  Thank you for highlighting the literature that supports this limitation. We will include Buxbaum et al. (2014, Science) and additional studies that discuss how RNA-protein interactions can affect RNA-FISH experiments.
  
  Reviewer #2 (Public review):
  
  Summary:
  
  Hughes et al. present a new single-molecule RNA fluorescence in situ hybridization (smFISH) probe design software, termed "TrueProbes" in this manuscript. They claim that all existing smFISH (and variants) probe design software packages have limitations that ultimately impact experimental performance. The authors claim to address the majority of these limitations in TrueProbes by introducing multiple computational steps to ensure high-quality probe design. The manuscript's goal is clear, and the authors provide some evidence by designing and targeting one gene. Overall, the manuscript lacks rigorous evidence to support the claims, does not demonstrate its suitability for a variety of smFISH-type experiments, and some of the provided quantification data are unclear. While TrueProbes clearly has potential, more data is required, or the authors should tone down the claims.
  
  We appreciate the reviewer’s thoughtful feedback. We will revise the text to ensure that all claims are backed by computational or experimental evidence. Claims that lack supporting results will be moved to the discussion section as potential future extensions. Since our probe design software is open access, both we and the community can further develop the code as needed.
  
  Strengths:
  
  (1) The problem is well-articulated in the abstract and the introduction.
  
  (2) Figures 3 and 4 follow a consistent color scheme where each probe design method has its own color, which helps the reader visually compare methods.
  
  (3) The authors compared multiple probe design software packages both computationally and experimentally.
  
  (4) TrueProbes does produce visually and quantitatively better results when compared to 2 of the 4 existing smFISH probe design packages (PaintSHOP and MERFISH panel designer).
  
  (5) The authors introduce a comprehensive steady-state thermodynamic model to help optimally guide probe design.
  
  We would like to thank the reviewer for pointing out the strengths of the manuscript.
  
  Weaknesses:
  
  (1) The abstract describes the problem well and introduces the solution (the TrueProbes software), but fails to provide specific ways in which the TrueProbes software performs better. The authors state that "...[TrueProbes] consistently outperformed alternatives across multiple computational metrics and experimental validation assays", but specific, quantitative evidence of improved performance would strengthen the statement.
  
  Thank you for acknowledging the clarity of the abstract and introduction. We will revise the abstract to provide more specific details on how TrueProbes outperforms other software. Additionally, we will include specific computational and experimental metrics that demonstrate TrueProbes' improved performance compared to other software.
  
  (2) The text claims that TrueProbes outperforms all other probe design software, but Figure 3 indicates that TrueProbes has neither the greatest number of on-target binding nor the lowest number of off-target binding. The data in Figure 3 does not support the claims made in the text. Specifically, the authors claim that "RNA FISH Experimental Results Demonstrate that Off Target and Binding Affinity Inclusive Probe Design Improve RNA FISH Signal Discrimination" (lines 217-218). However, despite their claim that Stellaris and Oligostan-HT produce more off-target probes when evaluated with the TrueProbes framework, the experiment results are nearly identical. The authors should consider modifying their claims or performing new experiments that more clearly demonstrate their claims.
  
  In Figure 3, we aim to convey two main points.
  
  The first point is to compare the number of ON-target probes designed by each software using their most stringent design criteria (Figure 3A). Currently, we are using a medium strict design criterion for Stellaris (level 3). As shown in the new supplementary figure XX, when we apply the most stringent design criteria for Stellaris (level 5), the number of ON-target probes decreases to XX probes. This clearly indicates that, based on theoretical calculations, TrueProbes can design more probes than any of its competitors.
  
  The second point is to compare the number of OFF-targets produced by each probe design. To illustrate this, we used two different metrics. In Figures 3B-D, we compare the total number of probes bound to OFF-target RNA. However, since each software generates a different number of ON-target probes, the number of OFF-targets may vary simply due to the differences in ON-target probe counts. Therefore, we introduced a second metric to compare OFF-targets. In Figures 3E-G, we present the number of OFF-targets normalized by the number of ON-targets. Using this metric, TrueProbes shows the lowest number of OFF-targets. We will update the manuscript to clarify this point.
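  
  The difference between the two metrics can be illustrated with a short sketch. The counts are invented for illustration and are not values from Figure 3; the function name is hypothetical.

```python
def off_targets_per_on_target(n_off_target, n_on_target):
    """OFF-target count normalized by the number of ON-target probes in the design."""
    if n_on_target <= 0:
        raise ValueError("a design must contain at least one ON-target probe")
    return n_off_target / n_on_target


# Hypothetical designs: (ON-target probes, total OFF-target binding events).
designs = {"design_A": (48, 96), "design_B": (24, 60)}

for name, (n_on, n_off) in designs.items():
    print(name, n_off, off_targets_per_on_target(n_off, n_on))
```

  In this made-up case, design_B has fewer OFF-targets in total (60 versus 96) but more per ON-target probe (2.5 versus 2.0), which is why the two metrics can rank the same designs differently.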
  
  Regarding the experiments and their comparison to theoretical calculations: The theoretical calculations consider only the reference DNA and RNA genomes along with the oligonucleotide sequences for the probes. We then use a thermodynamic model to identify ON- and OFF-targets. Thus, these theoretical calculations represent an upper bound on the maximum possible number of ON-targets and the minimum number of OFF-targets. All other design software evaluated in this manuscript relies on the same or less reference data and makes certain assumptions. None of these methods quantitatively compare their computational designs with experimental results; they simply design probes based on unverified assumptions, conduct experiments, and present spot data to conclude that their probe designs are effective.
  
  We will update the manuscript to clarify the goals of the theoretical model and its relationship to the experiments. Future work will be necessary to enhance our theoretical model to fully account for additional aspects of RNA-FISH experiments (e.g., formaldehyde crosslinking, hybridization conditions, washing steps) to better predict the experimental data shown in Figure 4. We will also adjust our claims to accurately reflect the current capabilities of our theoretical framework and its relation to experimental outcomes.
  
  (3) The bar graphs in Figure 3 do not seem to agree with the probability graphs in Figure 4. For example, Figure 3 indicates that Stellaris probes have higher off-target binding than TrueProbes; however, in Figure 4, their probability graphs lie almost on top of each other.
  
  The predictions in Figure 3 regarding the number of probe off-target binding events, based on reference gene expression data, do not necessarily encompass all the information required to predict RNA-FISH signal intensity. Therefore, these predictions should not be expected to translate directly into the experimental results shown in Figure 4, particularly concerning the background signal.
  
  While our software aims to minimize off-target probe binding, this does not automatically lead to a reduction in off-target background signal. Numerous other factors influence the spot background and overall signal-to-noise ratio (SNR) performance, beyond just probe-target binding interactions. Although we strive to minimize off-target background through probe binding, this approach is not designed to directly predict the SNR. Extending the computational analysis of probe binding dynamics to RNA-FISH signal intensity dynamics is beyond the scope of this study.
  
  We have revised our text to clearly separate computational results from experimental results into two distinct sections. We will use different terminology to describe the outcomes of computational performance versus experimental performance, reducing potential confusion between these two aspects. Additionally, we will clarify our conceptual overview in Figure 1 regarding traditional probe design limitations related to sensitivity and specificity. We will specify how the signal from the number of probes bound to ON-target RNA, relative to those bound to OFF-targets and cellular autofluorescence, translates—either linearly or non-linearly—into the signal-to-noise ratio.
  
  (4) The authors performed validation for only one gene (ARF4), because "...it had the highest gene expression (in TPM units) and the fewest isoforms among all candidate genes for the Jurkat cell line" (lines 176-177). While the results do look good, this is a minimal use case and does not really showcase the power of their method. One experiment that could be helpful would be two-color (or more) smFISH in tissue, where the chances for off-target binding contributing to higher errors are much greater than in an adherent cell line.
  
  Thank you for highlighting these valuable experiments. Currently, our lab lacks the expertise to generate tissue samples beyond culturing cells. Additionally, implementing a two-color probe design in tissues containing different cell types with unknown expression levels presents further challenges. Due to these limitations, designing and conducting two-color experiments in tissue samples is beyond the scope of the current manuscript, but we plan to pursue this in the future.
  
  (5) A common strategy for both smFISH and highly multiplexed methods is to use secondary DNA oligos with dye molecules instead of direct conjugation. Given that this is a primary design goal of PaintSHOP and the Zhuang lab's MERFISH probe design code, it would be helpful to demonstrate that TrueProbes can design a two-layer probe strategy for high-quality RNA-FISH labeling.
  
  Thank you for bringing this to our attention. TrueProbes is currently designed and tested specifically for primary smRNA-FISH probes. Our focus is on demonstrating a new approach to designing these probes without the added complexities of secondary probes and multiplexing. Future work will expand on this foundation to incorporate secondary probe detection and transcript multiplexing.
  
  (6) The authors claim, "For every probe set, TrueProbes can simulate expected smRNA FISH outcomes including optimal probe, RNA, and salt concentrations and optionally account for probe secondary structure, hybridization temperature, multiple targets, fluorophore choice, DNA, nascent RNA, and photon count statistics (Figures S2A, S2B). The model can be used to generate predictions for temperature and cell line sensitivity, multi-target discrimination, multiple fluorophore colocalization; when provided transcript expression levels and probe/background intensity, it can start to generate predictions for spot intensity, background, signal to noise ratio, and false negative rates (Figure S2C)." (lines 156-163). Figure S2 is a flow chart and does not provide evidence for any of these items. The authors should provide evidence for these claims, either as a figure or an example script in their software repository. If that is not possible, then it should be removed.
  
  The supplemental information of the article will be updated to include figures that illustrate predictions for each capability currently offered by TrueProbes, along with the scripts used to generate these predictions. Any capabilities that do not have corresponding scripts will be removed from this section and instead referred to as potential improvements or future additions to the TrueProbes framework in the discussion section.
  
  (7) All thermodynamic equations are performed at steady state. The authors do not justify this assumption, and there is no discussion of the potential impacts of either low molecule numbers or violations of the well-mixed assumption. Can the authors please include a discussion on the potential impacts of non-steady-state dynamics?
  
  Thermodynamic equations are calculated at steady state because RNA-FISH hybridization reactions typically last from eight to twenty hours. This duration allows probes adequate time to localize to their targets and reach binding equilibrium, based on current estimates of DNA oligonucleotide association and dissociation rate constants. We will address the potential violation of the well-mixed assumption in the assumptions and limitations section, specifically discussing how RNA localization can affect the spatial distribution of both on-target and off-target probes within cells, which may disrupt the well-mixed condition.
  
  Low molecule numbers are not a significant concern, as probe DNA oligonucleotide concentrations in RNA-FISH protocols are much higher than the number of transcripts present in cells, by several orders of magnitude.
  
  The assumptions and limitations section will be revised to clearly state: “Probe hybridization reactions were computed at steady state because most RNA-FISH protocols utilize probe hybridization incubation steps lasting over eight hours, which should provide sufficient time to reach equilibrium based on current estimates of forward and reverse reaction rate constants. Predictions from the equilibrium model may be less accurate for RNA-FISH experiments with shorter hybridization times, where non-steady state dynamics can result in different transient outcomes depending on the duration of hybridization.”
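  
  The equilibration argument can be checked with a back-of-the-envelope sketch. It assumes simple reversible binding with probe in large excess over target (pseudo-first-order kinetics), for which the bound fraction approaches equilibrium as 1 - exp(-k_obs * t) with k_obs = k_on * [probe] + k_off. The rate constants and probe concentration below are illustrative placeholders, not values reported in the manuscript.

```python
import math


def fraction_of_equilibrium(k_on, probe_conc, k_off, t_seconds):
    """Fraction of the equilibrium bound level reached after t_seconds.

    Pseudo-first-order approximation: probe concentration is treated as
    constant because probe is assumed to be in large excess over target.
    k_on in 1/(M*s), probe_conc in M, k_off in 1/s.
    """
    k_obs = k_on * probe_conc + k_off  # observed relaxation rate, 1/s
    return 1.0 - math.exp(-k_obs * t_seconds)


# Illustrative parameters only (hypothetical, chosen for the example).
K_ON = 1e5      # 1/(M*s)
PROBE = 1e-9    # 1 nM per probe
K_OFF = 1e-5    # 1/s

for hours in (1, 8, 20):
    f = fraction_of_equilibrium(K_ON, PROBE, K_OFF, hours * 3600)
    print(f"{hours:>2} h: {f:.3f} of equilibrium")
```

  With these placeholder values, an eight-hour incubation reaches most of the equilibrium level while a one-hour incubation does not, which is the regime in which the steady-state approximation would be expected to degrade.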
  
  Reviewer #3 (Public review):
  
  Summary:
  
  This manuscript introduces a new platform termed "TrueProbes" for designing mRNA FISH probes. In comparison to existing design strategies, the authors incorporate a comprehensive thermodynamic and kinetic model to account for probe states that may contribute to nonspecific background. The authors validate their design pipeline using Jurkat cells and provide evidence of improved probe performance.
  
  Strengths:
  
  A notable strength of TrueProbes is the consideration of genome-wide binding affinities, which aims to minimize off-target signals. The work will be of interest to researchers employing mRNA FISH in certain human cell lines.
  
  Weaknesses:
  
  However, in my view, the experimental validation is not sufficient to justify the broad claims of the platform. Given the number of assumptions in the model, additional experimental comparisons across probe design methods, ideally targeting transcripts with different expression levels, would be necessary to establish the general superiority of this approach.
  
  We will revise our text to make our claims more specific and clearer, avoiding overgeneralizations and ensuring that all claims are adequately supported by the data we present.
  
  AuthorResponse
URL

biorxiv.org/content/10.1101/2025.08.14.670355v1
www.biorxiv.org www.biorxiv.org

Backward Conditioning Reveals Flexibility in Infralimbic Cortex Inhibitory Memories

5
1. Public_Reviews 09 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This set of experiments provides a valuable finding regarding the need for prior inhibitory training to recruit the infralimbic cortex in extinction learning. The multiple clever behavioral designs supply converging lines of evidence in a compelling manner, but several issues, such as the group sizes and appropriate analysis of data, render the overall strength of support incomplete. With these issues resolved, this manuscript will be of interest to behavioral neuroscientists, especially those interested in learning & memory and/or cortical function.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The manuscript reports a series of experiments designed to test whether optogenetic activation of infralimbic (IL) neurons facilitates extinction retrieval and whether this depends on animals' prior experience. In Experiment 1, rats underwent fear conditioning followed by either one or two extinction sessions, with IL stimulation given during the second extinction; stimulation facilitated extinction retrieval only in rats with prior extinction experience. Experiments 2 and 3 examined whether backward conditioning (CS presented after the US) could establish inhibitory properties that allowed IL stimulation to enhance extinction, and whether this effect was specific to the same stimulus or generalized to different stimuli. Experiments 5 - 7 extended this approach to appetitive learning: rats received backward or forward appetitive conditioning followed by extinction, and then fear conditioning, to determine whether IL stimulation could enhance extinction in contexts beyond aversive learning and across conditioning sequences. Across studies, the key claim is that IL activation facilitates extinction retrieval only when animals possess a prior inhibitory memory, and that this effect generalizes across aversive and appetitive paradigms.
 
 Strengths:
 
 (1) The design attempts to dissect the role of IL activity as a function of prior learning, which is conceptually valuable.
 
 (2) The experimental design, which used different inhibitory learning procedures to probe how IL activation facilitates extinction learning, was creative and innovative.
 
 Weaknesses:
 
 (1) Non-specific manipulation.
 
 ChR2 was expressed in IL without distinction between glutamatergic and GABAergic populations. Without knowing the relative contribution of these cell types or the percentage of neurons affected, the circuit-level interpretation of the results is unclear.
 
 (2) Extinction retrieval test conflates processes.
 
 The retrieval test included 8 tones. Averaging across this many tone presentations conflates extinction retrieval/expression (early tones) with further extinction learning (later tones). A more appropriate analysis would focus on the first 2-4 tones to capture retrieval only. As currently presented, the data do not isolate extinction retrieval.
 
 (3) Under-sampling and poor group matching.
 
 Sample sizes appear small, which may explain why groups are not well matched in several figures (e.g., 2b, 3b, 6b, 6c) and why there are several instances of unexpected interactions (protocol, virus, and period). This baseline mismatch raises concerns about the reliability of group differences.
 
 (4) Incomplete presentation of conditioning data.
 
 Figure 3 only shows a single conditioning session despite five days of training. Without the full dataset, it is difficult to evaluate learning dynamics or whether groups were equivalent before testing.
 
 (5) Interpretation stronger than evidence.
 
 The authors conclude that IL activation facilitates extinction retrieval only when an inhibitory memory has been formed. However, given the caveats above, the data are insufficient to support such a strong mechanistic claim. The results could reflect non-specific facilitation or disruption of behavior by broad prefrontal activation. Moreover, there is compelling evidence that optogenetic activation of IL during fear extinction does facilitate subsequent extinction retrieval without prior extinction training (Do-Monte et al 2015, Chen et al 2021), which the authors do not directly test in this study.
 
 Impact:
 
 The role of IL in extinction retrieval remains a central question in the fear learning literature. However, because the test used conflates extinction retrieval with new learning and the manipulations lack cell-type specificity, the evidence presented here does not convincingly support the main claims. The study highlights the need for more precise manipulations and more rigorous behavioral testing to resolve this issue.
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 In this manuscript, the authors examine the mechanisms by which stimulation of the infralimbic cortex (IL) facilitates the retention and retrieval of inhibitory memories. Previous work has shown that optogenetic stimulation of the IL suppresses freezing during extinction but does not improve extinction recall when extinction memory is probed one day later. When stimulation occurs during a second extinction session (following a prior stimulation-free extinction session), freezing is suppressed during the second extinction as well as during the tone test the following day. The current study was designed to further explore the facilitatory role of the IL in inhibitory learning and memory recall. The authors conducted a series of experiments to determine whether recruitment of IL extends to other forms of inhibitory learning (e.g., backward conditioning) and to inhibitory learning involving appetitive conditioning. Further, they assessed whether their effects could be explained by stimulus familiarity. The results of their experiments show that backward conditioning, another form of inhibitory learning, also enabled IL stimulation to enhance fear extinction. This phenomenon was not specific to aversive learning, as backward appetitive conditioning similarly allowed IL stimulation to facilitate extinction of aversive memories. Finally, the authors ruled out the possibility that IL facilitated extinction merely because of prior experience with the stimulus (e.g., reducing the novelty of the stimulus). These findings significantly advance our understanding of the contribution of IL to inhibitory learning. Namely, they show that the IL is recruited during various forms of inhibitory learning, and its involvement is independent of the motivational value associated with the unconditioned stimulus.
 
 Strengths:
 
 (1) Transparency about the inclusion of both sexes and the representation of data from both sexes in figures.
 
 (2) Very clear representation of groups and experimental design for each figure.
 
 (3) The authors were very rigorous in determining the neurobehavioral basis for the effects of IL stimulation on extinction. They considered multiple interpretations and designed experiments to address these possible accounts of their data.
 
 (4) The rationale for and the design of the experiments in this manuscript are clearly based on a wealth of knowledge about learning theory. The authors leveraged this expertise to narrow down how the IL encodes and retrieves inhibitory memories.
 
 Weaknesses:
 
 (1) In Experiment 1, although not statistically significant, it does appear as though the stimulation groups (OFF and ON) differ during Extinction 1. It seems like this may be due to a difference between these groups after the first forward conditioning. Could the authors have prevented this potential group difference in Extinction 1 by re-balancing group assignment after the first forward conditioning session to minimize the differences in fear acquisition (the authors do report a marginally significant effect between the groups that would undergo one vs. two extinction sessions in their freezing during the first conditioning session)?
 
 (2) Across all experiments (except for Experiment 1), the authors state that freezing during the initial conditioning increased across "days". The figures that correspond to this text, however, show that freezing changes across trials. In the methods, the authors report that backward conditioning occurred over 5 days. It would be helpful to understand how these data were analyzed and collated to create the final figures. Was the freezing averaged across the five days for each trial for analyses and figures?
 
 (3) In Experiment 3, the authors report a significant Protocol × Virus interaction. It would be useful if the authors could conduct post-hoc analyses to determine the source of this interaction. Inspection of Figure 4B suggests that freezing during the two different variants of backward conditioning differs between the virus groups. Did the authors expect to see a difference in backward conditioning depending on the stimulus used in the conditioning procedure (light vs. tone)? The authors don't really address this confounding interaction, but I do think a discussion is warranted.
 
 (4) In this same experiment, the authors state that freezing decreased during extinction; however, freezing in the Diff-EYFP group at the start of extinction (first bin of trials) doesn't look appreciably different than their freezing at the end of the session. Did this group actually extinguish their fear? Freezing on the tone test day also does not look too different from freezing during the last block of extinction trials.
 
 (5) The Discussion explored the outcomes of the experiments in detail, but it would be useful for the authors to discuss the implications of their findings for our understanding of circuits in which the IL is embedded that are involved in inhibitory learning and memory. It would also be useful for the authors to acknowledge in the Discussion that although they did not have the statistical power to detect sex differences, future work is needed to explore whether IL functions similarly in both sexes.
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 This is a really nice manuscript with different lines of evidence to show that the IL encodes inhibitory memories that can then be manipulated by optogenetic stimulation of these neurons during extinction. The behavioral designs are excellent, with converging evidence using extinction/re-extinction, backwards/forwards aversive conditioning, and backwards appetitive/forwards aversive conditioning. Additional factors, such as nonassociative effects of the CS or US, are also considered, and the authors evaluate the inhibitory properties of the CS with tests of conditioned inhibition.
 
 Strengths:
 
 The experimental designs are very rigorous with an unusual level of behavioral sophistication.
 
 Weaknesses:
 
 (1) More justification for parametric choices (number of days of backwards vs forwards conditioning) could be provided.
 
 (2) The current discussion could be condensed and could focus on broader implications for the literature.
 
 Review 3
5. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Author response:
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The manuscript reports a series of experiments designed to test whether optogenetic activation of infralimbic (IL) neurons facilitates extinction retrieval and whether this depends on animals' prior experience. In Experiment 1, rats underwent fear conditioning followed by either one or two extinction sessions, with IL stimulation given during the second extinction; stimulation facilitated extinction retrieval only in rats with prior extinction experience. Experiments 2 and 3 examined whether backward conditioning (CS presented after the US) could establish inhibitory properties that allowed IL stimulation to enhance extinction, and whether this effect was specific to the same stimulus or generalized to different stimuli. Experiments 5 - 7 extended this approach to appetitive learning: rats received backward or forward appetitive conditioning followed by extinction, and then fear conditioning, to determine whether IL stimulation could enhance extinction in contexts beyond aversive learning and across conditioning sequences. Across studies, the key claim is that IL activation facilitates extinction retrieval only when animals possess a prior inhibitory memory, and that this effect generalizes across aversive and appetitive paradigms.
 
 Strengths:
 
 (1) The design attempts to dissect the role of IL activity as a function of prior learning, which is conceptually valuable.
 
 We thank the Reviewer for their positive assessment.
 
 (2) The experimental design, which used different inhibitory learning procedures to probe how IL activation facilitates extinction learning, was creative and innovative.
 
 We thank the Reviewer for their positive assessment.
 
 Weaknesses:
 
 (1) Non-specific manipulation.
 
 ChR2 was expressed in IL without distinction between glutamatergic and GABAergic populations. Without knowing the relative contribution of these cell types or the percentage of neurons affected, the circuit-level interpretation of the results is unclear.
 
 ChR2 was intentionally expressed in the infralimbic cortex (IL) without distinction between local neuronal populations for two reasons. First, this manuscript aimed to uncover some of the features characterizing the encoding of inhibitory memories in the IL, and this encoding likely engages interactions among various neuronal populations within the IL. Second, the hypotheses tested in the manuscript derived from findings that indiscriminately stimulated the IL using the GABAA receptor antagonist picrotoxin, which is best mimicked by the approach taken. We agree that it is also important to determine the respective contributions of distinct IL neuronal populations to inhibitory encoding; however, the global approach implemented in the present experiments represents a necessary initial step. This rationale will be incorporated into the revised manuscript, which will also make reference to the need to identify the relative contributions of the various neuronal populations within the IL.
 
 (2) Extinction retrieval test conflates processes
 
 The retrieval test included 8 tones. Averaging across this many tone presentations conflates extinction retrieval/expression (early tones) with further extinction learning (later tones). A more appropriate analysis would focus on the first 2-4 tones to capture retrieval only. As currently presented, the data do not isolate extinction retrieval.
 
 It is unclear when retrieval of what has been learned across extinction ceases and additional extinction learning occurs. In fact, it is only the first stimulus presentation that unequivocally permits a distinction between retrieval and additional extinction learning, as the conditions for this additional learning have not been fulfilled at that presentation. However, confining evidence for retrieval to the first stimulus presentation introduces concerns that other factors could influence performance. For instance, processing of the stimulus present at the start of the session may differ from that present at the end of the previous session, thereby affecting what is retrieved. Such differences between the stimuli present at the start and end of an extinction session have been long recognized as a potential explanation for spontaneous recovery (Estes, 1955). More importantly, whether the test data presented confound retrieval and additional extinction learning or not, the interpretation remains the same with respect to the effects of a prior history of inhibitory learning on enabling the facilitative effects of IL stimulation. Finally, it is unclear how these facilitative effects could occur in the absence of the subjects retrieving the extinction memory formed under the stimulation. Nevertheless, the revised manuscript will provide the trial-by-trial performance during the post-extinction retrieval tests and discuss this issue.
 
 (3) Under-sampling and poor group matching.
 
 Sample sizes appear small, which may explain why groups are not well matched in several figures (e.g., 2b, 3b, 6b, 6c) and why there are several instances of unexpected interactions (protocol, virus, and period). This baseline mismatch raises concerns about the reliability of group differences.
 
 Efforts were made to match group performance upon completion of each training stage and before IL stimulation. Unfortunately, these efforts were not completely successful due to exclusions following post-mortem analyses. However, we acknowledge that the unexpected interactions deserve further discussion, and this will be incorporated into the revised manuscript (see also comment from Reviewer 2). Although we cannot exclude that sample sizes may have contributed to some of these interactions, we remain confident about the reliability of the main findings reported, especially given their replication across the various protocols. Overall, the manuscript provides evidence that IL stimulation does not facilitate brief extinction in the absence of prior inhibitory experience in five different experiments, replicating previous findings (Lingawi et al., 2018; Lingawi et al., 2017). It also replicates these previous findings by showing that prior experience with either fear or appetitive extinction enables IL stimulation to facilitate subsequent fear extinction. Furthermore, the facilitative effects of such stimulation following fear or appetitive backward conditioning are replicated in the present manuscript.
 
 (4) Incomplete presentation of conditioning data.
 
 Figure 3 only shows a single conditioning session despite five days of training. Without the full dataset, it is difficult to evaluate learning dynamics or whether groups were equivalent before testing.
 
 We apologize, as we incorrectly labeled the X axis for the backward conditioning data set in Figures 3B, 4B, 4D and 5B. It should have indicated “Days” instead of “Trials”. This error will be corrected in the revised manuscript.
 
 (5) Interpretation stronger than evidence.
 
 The authors conclude that IL activation facilitates extinction retrieval only when an inhibitory memory has been formed. However, given the caveats above, the data are insufficient to support such a strong mechanistic claim. The results could reflect non-specific facilitation or disruption of behavior by broad prefrontal activation. Moreover, there is compelling evidence that optogenetic activation of IL during fear extinction does facilitate subsequent extinction retrieval without prior extinction training (Do-Monte et al 2015, Chen et al 2021), which the authors do not directly test in this study.
 
 As noted above, the revised manuscript will show that the interpretations of the main findings stand whether or not the test data confound retrieval with additional extinction learning. The revised manuscript will also clarify the plotting of the data for the backward conditioning stages. We do agree that further discussion of the unexpected interactions is necessary, and this will also be incorporated into the revised manuscript. However, the various replications of the core findings provide strong evidence for their reliability and the interpretations advanced in the original manuscript. The proposal that the results reflect non-specific facilitation or disruption of behavior seems highly unlikely. Indeed, the present experiments and previous findings (Lingawi et al., 2018; Lingawi et al., 2017) provide multiple demonstrations that IL stimulation fails to produce any facilitation in the absence of prior inhibitory experience with the target stimulus. Although these demonstrations appear inconsistent with previous studies (Do-Monte et al., 2015; Chen et al., 2021), this inconsistency is likely explained by the fact that these studies manipulated activity in specific IL neuronal populations. Previous work has already revealed differences between manipulations targeting discrete IL neuronal populations as opposed to general IL activity (Kim et al., 2016). Importantly, as previously noted, the present manuscript aimed to generally explore inhibitory encoding in the IL that, as we will acknowledge, is likely to engage several neuronal populations within the IL. Adequate statements on these matters will be included in the revised manuscript.
 
 Impact:
 
 The role of IL in extinction retrieval remains a central question in the fear learning literature. However, because the test used conflates extinction retrieval with new learning and the manipulations lack cell-type specificity, the evidence presented here does not convincingly support the main claims. The study highlights the need for more precise manipulations and more rigorous behavioral testing to resolve this issue.
 
 As noted in our responses, the interpretations of the data presented remain identical whether the test data conflate extinction retrieval with additional extinction learning or not. Although we agree that it is important to establish the role of specific IL neuronal populations in extinction learning, this was beyond the scope of the manuscript and the findings reported remain valuable to our understanding of inhibitory encoding within the IL.
 
 Reviewer #2 (Public review):
 
 Summary:
 
 In this manuscript, the authors examine the mechanisms by which stimulation of the infralimbic cortex (IL) facilitates the retention and retrieval of inhibitory memories. Previous work has shown that optogenetic stimulation of the IL suppresses freezing during extinction but does not improve extinction recall when extinction memory is probed one day later. When stimulation occurs during a second extinction session (following a prior stimulation-free extinction session), freezing is suppressed during the second extinction as well as during the tone test the following day. The current study was designed to further explore the facilitatory role of the IL in inhibitory learning and memory recall. The authors conducted a series of experiments to determine whether recruitment of IL extends to other forms of inhibitory learning (e.g., backward conditioning) and to inhibitory learning involving appetitive conditioning. Further, they assessed whether their effects could be explained by stimulus familiarity. The results of their experiments show that backward conditioning, another form of inhibitory learning, also enabled IL stimulation to enhance fear extinction. This phenomenon was not specific to aversive learning, as backward appetitive conditioning similarly allowed IL stimulation to facilitate extinction of aversive memories. Finally, the authors ruled out the possibility that IL facilitated extinction merely because of prior experience with the stimulus (e.g., reducing the novelty of the stimulus). These findings significantly advance our understanding of the contribution of IL to inhibitory learning. Namely, they show that the IL is recruited during various forms of inhibitory learning, and its involvement is independent of the motivational value associated with the unconditioned stimulus.
 
 Strengths:
 
 (1) Transparency about the inclusion of both sexes and the representation of data from both sexes in figures.
 
 We thank the Reviewer for their positive assessment.
 
 (2) Very clear representation of groups and experimental design for each figure.
 
 We thank the Reviewer for their positive assessment.
 
 (3) The authors were very rigorous in determining the neurobehavioral basis for the effects of IL stimulation on extinction. They considered multiple interpretations and designed experiments to address these possible accounts of their data.
 
 We thank the Reviewer for their positive assessment.
 
 (4) The rationale for and the design of the experiments in this manuscript are clearly based on a wealth of knowledge about learning theory. The authors leveraged this expertise to narrow down how the IL encodes and retrieves inhibitory memories.
 
 We thank the Reviewer for their positive assessment.
 
 Weaknesses:
 
 (1) In Experiment 1, although not statistically significant, it does appear as though the stimulation groups (OFF and ON) differ during Extinction 1. It seems like this may be due to a difference between these groups after the first forward conditioning. Could the authors have prevented this potential group difference in Extinction 1 by re-balancing group assignment after the first forward conditioning session to minimize the differences in fear acquisition (the authors do report a marginally significant effect between the groups that would undergo one vs. two extinction sessions in their freezing during the first conditioning session)?
 
 As noted (see response to Reviewer 1), efforts were made daily to match group performance across the training stages, but these efforts were ultimately hampered by the necessary exclusions following post-mortem analyses. This will be made explicit in the revised manuscript. Regarding freezing during Extinction 1, as noted by the Reviewer, the difference, which was not statistically significant, was absent across trials during the subsequent forward fear conditioning stage. Likewise, the protocol difference observed during the initial forward fear conditioning was absent in subsequent stages. We are therefore confident that these initial differences (significant or not) did not impact the main findings at test. Importantly, these findings replicate previous work using identical protocols in which no differences were present during the training stages. These considerations will be addressed in the revised manuscript.
 
 (2) Across all experiments (except for Experiment 1), the authors state that freezing during the initial conditioning increased across "days". The figures that correspond to this text, however, show that freezing changes across trials. In the methods, the authors report that backward conditioning occurred over 5 days. It would be helpful to understand how these data were analyzed and collated to create the final figures. Was the freezing averaged across the five days for each trial for analyses and figures?
 
 We apologize; as noted above, we incorrectly labeled the X axis for the backward conditioning data sets in Figures 3B, 4B, 4D and 5B. It should have indicated “Days” instead of “Trials”. The data shown in these Figures use the average of all trials on a given day. This will be clarified in the methods section of the revised manuscript. The labeling errors on the Figures will be corrected.
 
 (3) In Experiment 3, the authors report a significant Protocol X Virus interaction. It would be useful if the authors could conduct post-hoc analyses to determine the source of this interaction. Inspection of Figure 4B suggests that freezing during the two different variants of backward conditioning differs between the virus groups. Did the authors expect to see a difference in backward conditioning depending on the stimulus used in the conditioning procedure (light vs. tone)? The authors don't really address this confounding interaction, but I do think a discussion is warranted.
 
 We agree with the Reviewer that further discussion of the Protocol x Virus interaction that emerged during the backward conditioning and forward conditioning stages of Experiment 3 is warranted. This will be provided in the revised manuscript. Briefly, during both stages, follow-up analyses did not reveal any differences (main effects or interactions) between the two groups trained with the light stimulus (Diff-EYFP and Diff-ChR2). By contrast, the ChR2 group trained with the tone (Back-ChR2) froze more overall than the EYFP group (Back-EYFP), but there were no other significant differences between the two groups. Based on these analyses, the Protocol x Virus interaction appears to be driven by greater freezing in the ChR2 group trained with the tone rather than a difference in the backward conditioning performance based on stimulus identity. Consistent with this, the statistical analyses did not reveal a main effect of Protocol during either the backward conditioning stage or the stimulus trials during the forward conditioning stage. Nevertheless, during this latter stage, a main effect of Protocol emerged during baseline performance, but once again, this seems to be driven by the Back-ChR2 group. Critically, it is unclear how greater stimulus freezing in the Back-ChR2 group during forward conditioning would lead to lower freezing during the post-extinction retrieval test.
 
 (4) In this same experiment, the authors state that freezing decreased during extinction; however, freezing in the Diff-EYFP group at the start of extinction (first bin of trials) doesn't look appreciably different than their freezing at the end of the session. Did this group actually extinguish their fear? Freezing on the tone test day also does not look too different from freezing during the last block of extinction trials.
 
 We confirm that overall, there was a significant decline in freezing across the extinction session shown in Figure 4B. The Reviewer is correct to point out that this decline was modest (if not negligible) in the Diff-EYFP group, which was receiving its first inhibitory training with the target tone stimulus. It is worth noting that across all experiments, most groups that did not receive infralimbic stimulation displayed a modest decline in freezing during the extinction session since it was relatively brief, involving only 6 or 8 tone-alone presentations. This was intentional, as we aimed for the brief extinction session to generate minimal inhibitory learning and thereby to detect any facilitatory effect of infralimbic stimulation. This issue will be clarified and explained in the revised version of the manuscript.
 
 (5) The Discussion explored the outcomes of the experiments in detail, but it would be useful for the authors to discuss the implications of their findings for our understanding of circuits in which the IL is embedded that are involved in inhibitory learning and memory. It would also be useful for the authors to acknowledge in the Discussion that although they did not have the statistical power to detect sex differences, future work is needed to explore whether IL functions similarly in both sexes.
 
 In line with the Reviewer’s suggestion (see also Reviewer 3), the revised manuscript will include a discussion of the broader implications of the findings regarding inhibitory brain circuitry and will acknowledge the need to further explore sex differences and IL functions.
 
 Reviewer #3 (Public review):
 
 Summary:
 
 This is a really nice manuscript with different lines of evidence to show that the IL encodes inhibitory memories that can then be manipulated by optogenetic stimulation of these neurons during extinction. The behavioral designs are excellent, with converging evidence using extinction/re-extinction, backwards/forwards aversive conditioning, and backwards appetitive/forwards aversive conditioning. Additional factors, such as nonassociative effects of the CS or US, are also considered, and the authors evaluate the inhibitory properties of the CS with tests of conditioned inhibition.
 
 Strengths:
 
 The experimental designs are very rigorous with an unusual level of behavioral sophistication.
 
 We thank the Reviewer for their positive assessment.
 
 Weaknesses:
 
 (1) More justification for parametric choices (number of days of backwards vs forwards conditioning) could be provided.
 
 All experimental parameters were based on previously published experiments showing the capacity of the backward conditioning protocols to generate inhibitory learning and the forward conditioning protocols to produce excitatory learning. Although this was mentioned in the methods section, we acknowledge that further explanation is required to justify the need for multiple days of backward training. This will be provided in the revised manuscript.
 
 (2) The current discussion could be condensed and could focus on broader implications for the literature.
 
 The revised manuscript will make an effort to condense the discussion and focus on broader implications for the literature.
 
 References
 
 Chen, Y.-H., Wu, J.-L., Hu, N.-Y., Zhuang, J.-P., Li, W.-P., Zhang, S.-R., Li, X.-W., Yang, J.-M., & Gao, T.-M. (2021). Distinct projections from the infralimbic cortex exert opposing effects in modulating anxiety and fear. J Clin Invest, 131(14), e145692. https://doi.org/10.1172/JCI145692
 
 Do-Monte, F. H., Manzano-Nieves, G., Quiñones-Laracuente, K., Ramos-Medina, L., & Quirk, G. J. (2015). Revisiting the role of infralimbic cortex in fear extinction with optogenetics. J Neurosci, 35(8), 3607-3615. https://doi.org/10.1523/JNEUROSCI.3137-14.2015
 
 Estes, W. K. (1955). Statistical theory of spontaneous recovery and regression. Psychol Rev, 62(3), 145-154. https://doi.org/10.1037/h0048509
 
 Kim, H.-S., Cho, H.-Y., Augustine, G. J., & Han, J.-H. (2016). Selective Control of Fear Expression by Optogenetic Manipulation of Infralimbic Cortex after Extinction. Neuropsychopharmacology, 41(5), 1261-1273. https://doi.org/10.1038/npp.2015.276
 
 Lingawi, N. W., Holmes, N. M., Westbrook, R. F., & Laurent, V. (2018). The infralimbic cortex encodes inhibition irrespective of motivational significance. Neurobiol Learn Mem, 150, 64-74. https://doi.org/10.1016/j.nlm.2018.03.001
 
 Lingawi, N. W., Westbrook, R. F., & Laurent, V. (2017). Extinction and Latent Inhibition Involve a Similar Form of Inhibitory Learning that is Stored in and Retrieved from the Infralimbic Cortex. Cereb Cortex, 27(12), 5547-5556. https://doi.org/10.1093/cercor/bhw322
 
 AuthorResponse

Tags

Summary

Review 1

Review 2

Review 3

AuthorResponse

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.08.02.668258v1
www.biorxiv.org www.biorxiv.org

Enhancer-AAVs allow genetic access to oligodendrocytes and diverse populations of astrocytes across species

3
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This important study presents convincing findings on creating an exhaustive library of new enhancer-AAVs targeting astrocytes and oligodendrocytes with high potential for both basic and translational work, which will be of value to a large and growing community. However, the outdated description of glial biology in the Introduction, the overstated claims of utility in the Conclusion, and the loose stringency in the criteria used to assemble the library diminish the strengths of the claims. The work will be of interest to neuroscientists working on glial cell biology.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  The goal of this study was to generate a library of new enhancer-driven AAVs in order to selectively and efficiently target astrocytes and oligodendrocytes in rodents. The implied criteria are that such viral vectors should have high specificity for the intended cell type and effectively express in all astrocytes/oligos in the brain or, alternatively, be specific for defined brain regions, layers, or subtypes of astrocytes/oligos. In addition, they should be compatible with intravenous retro-orbital delivery to facilitate experimentation and brain-wide targeting (i.e., show organ specificity and high efficiency in the brain). Ideally, these new AAVs would also maintain their characteristics across disease contexts and show applicability in non-human primates. Tools with such characteristics are generally lacking in studying glial cells and would be extremely useful to scale up and accelerate glial research, allowing targeting of astrocytes/oligos with distinct molecular identity and intersectional strategies.
  
  At present, however, none of the enhancer-AAVs presented in the study seems to meet this combination of criteria, at least not with the level of stringency typically expected in the field. The main reason is that, in its current form, the study does not present one candidate AAV iteratively improved to meet all these criteria; instead, it presents a catalogue of new AAVs with various degrees of specificity, completeness, and mixed characteristics. Therefore, their utility should be interpreted cautiously. Moreover, the way specificity and completeness are intermixed in the analysis makes it difficult to evaluate the actual utility of any given AAV. The study might have been strengthened by focusing on a small set of the most promising candidates (i.e., AiE0890m_3x2C) and validating them thoroughly for expression specificity, completeness, effective cargo expression, ability to allow specific pan-astrocyte or astrocyte-subtype targeting in vivo, and preserved properties in NHPs and in disease, as this would encourage their adoption by the community. Currently, too many AAVs are assessed inconsistently against the desired criteria, with none being evaluated through and through.
  
  The impact of the catalogue is also greatly diminished by the fact that a suite of AAVs with outstanding specificity and efficiency is already available for the study of astrocytes (e.g., 4x6T AAVs) and was not utilized as a standard to benchmark the new library, making it difficult to appreciate the relative benefits of the new AAVs. The inclusion of expression data in NHPs is very significant, but benchmarking against established AAVs would also be needed to fully appreciate their value.
  
  Importantly, readers should also be aware that the study seems noticeably limited in its literacy with glial biology. The introduction and discussion frame the field in a way that seems outdated, creating the impression that the diverse roles of glia in health and disease have not yet been studied, which may inadvertently be perceived as dismissive and stigmatizing.
  
  In summary, the paper introduces potentially useful viral tools and lays the foundations for future multiplexed targeting of distinct glial cell subpopulations in rodents and in non-human primates, which are extremely important directions. Some of the regionally restricted or even sparsely expressed AAVs may prove valuable in enabling subpopulation-specific targeting or molecular profiling strategies, but currently lack full benchmarking. At present, the promises over the utility of the new tools seem overstated, and the library may not yet represent an actionable resource for targeting astrocytes and oligodendrocytes.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Enhancer elements are regulatory DNA sequences that are capable of driving specific expression patterns. As these elements are generally short and context-independent, enhancers can be used in expression vectors (e.g., packaged in an adeno-associated virus, AAV) to limit expression to target cell populations. This approach was identified as a major strategy for cell-type-specific manipulation in the brain and has been pursued by both standard research studies as well as large-scale efforts led by the BRAIN Initiative. This manuscript describes a major effort to generate enhancer-AAVs targeting astrocytes and oligodendrocytes orchestrated by a large research team led by the Allen Institute for Brain Science. This manuscript parallels other recent publications describing sets of enhancer-AAVs, following rigorous, similar methods with relatively broad testing and application.
  
  To identify and screen candidate enhancers, the scientists prioritized candidates via analysis of single-nucleus accessibility and methylation datasets (i.e., snATAC-seq) and tested them in mice. The scientists prioritized candidate enhancers that exhibited specificity of accessibility in the target cell type. Following selection, the scientists cloned the candidate sequences into AAV vectors with a minimal promoter and reporter gene, packaged the virus, delivered it to the mouse brain, and screened for activity based on reporter expression. Candidates that passed initial screening were further characterized via imaging and sorting, followed by single-cell RNA-seq. This process had around a 50% success rate and yielded 25 astrocyte and 21 oligodendrocyte enhancer-AAVs with the targeted cell-type-specific expression patterns.
  
  The scientists went on to test for subtype-specific activity patterns, finding wide diversity in astrocyte activities across sub-populations and conversely, homogenous oligodendrocyte activation. They optimized a few of these via concatenating the enhancer core sequence to increase expression levels of the reporter gene and showed strong specificity and completeness of cell targeting for a set of these enhancer-AAVs. Following characterization and validation, they then deployed these enhancer-AAVs in a number of demonstration applications to show the utility for basic and translational science. All the constructs developed here are available for public use via Addgene, ensuring that these new tools can be used by other researchers.
  
  There really are no obvious weaknesses in the work presented here, from the generation of the enhancer-AAVs to use in sophisticated validation studies. The enhancer-AAV testing is rigorous and provides critical information necessary for other scientists to select and use these constructs. The applications demonstrate the power of enhancer-AAV approaches. The toolbox presented here may not enable specific targeting of all relevant cellular subtypes or activity states for astrocytes and oligodendrocytes, and future work will be needed to fully understand the activity of the enhancers, identity of the target cell types, and context-dependent utility of these constructs. However, the set of enhancer-AAVs developed here should be transformative for researchers working on accessing and manipulating these cell types and have a major impact on the field.
  
  Review 2

Tags

Summary

Review 1

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2023.09.20.558718v2
www.biorxiv.org www.biorxiv.org

Functional connectivity, structural connectivity, and inter-individual variability in Drosophila melanogaster

4
1. Public_Reviews 09 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This paper presents a collection of analyses relating structure and function in the whole-brain Drosophila EM connectome and whole-brain calcium imaging data. The linkage of detailed anatomical structure with population activity is of broad interest in circuit neuroscience in light of increasingly detailed brain maps, but the analysis methods used made the evidence incomplete. The conclusions are useful for specific network observations, but a more thorough analysis of the anatomical and functional data is needed to support the overall claims.
 
 Summary
2. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 In this paper, the authors analyze connectome data from Drosophila and compare the physical wiring with functional connectivity estimated from calcium imaging data. They quantify structure-function relationships as a correlation of the two connectivity modalities. They report correlations roughly comparable to what has been described in the literature on sc/fc relationships in mammalian connectome data at the meso-scale. They then repeat their analysis, focusing on segregated versus unsegregated synapses. They derive separate connectomes using one or the other class of synapse. They show differential contributions to the sc/fc relationships by segregated versus unsegregated synapses.
 
 Strengths:
 
 There is a nice synthesis of multimodal imaging data (Ca and EM data from flies and meso-scale data from human and marmoset).
 
 Weaknesses:
 
 (1) The paper is written in an unusual way. The introduction intermingles results with background, making it hard to figure out what precisely is being tested.
 
 (2) There are also major methodological gaps. Though the mammalian connectomes are used as a point of reference, no descriptions of their origins or processing are included.
 
 (3) A major weakness stems from the actual calculation of the sc/fc correlation. In general, SC is sparse. In the case of the EM connectomes, it is *exceptionally* sparse (most neural elements are not connected to one another). The authors calculated sc/fc coupling by correlating the off-diagonal elements of sc (the logarithm of its edge weights) and fc matrices with one another. The logarithmic transformation yields a value of negative infinity for all zero entries. The authors simply impute these elements with 0. This makes no sense and, depending on whether these zero elements are distributed systematically versus uniformly at random, could either inflate or deflate the sc/fc correlations. Care must be taken here.
 
 (4) Further, in constructing the segregated versus unsegregated connectomes, they use absolute thresholds for collecting synapses. It is unclear, however, whether similar numbers of synapses were included in both matrices. If the number is different, that might explain the differential relationship with fc; one matrix has more non-zero entries (and as noted earlier, those zero entries are problematic).
 
 (5) There was also considerable text (in the results) describing the processing of the Ca data. In this section, the authors frequently refer to some pipelines as "better" or "worse" (more or less effective). But it is not clear what measures they adopted to assess the effectiveness of a pipeline.
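
 The concern raised in point (3) can be made concrete with a small numerical sketch. The Python example below is illustrative only: it is not the authors' pipeline, the function names (`offdiag`, `sc_fc_correlations`) are hypothetical, and the 3-node matrices are made up by hand. It assumes the coupling is a Pearson correlation between the log edge weights and the FC values over off-diagonal entries, as the review describes. Because log(0) is negative infinity, absent edges are set to 0 after the transform, which also makes them indistinguishable from edges of weight 1 (log(1) = 0).

```python
import numpy as np


def offdiag(m):
    """Return the off-diagonal entries of a square matrix as a flat vector (row-major)."""
    mask = ~np.eye(m.shape[0], dtype=bool)
    return m[mask]


def sc_fc_correlations(sc, fc):
    """Pearson r between log(SC) and FC under two treatments of absent edges.

    Returns (r_imputed, r_present):
      r_imputed: absent edges (weight 0) are assigned 0 after the log transform
      r_present: absent edges are excluded from the correlation
    """
    s = offdiag(sc).astype(float)
    f = offdiag(fc).astype(float)
    present = s > 0

    # log(0) is -inf, so only existing edges are transformed; the rest stay at 0.
    log_s = np.zeros_like(s)
    log_s[present] = np.log(s[present])

    r_imputed = np.corrcoef(log_s, f)[0, 1]
    r_present = np.corrcoef(log_s[present], f[present])[0, 1]
    return r_imputed, r_present


# Hand-built toy data (assumption, not real connectome values).
# Three existing edges with log-weights 1, 2, 3 and three absent edges.
sc = np.array([
    [0.0,      np.e,  np.e**2],
    [np.e**3,  0.0,   0.0],
    [0.0,      0.0,   0.0],
])
# FC is exactly linear in log-weight for existing edges (0.1, 0.2, 0.3),
# while the absent edges are given high FC (0.3), e.g. via indirect paths.
fc = np.array([
    [1.0, 0.1, 0.2],
    [0.3, 1.0, 0.3],
    [0.3, 0.3, 1.0],
])

r_imputed, r_present = sc_fc_correlations(sc, fc)
print(f"zeros imputed as 0:  r = {r_imputed:.3f}")
print(f"existing edges only: r = {r_present:.3f}")
```

 In this toy case the three existing edges track FC perfectly (r = 1), yet including the three imputed zeros flips the sign of the correlation (r is about -0.19), because the absent edges were assigned high FC. The direction and size of the distortion depend entirely on how FC is distributed over the absent edges, which is the reviewer's point.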
 
 Review 1
3. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 Okuno et al. investigate the structure-function relationship in the fruit fly Drosophila melanogaster. To do so, they combine published data from two recent synapse-level connectomes ("hemibrain" and "FlyWire") with a dataset comprising functional whole-brain calcium imaging and behavioural data. First, they investigate the applicability of fMRI pre-processing techniques on data from calcium imaging. They then cross-correlate this pre-processed functional data with structural data extracted from the connectomes, including a comparison to humans. The authors proceed to compare the two connectomes and find significant differences, which they attribute to differences in the accuracy of the synapse detections. Next, they present a novel algorithm to quantify whether neurons are segregated (pre- and postsynapses are spatially separate) or unsegregated (pre- and postsynapses are mixed). Using this approach, they find that unsegregated neurons may contribute more to function than segregated neurons. Applying a general linear model to the functional dataset suggests that activity in two brain areas (Wedge and AVLP) is suppressed during walking. The authors identify a GABAergic neuron in the connectome that could be responsible for this effect and suggest it may provide feedback to the fly's "compass" in the central complex.
 
 Strengths:
 
 The study tackles a relevant question in connectomics by exploring the relationship between structural and functional connectivity in the Drosophila brain. The authors apply a range of established and adapted analytical methods, including fMRI-style preprocessing and a novel synaptic segregation index. The effort to integrate multiple datasets and to compare across species reflects a broad and methodical approach.
 
 Weaknesses:
 
 The manuscript would benefit from a clearer overarching narrative to unify the various analyses, which currently appear somewhat disjointed. While the technical methods are extensive, the writing is often convoluted and lacks crucial details, making it difficult to follow the logic and interpret key findings. Additionally, the conclusions are relatively incremental and lack a compelling conceptual advance, limiting the overall impact of the work.
 
 (1) The introduction currently contains a number of findings and conclusions that would be better placed in the results and discussion to clearly delineate past findings from new results and speculations.
 
 (2) The narrative would benefit greatly from some clear statements along the lines of "we wanted to find out X, therefore we did Y".
 
 (3) More concise terminology would be helpful. For example, the connectomes are currently referred to as either "hemibrain", "FlyEM", "whole-brain", or "FlyWire".
 
 (4) The abstract claims "a new, more robust method to quantify the degree of pre- and post-synaptic segregation". However, the study fails to provide evidence that this method is indeed more robust than existing methods.
 
 (5) The authors define unsegregated neurons as having mixed pre- and postsynapses in the same space. However, this ignores the neurons' topology: a neuron can exhibit a clearly defined dendrite with (mostly) postsynapses and a clearly defined axon with (mostly) presynapses, which then occupy the same space. This is different from genuinely unsegregated neurons with no distinct dendritic and axonal compartments, such as CT1.
 
 (6) It is not entirely clear where the marmoset dataset originates from. Was it generated for this study? If not, why is there a note in the Ethics Declaration?
 
 (7) On the differences between hemibrain and FlyWire: What is the "18.8 million post-synapses" for FlyWire referring to? The (thresholded) FlyWire synapse table has 130M connections (=postsynapses). Subsetting that synapse cloud to the hemibrain volume still gives ~47M synapses. Further subsetting to only connections between proofread neurons inside the hemibrain volume gives 19.4M - perhaps the authors did something like that? Similarly, the hemibrain synapse table contains 64M postsynapses. Do the 21M "FlyEM" post-synapses refer to proofread neurons only? If the authors indeed used only (post-)synapses from proofread neurons, they need to make that explicit in results and methods, and account for differences in reconstruction status when making any comparisons. For example, the mushroom body in the hemibrain got a lot more attention than in FlyWire, which would explain the differences reported here. For that reason, connection weights are often expressed as, e.g., a fraction of the target's inputs instead of the total number of synapses when comparing connectivity across connectomic datasets. Furthermore, in Figure 3b, it looks like the FlyWire synapse cloud was not trimmed to the exact hemibrain boundaries: for example, the trimmed FlyWire synapse cloud seems to extend further into the optic lobes than the hemibrain volume does.
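 The normalization mentioned in point (7) can be illustrated with a minimal sketch. It assumes a dense source-by-target matrix of synapse counts; the function name `input_fractions` and the toy matrices are hypothetical and are not taken from the manuscript or from either connectome's tooling.

```python
import numpy as np

def input_fractions(counts):
    """Convert a synapse-count matrix into fractions of each target's total input.

    counts[i, j] is the number of synapses from source i onto target j.
    Targets that receive no input keep a column of zeros.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=0, keepdims=True)  # total input per target (column sums)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

# Hypothetical toy data: identical wiring, but dataset B recovered half as many synapses.
dataset_a = np.array([[0, 40, 10],
                      [20, 0, 30],
                      [20, 40, 0]])
dataset_b = dataset_a / 2

frac_a = input_fractions(dataset_a)
frac_b = input_fractions(dataset_b)
```

 In this toy case the raw counts differ by a factor of two while the input fractions are identical, which is why fractional weights are less sensitive to differences in reconstruction completeness than absolute synapse counts.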
 
 Review 2
4. Public_Reviews 09 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary:
 
 In this manuscript, Okuno et al. re-analyze whole-brain imaging data collected in another paper (Brezovec et al., 2024) in the context of the two currently available Drosophila connectome datasets: the partial "FlyEM" (hemibrain) dataset (Scheffer et al., 2020) and the whole-brain "FlyWire" dataset (Dorkenwald et al., 2024). They apply existing fMRI signal processing algorithms to the fly imaging data and compute function-structure correlations across a variety of post-processing parameters (noise reduction methods, ROI size), demonstrating an inverse relationship between ROI size and FC-SC correlation. The authors go on to look at structural connectivity amongst more polarized or less polarized neurons, and suggest that stronger FC-SC correlations are driven by more polarized neurons.
 
 Strengths:
 
 (1) The result that larger mesoscale ROIs have a higher correlation with structural data is interesting. This has been previously discussed in Drosophila in Turner et al., 2021, but here it is quantified more extensively.
 
 (2) The quantification of neuron polarization (PPSSI) as applied to these structural data is a promising approach for quantifying differences in spatial synapse distribution.
 
 Weaknesses:
 
 One should not score noise/nuisance removal methods solely by their impact on FC-SC correlation values, because we do not know a priori that direct structural connections correspond with strong functional correlations. In fact, work in C. elegans, where we have access to both a connectome and neuron-resolution functional data, suggests that this relationship is weak (Yemini et al., 2021; Randi et al., 2023). Similarly, I don't think it's appropriate to tune the confidence scores on the EM datasets using FC-SC correlations as an output metric.
 
 Any discussion of FC-SC comparisons should include an analysis of excitatory/inhibitory neurotransmitters, which are available in the fly connectome dataset. However, here the authors do not perform any analyses with neurotransmitter information. Comparisons between fly and human MRI data are also premature here. Firstly, the fly connectomes, which are derived from neuron-scale EM reconstructions, are a qualitatively different kind of data from human connectomes, which are derived from DSI imaging of large-scale tracts. Likewise, calcium imaging and fMRI are very different functional data acquisition methods: the fact that similar processing steps can be used on time-series data does not make them surprisingly similar and does not, in my view, constitute evidence of "similar design concepts."
 
 The comparison of FlyEM/FlyWire connectomes concludes that differences are more likely a result of data processing than of inter-individual variability. If this is the case, the title should not claim that the manuscript covers individual variability. The analysis of the wedge-AVLP neuron strikes me as highly speculative, given that the alignment precision between the connectome and the functional data is around 5 microns (Brezovec* et al, PNAS 2024).
 
 Review 3

URL

biorxiv.org/content/10.1101/2025.07.01.662601v1
www.biorxiv.org www.biorxiv.org

The RAB27A effector SYTL5 regulates mitophagy and mitochondrial metabolism

4
1. Public_Reviews 09 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This study by Lapao et al. uncovers a novel role for the Rab27A effector SYTL5 in regulating mitochondrial function and mitophagy under hypoxic conditions. Using a range of imaging and functional assays, the authors demonstrate that SYTL5 localizes to mitochondria in a Rab27A-dependent manner and impacts mitochondrial respiration and metabolic reprogramming. While the findings are solid and valuable in the area of cancer biology, further mechanistic clarity and improved imaging would strengthen the conclusions.
  
  Summary
2. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  In this study, Ana Lapao et al. investigated the roles of the Rab27 effector SYTL5 in cellular membrane trafficking pathways. The authors found that SYTL5 localizes to mitochondria in a Rab27A-dependent manner. They demonstrated that SYTL5-Rab27A positive vesicles containing mitochondrial material are formed under hypoxic conditions, and they therefore speculate that SYTL5 and Rab27A play roles in mitophagy. They also found that both SYTL5 and Rab27A are important for normal mitochondrial respiration. Cells lacking SYTL5 undergo a shift from mitochondrial oxygen consumption to glycolysis, a common process in cancer cells known as the Warburg effect. Based on a cancer patient database, the authors noticed that low SYTL5 expression is related to reduced survival for adrenocortical carcinoma patients, indicating that SYTL5 could be a negative regulator of the Warburg effect and potentially of tumorigenesis.
  
  Strengths:
  
  The authors take advantage of multiple techniques and novel methods to perform the experiments.
  
  (1) Live-cell imaging revealed that stably inducible expression of SYTL5 co-localized with filamentous structures positive for mitochondria. This result was further confirmed by using correlative light and EM (CLEM) analysis and western blotting from purified mitochondrial fraction.
  
  (2) In order to investigate whether SYTL5 and RAB27A are required for mitophagy in hypoxic conditions, two established mitophagy reporter U2OS cell lines were used to analyze the autophagic flux.
  
  Weaknesses:
  
  This study revealed a potential function of SYTL5 in mitophagy and mitochondrial metabolism. However, the mechanistic evidence that establishes the relationship between SYTL5/Rab27A and mitophagy is insufficient. The involvement of SYTL5 in ACC needs more investigation. Furthermore, images and results supporting the major conclusions need to be improved.
  
  Comments on revisions: The authors did not revise the paper as suggested.
  
  Review 1
3. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The authors provide convincing evidence that Rab27 and SYTL5 work together to regulate mitochondrial activity and homeostasis.
  
  Strengths:
  
  The development of models which allow the function to be dissected, and the rigorous approach and testing of mitochondrial activity.
  
  This work is carefully done, and supports the importance of the roles of Rab27A and SYTL5.
  
  Review 2
4. Public_Reviews 09 Oct 2025
  
  in eLife
  
  Reviewer #3 (Public review):
  
  In the manuscript by Lapao et al., the authors uncover a role for the RAB27A effector protein SYTL5 in regulating mitochondrial function and apparent selective turnover of mitochondrial components. The authors find that SYTL5 localizes to mitochondria in a RAB27A dependent way and that loss of SYTL5 (or RAB27A) impairs lysosomal turnover of MTCO1 (but not a matrix-based reporter/other mitochondrial proteins). The authors go on to show that loss of SYTL5 impacts mitochondrial respiration and ECAR and as such may influence the Warburg effect and tumorigenesis. Of relevance here, the authors go on to show that SYTL5 expression is reduced in adrenocortical carcinomas and this correlates with reduced survival rates.
  
  As previously reviewed, this is a very intriguing body of work and reveals a new role for SYTL5/RAB27A at the mitochondria. Unfortunately, it appears that SYTL5 is a challenging protein to detect endogenously and the authors' cell lines "comprise a heterogenous pool with high variability", which means that a lot of my original concerns remain. It is also still not clear if the conventional autophagy machinery is required for this pathway, especially if SYTL5/RAB27A mitochondrial recruitment is upstream of this. Hopefully, in future work, the authors (and/or others) will be able to address this and build on the mechanisms of this interesting and potentially important pathway.
  
  Review 3

URL

biorxiv.org/content/10.1101/2024.12.30.630740v2
www.biorxiv.org www.biorxiv.org

Layers of immunity: Deconstructing the Drosophila effector response

3
1. Public_Reviews 08 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This work provides one of the first important attempts to look at Drosophila immune responses against bacterial, viral, and fungal pathogens in a way that combines the roles of four major arms in immunity (Imd signaling, Toll signaling, phagocytosis, and melanization) rather than studying them separately. The findings are compelling and the tools provided can be used as they are, or built upon, in various contexts.
  
  Summary
2. Public_Reviews 08 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The innate immune system serves as the first line of defense against invading pathogens. Four major immune-specific modules (the Toll pathway, the Imd pathway, melanization, and phagocytosis) play critical roles in orchestrating the immune response. Traditionally, most studies have focused on the function of individual modules in isolation. However, in recent years, it has become increasingly evident that effective immune defense requires intricate interactions among these pathways.
  
  Despite this growing recognition, the precise roles, timing, and interconnections of these immune modules remain poorly understood. Moreover, addressing these questions represents a major scientific undertaking.
  
  Strengths:
  
  In this manuscript, Ryckebusch et al. systematically evaluate both the individual and combined contributions of these four immune modules to host defense against a range of pathogens. Their findings significantly enhance our understanding of the layered architecture of innate immunity.
  
  Review 1
3. Public_Reviews 08 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  In this work, the authors take a holistic view of Drosophila immunity by selecting four major components of fly immunity often studied separately (Toll signaling, Imd signaling, phagocytosis and melanization), and studying their combinatory effects on the efficiency of the immune response. They achieve this by using fly lines mutant for one of these components, or modules, as well as for a combination of them, and testing the survival of these flies upon infection with a plethora of pathogens (bacterial, viral and fungal).
  
  Strengths:
  
  It is clear that this manuscript has required a large amount of hands-on work, considering the number of pathogens, mutations and timepoints tested. In my opinion, this work is a very welcome addition to the literature on fly immune responses, which obviously do not occur one type at a time, but rather in parallel, in sequence, and/or in an interconnected manner. I find that the major strength of this work is the overall concept, which is made possible by the mutations designed to target the specific immune function of each module, without effects on other functions. I believe that the combinatory mutants will be of use for the fly community and enable further studies of the interplay of these components of the immune response in various settings.
  
  To control for the effects arising from the genetic variation other than the intended mutations, the mutants have been backcrossed into a widely used, isogenized Drosophila strain called w1118. Therefore, the differences accounted for by the genotype are controlled.
  
  I also appreciate that the authors have investigated the two possible ways of dealing with an infection: tolerance and resistance, and how the modules play into those.
  
  Weaknesses:
  
  While controlling for the background effects is vital, the w1118 background is problematic (an issue not limited to this manuscript) because of the wide effects of the white mutation on several phenotypes (also other than eye color/eyesight). It is a possibility that the mutation influences the functionality of the immune response components. I acknowledge that it is not reasonable to ask for data in different backgrounds better representing a "wild type" fly, but I think this matter should be brought up and discussed.
  
  The whole study has been conducted on male flies. Immune responses show quite extensive sex-specific variation across a variety of species studied, including the fly, but the reasons for this variation are not fully understood. Therefore, I suggest that the authors conduct a subset of experiments on female flies to see if the findings apply to both sexes, especially the infection-specificity of the module combinations.
  
  Comments on the revised manuscript:
  
  I appreciate the authors' responses to the points I raised and the additional work they have conducted. The authors have now discussed the possible background effect and added an experiment on female flies showing that the module function is applicable to both sexes.
  
  Review 2

URL

biorxiv.org/content/10.1101/2025.03.28.645959v2
www.biorxiv.org www.biorxiv.org

Emergence of ion-channel mediated electrical oscillations in Escherichia coli biofilms

4
1. Public_Reviews 08 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This potentially valuable study presents claims of evidence for coordinated membrane potential oscillations in E. coli biofilms that can be linked to a putative K+ channel and that may serve to enhance photo-protection. The finding of waves of membrane potential would be of interest to a wide audience from molecular biology to microbiology and physical biology. Unfortunately, a major issue is that it is unclear whether the dye used can act as a Nernstian membrane potential dye in E. coli. The arguments of the authors, who largely ignore previously published contradictory evidence, are not adequate in that they do not engage with the fact that the dye behaves in their hands differently than in the hands of others. In addition, the lack of proper validation of the experimental method including key control experiments leaves the evidence incomplete.
 
 Summary
2. Public_Reviews 08 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public Review):
 
 (1) Significance of the findings:
 
 Cell-to-cell communication is essential for higher functions in bacterial biofilms. Electrical signals have proven effective in transmitting signals across biofilms. These signals are then used to coordinate cellular metabolisms or to increase antibiotic tolerance. Here, the authors have reported for the first time coordinated oscillation of membrane potential in E. coli biofilms that may have a functional role in photoprotection.
 
 (2) Strengths of the manuscript:
 
 - The authors report original data.
 - For the first time, they showed that coordinated oscillations in membrane potential occur in E. coli biofilms.
 - The authors revealed a complex two-phase dynamic involving distinct molecular response mechanisms.
 - The authors developed two rigorous models inspired by (1) the Hodgkin-Huxley model for the temporal dynamics of membrane potential and (2) the Fire-Diffuse-Fire model for the propagation of the electric signal.
 - Since its discovery by comparative genomics, the Kch ion channel has not been associated with any specific phenotype in E. coli. Here, the authors proposed a functional role for the putative voltage-gated K+ ion channel (Kch channel): enhancing survival under photo-toxic conditions.
 
 (3) Weakness:
 
 - Contrary to what is stated in the abstract, the group of B. Maier has already reported collective electrical oscillations in the Gram-negative bacterium Neisseria gonorrhoeae (Hennes et al., PLoS Biol, 2023).
 - The data presented in the manuscript are not sufficient to conclude on the photo-protective role of the Kch channel. The authors should perform the appropriate control experiments related to Fig 4D,E, i.e. reproduce these experiments without ThT to rule out possible photo-conversion effects on ThT that would modify its toxicity. In addition, it looks like the data reported in Fig 4E are extracted from Fig 4D. If this is indeed the case, it would be more conclusive to report the percentage of PI-positive cells in the population for each condition. This percentage should be calculated independently for each replicate. The authors should then report the average value and standard deviation of the percentage of dead cells for each condition.
 - Although Fig 4A clearly shows that light stimulation has an influence on the dynamics of the ThT signal in the biofilm, it is important to rule out possible contributions of other environmental variations that occur when the flow is stopped at the onset of light stimulation. I understand that for technical reasons, the flow of fresh medium must be stopped for the sake of imaging. Therefore, I suggest performing control experiments in which the flow is stopped at different time intervals before image acquisition (30 min or 1 h before). If there is no significant contribution from environmental variations due to medium perfusion arrest, the dynamics of the ThT signal must be unchanged regardless of the delay between flow stop and the start of light stimulation.
 - To clarify the role of K+ in the habituation response, I suggest using the ionophore valinomycin at sub-inhibitory concentrations (5 or 10 µM). It should abolish the habituation response. In addition, the Kch complementation experiment exhibits a sharp drop after the first peak, but this rests on a single time point. It would be more convincing to increase the temporal resolution (from 1 min to 10 s) to show that there are indeed a first and a second peak. Finally, the high concentration (100 µM) of CCCP used in this study completely inhibits cell activity. Therefore, it is not surprising that no ThT dynamics were observed upon light stimulation at such a concentration of CCCP.
 - Since the TMRM signal exhibits a linear increase after the first response peak (Supp Fig 1D), I recommend toning down the statement at line 78.
 - Electrical signal propagation is an important aspect of the manuscript. However, a detailed quantitative analysis of the spatial dynamics within the biofilm is lacking. At a minimum, I recommend plotting the spatio-temporal diagram of the ThT intensity profile averaged along the azimuthal direction in the biofilm. In addition, it is unclear if the electrical signal propagates within the biofilm during the second peak regime, which is mediated by the Kch channel: I have plotted the spatio-temporal diagram for Video S3 and no electrical propagation is evident at the second peak. In addition, the authors should provide technical details of how R^2(t) is measured in the first regime (Fig 7E).
 - In the series of images presented in Supplementary Figure 4A, no wavefront is apparent. Although the microscopy technique used in this figure differs from that used for other images (as in Fig 2), the wavefront should still be present. In addition, there is no second peak in the confocal images either (Supp Fig 4B).
 - Many important technical details are missing (e.g. biofilm size, R^2, curvature and 445 nm irradiance measurements). How these quantities are measured should be described in detail in the Material & Methods section.
 - Fig 5C: The curve in Fig 5D seems to correspond to the biofilm case. Since the model is made for single cells, the curve obtained by the model should be compared with the average curve presented in Fig 1B (i.e. single cell experiments).
 - For clarity, I suggest indicating on the panels whether the experiments concern single cells or biofilms. Finally, please provide bright-field images associated with the ThT images to locate bacteria.
 - In Fig 7B, the plateau is higher in the simulations than in the biofilm experiments. The authors should add a comment in the paper to explain this discrepancy.
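 The per-replicate reporting requested above for the PI staining data could be computed along the following lines. This is a minimal sketch only: the condition labels and cell counts are hypothetical placeholders, not data from the manuscript.

```python
import statistics

def percent_dead(pi_positive, total):
    """Percentage of PI-positive (dead) cells in one replicate."""
    return 100.0 * pi_positive / total

# Hypothetical (PI-positive, total) cell counts, one tuple per independent replicate.
replicates = {
    "wild type": [(12, 200), (15, 210), (9, 190)],
    "kch mutant": [(48, 205), (55, 198), (51, 202)],
}

summary = {}
for condition, counts in replicates.items():
    # The percentage is computed separately for each replicate, then summarized.
    percentages = [percent_dead(dead, total) for dead, total in counts]
    summary[condition] = (statistics.mean(percentages), statistics.stdev(percentages))

for condition, (mean, sd) in summary.items():
    print(f"{condition}: {mean:.1f} % +/- {sd:.1f} % dead cells (n = {len(replicates[condition])})")
```

 Computing the percentage within each replicate before averaging keeps replicates with different total cell numbers from dominating the mean, and the sample standard deviation then reflects replicate-to-replicate variability.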
 
 Review 1
3. Public_Reviews 08 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public Review):
 
 The authors use ThT dye as a Nernstian potential dye in E. coli. Quantitative measurements of membrane potential using any cationic indicator dye are based on the equilibration of the dye across the membrane according to Boltzmann's law.
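 The equilibration principle can be made concrete with a short sketch, assuming an ideal, passively equilibrating monovalent cationic dye at 25 °C. The function names are illustrative, and nothing here models ThT specifically.

```python
import math

R = 8.314462618   # gas constant, J / (mol K)
F = 96485.33212   # Faraday constant, C / mol

def nernst_potential_mV(c_in, c_out, z=1, temperature_K=298.15):
    """Membrane potential (inside minus outside, in mV) at which an ion or dye
    of charge z is at equilibrium for the given inside/outside concentrations."""
    return -1000.0 * R * temperature_K / (z * F) * math.log(c_in / c_out)

def equilibrium_ratio(vm_mV, z=1, temperature_K=298.15):
    """Inside/outside concentration ratio of an ideal Nernstian dye at potential vm_mV."""
    return math.exp(-z * F * (vm_mV / 1000.0) / (R * temperature_K))

# A tenfold accumulation of a monovalent cation corresponds to roughly -59 mV.
print(round(nernst_potential_mV(10.0, 1.0), 1))
# An ideal dye at -120 mV would accumulate roughly a hundredfold.
print(round(equilibrium_ratio(-120.0)))
```

 The relation only holds once the dye has equilibrated, which is why the permeability of the dye and the time scale of loading matter for the interpretation of the signal.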
 
 Ideally, the dye should have high membrane permeability to ensure rapid equilibration. Others have demonstrated that E. coli cells in the presence of ThT do not load unless blue light is present, and that the loading profile does not look as expected for a cationic Nernstian dye. They also show that the loading profile of the dye is different for E. coli cells lacking the TolC pump. I, therefore, objected to interpreting the signal from the ThT as a Vm signal when used in E. coli. Nothing the authors have said has suggested that I should be changing this assessment.
 
 Specifically, the authors responded to my concerns as follows:
 
 (1) 'We are aware of this study, but believe it to be scientifically flawed. We do not cite the article because we do not think it is a particularly useful contribution to the literature.' This seems to go against ethical practices when it comes to scientific literature citations. If the authors identified work on the same topic that they believe is scientifically flawed, a discussion reflecting that should be included.
 
 (2) 'The Pilizota group invokes some elaborate artefacts to explain the lack of agreement with a simple Nernstian battery model. The model is incorrect not the fluorophore.' It seems the authors object to the basic principle behind the usage of Nernstian dyes. If the authors wish to use ThT according to some other model, and not as a Nernstian indicator, they need to explain and develop that model. Instead, they state 'ThT is a Nernstian voltage indicator' in their manuscript and expect the dye to behave like a passive voltage indicator throughout it.
 
 (3) 'We think the proton effect is a million times weaker than that due to potassium i.e. 0.2 M K+ versus 10^-7 M H+. We can comfortably neglect the influx of H+ in our experiments.' I agree with this statement by the authors. At near-neutral extracellular pH, E. coli keeps near-neutral intracellular pH, and the contribution from the chemical concentration gradient to the electrochemical potential of protons is negligible. The main contribution is from the membrane potential. However, this has nothing to do with the criticism to which the authors are responding. The criticism is that ThT has been observed not to permeate the cell without blue light. The blue light has been observed to influence the electrochemical potential of protons (and given that at near-neutral intracellular and extracellular pH this is mostly the membrane potential, as the authors note themselves, we are effectively talking about Vm). Thus, two things are happening when one is loading the ThT: not just the expected equilibration but also a lowering of the membrane potential. The electrochemical potential of protons is coupled via the membrane potential to all the other electrochemical potentials of ions, including the mentioned K+.
 
 (4) 'The vast majority of cells continue to be viable. We do not think membrane damage is dominating.' This was in response to the question of how the authors demonstrated TMRM loading and in which conditions (with a reminder that the TMRM loading profile in E. coli has been demonstrated in potassium phosphate buffer). The request was to demonstrate the TMRM loading profile in their conditions as well as to show that it does not depend on light. Cells could still be viable, as membrane permeabilisation with light is gradual, but the loading of ThT dye is no longer based on simple electrochemical potential (of the dye) equilibration.
 
 (5) On the comment on the action of CCCP with references included, authors include a comment that consists of phrases like 'our understanding of the literature' with no citations of such literature. Difficult to comment further without references.
 
 (6) 'Shielding would provide the reverse effect, since hyperpolarization begins in the dense centres of the biofilms. For the initial 2 hours the cells receive negligible blue light. Neither of the referee's comments thus seem tenable.' The authors have misunderstood my comment. I am not advocating shielding (I agree that this is not it) but stating that this is not the only other explanation for what they see (apart from electrical signaling). The other I proposed is that the membrane has changed in composition and/or the effective light power the cells can tolerate. The authors comment only on the light power (not convincingly though, giving the number for that power would be more appropriate), not on the possible changes in the membrane permeability.
 
 (7) 'The work that TolC provides a possible passive pathway for ThT to leave cells seems slightly niche. It just demonstrates another mechanism for the cells to equilibrate the concentrations of ThT in a Nernstian manner i.e. driven by the membrane voltage.' I am not sure what the authors mean by another mechanism. The mechanism of action of a Nernstian dye is passive equilibration according to the electrochemical potential (i.e. until the electrochemical potential of the dye is 0).
 
 (8) 'In the 70 years since Hodgkin and Huxley first presented their model, a huge number of similar models have been proposed to describe cellular electrophysiology. We are not being hyperbolic when we state that the HH models for excitable cells are like the Schrödinger equation for molecules. We carefully adapted our HH model to reflect the currently understood electrophysiology of E. coli.'
 
 I gave a very concrete comment on the fact that in the HH model conductivity and leakage are as they are because this was explicitly measured. The authors state that they have carefully adapted their model based on what is currently understood for E. coli electrophysiology. It is not clear how. HH uses g_K n^4 based on Figure 2 here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1392413/pdf/jphysiol01442-0106.pdf, i.e. the measured rise and fall of potassium conductance on msec time scales. I looked at the citation the authors have given and found a resistance of an entire biofilm of a given strain at 3 applied voltages. So why n^4 based on that? Why does the unknown current have the form g_Q z^4? Sodium conductance in HH is described by g_Na m^3 h (again based on detailed conductance measurements), so why is the unknown current in E. coli described by g_Q z^4? Why is the leakage in the form that it is, and based on what measurement?
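 For reference, the standard Hodgkin-Huxley membrane equation that these questions refer to can be written as follows. This uses conventional textbook notation for the squid giant axon, not the notation or parameters of the manuscript under review.

```latex
\begin{aligned}
I &= C_M \frac{dV}{dt} + \bar{g}_K\, n^4 (V - V_K) + \bar{g}_{Na}\, m^3 h\, (V - V_{Na}) + \bar{g}_l\, (V - V_l), \\
\frac{dx}{dt} &= \alpha_x(V)\,(1 - x) - \beta_x(V)\,x, \qquad x \in \{n, m, h\}.
\end{aligned}
```

 In the original model, the exponents on the gating variables and the voltage dependence of the rate functions were chosen to fit measured conductance time courses, which is the point of asking what measurements justify the analogous terms in the E. coli model.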
 
 Throughout their responses, the authors seem to think that collapsing the electrochemical gradient of protons is all about protons, and this is not the case. At near neutral inside and outside pH, the electrochemical potential of protons is simply membrane voltage. And membrane voltage acts on all ions in the cell.
 
 Authors have started their response to concrete comments on the usage of ThT dye with comments on papers from my group that are not all directly relevant to this publication. I understand that their intention is to discredit a reviewer but given that my role here is to review this manuscript, I will only address their comments to the publications/part of publications that are relevant to this manuscript and mention what is not relevant.
 
 Publications in the order these were commented on.
 
 (1) In a comment on the paper that describes the usage of ThT dye as a Nernstian dye authors seem to talk about a model of an entire active cell. 'Huge oscillations occur in the membrane potentials of E. coli that cannot be described by the SNB model.' The two have nothing to do with each other. Nernstian dye equilibrates according to its electrochemical potential. Once that happens it can measure the potential (under the assumption that not too much dye has entered and thus lowered too much the membrane potential under measurement). The time scale of that is important, and the dye can only measure processes that are slower than that equilibration. If one wants to use a dye that acts under a different model, first that needs to be developed, and then coupled to any other active cell model.
 
 (2) The part of this paper that is relevant is simply the usage of TMRM dye. It is used as Nernstian dye, so all the above said applies. The rest is a study of flagellar motor.
 
 (3) The authors seem to not understand that the electrochemical potential of protons is coupled to the electrochemical potentials of all other ions, via the membrane potential. In the manuscript authors talk about, PMF~Vm, as DeltapH~0. Other than that this publication is not relevant to their current manuscript.
 
 (4) The manuscript in fact states precisely that PMF cannot be generated by protons only and some other ions need to be moved out for the purpose. In near neutral environment it stated that these need to be cations (K+ e.g.). The model used in this manuscript is a pump-leak model. Neither is relevant for the usage of ThT dye.
 
 Further comments include, along the lines of:
 
 'The editors stress the main issue raised was a single referee questioning the use of ThT as an indicator of membrane potential. We are well aware of the articles by the Pilizota group and we believe them to be scientifically flawed. The authors assume there are no voltage-gated ion channels in E. coli and then attempt to explain motility data based on a simple Nernstian battery model (they assume E. coli are unexcitable matter). This in turn leads them to conclude the membrane dye ThT is faulty, when in fact it is a problem with their simple battery model.'
 
 The only assumption made when using a cationic Nernstian dye is that it equilibrates passively across the membrane according to its electrochemical potential. As it does that, it does lower the membrane potential, which is why as little as possible is added so that this is negligible. The equilibration should be as fast as possible, but at the very least it should be known, as no change in membrane potential can be measured that is faster than that.
 
 This behaviour should be orthogonal to what the cell is doing, it is a probe after all. If the cell is excitable, a Nernstian dye can be used, as long as it's still passively equilibrating and doing so faster than any changes in membrane potential due to excitations of the cells. There are absolutely no assumptions made on the active system that is about to be measured by this expected behaviour of a Nernstian dye. And there shouldn't be, it is a probe. If one wants to use a dye that is not purely Nernstian that behaviour needs to be described and a model proposed. As far as I can find, authors do no such thing.
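 
 The only relation a purely Nernstian cationic dye relies on can be sketched as follows. This is a minimal illustration, assuming the dye is fully equilibrated and that the inside/outside concentration ratio is known (in practice it is estimated from fluorescence, which is a further assumption); the function name and default temperature are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618   # J mol^-1 K^-1
FARADAY = 96485.33212        # C mol^-1


def nernst_potential_mV(conc_in, conc_out, charge=1, temperature_K=310.15):
    """Membrane potential (mV, inside relative to outside) implied by an
    equilibrated Nernstian dye: Vm = -(RT / zF) * ln(C_in / C_out).

    Valid only once the dye has equilibrated passively; changes in Vm that
    are faster than the equilibration time cannot be read out this way.
    """
    if conc_in <= 0 or conc_out <= 0:
        raise ValueError("concentrations must be positive")
    rt_over_zf = GAS_CONSTANT * temperature_K / (charge * FARADAY)
    return -1000.0 * rt_over_zf * math.log(conc_in / conc_out)


if __name__ == "__main__":
    # A 100-fold accumulation of a monovalent cationic dye inside the cell.
    print(nernst_potential_mV(100.0, 1.0))
```

 Nothing in this relation refers to channels, pumps or excitability of the cell, which is the sense in which the probe is independent of the system it measures.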
 
 There is a comment on the use of a flagellar motor as a readout of PMF, stating that the motor can be stopped by YcgR, citing work from 2023. Indeed, there is a range of references such as https://doi.org/10.1016/j.molcel.2010.03.001 that demonstrate this (from around 2000-2010 as far as I am aware). The timescale of such slowdown is hours (see Figure 5 here: https://www.cell.com/cell/pdf/S0092-8674(10)00019-X.pdf). Needless to say, the flagellar motor, when used as a probe, needs to remain a valid probe under the conditions used. Thus one should always be on the lookout for any other such proteins, not yet known to us, that could slow it down or make the speed no longer proportional to the PMF. In the papers where my group uses the motor, the changes are fast, often reversible, and within an observation window of 30 min. They are also the same in a DeltaYcgR strain, which we have not included because it seemed obvious given the time scales, but we certainly can in the future (as well as stay vigilant about any conditions that would render the motor no longer a suitable probe for PMF).
 
 Review 2
4. Public_Reviews 08 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public Review):
 
 This manuscript by Akabuogu et al. investigates membrane potential dynamics in E. coli. Membrane potential fluctuations have been observed in bacteria by several research groups in recent years, including in the context of bacterial biofilms where they have been proposed to play a role in cellular communication. Here, these authors investigate membrane potential in E. coli, in both single cells and biofilms. I have reviewed the revised manuscript provided by the authors, as well as their responses to the initial reviews; my opinion about the manuscript is largely unchanged. I have focused my public review on those issues that I believe to be most pressing, with additional comments included in the review to authors. Although these authors are working in an exciting research area, the evidence they provide for their claims is inadequate, and several key control experiments are still missing. In some cases, the authors allude to potentially relevant data in their responses to the initial reviews, but unfortunately these data are not shown. Furthermore, I cannot identify any traveling wavefronts in the data included in this manuscript. In addition to the challenges associated with the use of Thioflavin-T (ThT) raised by the second reviewer, these caveats make the work presented in this manuscript difficult to interpret.
 
 First, some of the key experiments presented in the paper lack required controls:
 
 (1) This paper asserts that the observed ThT fluorescence dynamics are induced by blue light. This is a fundamental claim in the paper, since the authors go on to argue that these dynamics are part of a blue light response. This claim must be supported by the appropriate negative control experiment measuring ThT fluorescence dynamics in the absence of blue light- if this idea is correct, these dynamics should not be observed in the absence of blue light exposure. If this experiment cannot be performed with ThT since blue light is used for its excitation, TMRM can be used instead.
 
 In response to this, the authors wrote that "the fluorescent baseline is too weak to measure cleanly in this experiment." If they observe no ThT signal above noise in their time lapse data in the absence of blue light, this should be reported in the manuscript- this would be a satisfactory negative control. They then wrote that "It appears the collective response of all the bacteria hyperpolarization at the same time appears to dominate the signal." I am not sure what they mean by this- perhaps that ThT fluorescence changes strongly only in response to blue light? This is a fundamental control for this experiment that ought to be presented to the reader.
 
 (2) The authors claim that a ∆kch mutant is more susceptible to blue light stress, as evidenced by PI staining. The premise that the cells are mounting a protective response to blue light via these channels rests on this claim. However, they do not perform the negative control experiment, conducting PI staining for WT and the ∆kch mutant in the absence of blue light. In the absence of this control it is not possible to rule out effects of the ∆kch mutation on overall viability and/or PI uptake. The authors do include a growth curve for comparison, but planktonic growth is a very different context than surface-attached biofilm growth. Additionally, the ∆kch mutation may have impacts on PI permeability specifically that are not addressed by a growth curve. The negative control experiment is of key importance here.
 
 Second, the ideas presented in this manuscript rely entirely on analysis of ThT fluorescence data, specifically a time course of cellular fluorescence following blue light treatment. However, alternate explanations for and potential confounders of the observed dynamics are not sufficiently addressed:
 
 (1) Bacterial cells are autofluorescent, and this fluorescence can change significantly in response to stress (e.g. blue light exposure). To characterize and/or rule out autofluorescence contributions to the measurement, the authors should present time lapse fluorescence traces of unstained cells for comparison, acquired under the same imaging conditions in both wild type and ∆kch mutant cells. In their response to reviewers the authors suggested that they have conducted this experiment and found that the autofluorescence contribution is negligible, which is good, but these data should be included in the manuscript along with a description of how these controls were conducted.
 
 (2) Similarly, in my initial review I raised a concern about the possible contributions of photobleaching to the observed fluorescence dynamics. This is particularly relevant for the interpretation of the experiment in which catalase appears to attenuate the decay of the ThT signal; this attenuation could alternatively be due to catalase decreasing ThT photobleaching. In their response, the authors indicated that photobleaching is negligible, which would be good, but they do not share any evidence to support this claim. Photobleaching can be assessed in this experiment by varying the light dosage (illumination power, frequency, and/or duration) and confirming that the observed fluorescence dynamics are unaffected.
 
 Third, the paper claims in two instances that there are propagating waves of ThT fluorescence that move through biofilms, but I do not observe these waves in any case:
 
 (1) The first wavefront claim relates to small cell clusters, in Fig. 2A and Video S2 and S3 (with Fig. 2A and Video S2 showing the same biofilm.) I simply do not see any evidence of propagation in either case- rather, all cells get brighter and dimmer in tandem. I downloaded and analyzed Video S3 in several ways (plotting intensity profiles for different regions at different distances from the cluster center, drawing a kymograph across the cluster, etc.) and in no case did I see any evidence of a propagating wavefront. (I attempted this same analysis on the biofilm shown in Fig. 2A and Video S2 with similar results, but the images shown in the figure panels and especially the video are still both so saturated that the quantification is difficult to interpret.) If there is evidence for wavefronts, it should be demonstrated explicitly by analysis of several clusters. For example, a figure of time-to-peak vs. position in the cluster demonstrating a propagating wave would satisfy this. Currently, I do not see any wavefronts in this data.
 
 (2) The other wavefront claim relates to biofilms, and the relevant data is presented in Fig. S4 (and I believe also in what is now Video S8, but no supplemental video legends are provided, and this video is not cited in text.) As before, I cannot discern any wavefronts in the image and video provided; Reviewer 1 was also not able to detect wave propagation in this video by kymograph. Some mean squared displacements are shown in Fig. 7. As before, the methods for how these were obtained are not clearly documented either in this manuscript or in the bioRxiv preprint linked in the initial response to reviewers, and since wavefronts are not evident in the video it is hard to understand what is being measured here: radial distance from where? (The methods section mentions radial distance from the substrate; this should mean Z position above the imaging surface, and no wavefronts are evident in Z in the figure panels or movie.) Thus, clear demonstration of these wavefronts is still missing here as well.
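 
 The time-to-peak versus position analysis suggested in point (1) above could be sketched along the following lines. This is a minimal illustration on synthetic data, not the analysis I ran on the videos; the function names, the nested-list movie layout and the synthetic pulse shape are all assumptions, and real data would need background subtraction and noise handling first.

```python
import math


def time_to_peak_slope(movie, center, frame_interval=1.0):
    """Least-squares slope of time-to-peak against distance from `center`.

    movie is indexed as movie[t][y][x]. A propagating front gives a clearly
    non-zero slope (time per pixel, the inverse of the front speed); a
    synchronous response gives a slope near zero.
    """
    n_frames = len(movie)
    ny, nx = len(movie[0]), len(movie[0][0])
    radii, peak_times = [], []
    for y in range(ny):
        for x in range(nx):
            trace = [movie[t][y][x] for t in range(n_frames)]
            peak_frame = max(range(n_frames), key=trace.__getitem__)
            radii.append(math.hypot(y - center[0], x - center[1]))
            peak_times.append(peak_frame * frame_interval)
    mean_r = sum(radii) / len(radii)
    mean_t = sum(peak_times) / len(peak_times)
    sxx = sum((r - mean_r) ** 2 for r in radii)
    sxy = sum((r - mean_r) * (t - mean_t) for r, t in zip(radii, peak_times))
    return sxy / sxx


def synthetic_movie(n_frames, size, center, delay_per_pixel,
                    t0=20.0, width=5.0):
    """Toy movie: each pixel shows a Gaussian pulse in time that peaks at
    t0 + delay_per_pixel * (distance from center)."""
    movie = []
    for t in range(n_frames):
        frame = []
        for y in range(size):
            row = []
            for x in range(size):
                r = math.hypot(y - center[0], x - center[1])
                t_peak = t0 + delay_per_pixel * r
                row.append(math.exp(-((t - t_peak) ** 2) / (2 * width ** 2)))
            frame.append(row)
        movie.append(frame)
    return movie


if __name__ == "__main__":
    wave = synthetic_movie(80, 15, (7, 7), delay_per_pixel=2.0)
    flat = synthetic_movie(80, 15, (7, 7), delay_per_pixel=0.0)
    print(time_to_peak_slope(wave, (7, 7)))  # propagating pulse
    print(time_to_peak_slope(flat, (7, 7)))  # all pixels peak together
```

 Plotting the per-pixel peak times against distance for several clusters, together with the fitted slope, would make the presence or absence of a wavefront explicit.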
 
 Fourth, I have some specific questions about the study of blue light stress and the use of PI as a cell viability indicator:
 
 (1) The logic of this paper includes the premise that blue light exposure is a stressor under the experimental conditions employed in the paper. Although it is of course generally true that blue light can be damaging to bacteria, this is dependent on light power and dosage. The control I recommended above, staining cells with PI in the presence and absence of blue light, will also allow the authors to confirm that this blue light treatment is indeed a stressor- the PI staining would be expected to increase in the presence of blue light if this is so.
 
 (2) The presence of ThT may complicate the study of the blue light stress response, since ThT enhances the photodynamic effects of blue light in E. coli (Bondia et al. 2021 Chemical Communications). The authors could investigate ThT toxicity under these conditions by staining cells with PI after exposing them to blue light with or without ThT staining.
 
 (3) In my initial review, I wrote the following: "In Figures 4D - E, the interpretation of this experiment can be confounded by the fact that PI uptake can sometimes be seen in bacterial cells with high membrane potential (Kirchhoff & Cypionka 2017 J Microbial Methods); the interpretation is that high membrane potential can lead to increased PI permeability. Because the membrane potential is largely higher throughout blue light treatment in the ∆kch mutant (Fig. 3[BC]), this complicates the interpretation of this experiment." In their response, the authors suggested that these results are not relevant in this case because "In our experiment methodology, cell death was not forced on the cells by introducing an extra burden or via anoxia." However, the logic of the paper is that the cells are in fact dying due to an imposed external stressor, which presumably also confers an increased burden as the cells try to deal with the stress. Instead, the authors should simply use a parallel method to confirm the results of PI staining. For example, the experiment could be repeated with other stains, or the viability of blue light-treated cells could be addressed more directly by outgrowth or colony-forming unit assays.
 
 The CFU assay suggested above has the additional advantage that it can also be performed on planktonic cells in liquid culture that are exposed to blue light. If, as the paper suggests, a protective response to blue light is being coordinated at the biofilm level by these membrane potential fluctuations, the WT strain might be expected to lose its survival advantage vs. the ∆kch mutant in the absence of a biofilm.
 
 Fifth, in several cases the data are presented in a way that is difficult to interpret, or the paper makes claims that are difficult to observe in the data:
 
 (1) The authors suggest that the ThT and TMRM traces presented in Fig. S1D have similar shapes, but this is not obvious to me- the TMRM curve has very little decrease after the initial peak and only a modest, gradual rise thereafter. The authors suggest that this is due to increased TMRM photobleaching, but I would expect that photobleaching should exacerbate the signal decrease after the initial peak. Since this figure is used to support the use of ThT as a membrane potential indicator, and since this is the only alternative measurement of membrane potential presented in text, the authors should discuss this discrepancy in more detail.
 
 (2) The comparison of single cells to microcolonies presented in figures 1B and D still needs revision:
 
 First, both reviewer 1 and I commented in our initial reviews that the ThT traces, here and elsewhere, should not be normalized- this will help with the interpretation of some of the claims throughout the manuscript.
 
 Second, the way these figures are shown with all traces overlaid at full opacity makes it very difficult to see what is being compared. Since the point of the comparison is the time to first peak (and the standard deviation thereof), histograms of the distributions of time to first peak in both cases should be plotted as a separate figure panel.
 
 Third, statistical significance tests ought to be used to evaluate the statistical strength of the comparisons between these curves. The authors compare both means and standard deviations of the time to first peak, and there are appropriate statistical tests for both types of comparisons.
 
 (3) The authors claim that the curve shown in Fig. S4B is similar to the simulation result shown in Fig. 7B. I remain unconvinced that this is so, particularly with respect to the kinetics of the second peak- at least it seems to me that the differences should be acknowledged and discussed. In any case, the best thing to do would be to move Fig. S4B to the main text alongside Fig. 7B so that the readers can make the comparison more easily.
 
 (4) As I wrote in my first review, in the discussion of voltage-gated calcium channels, the authors refer to "spiking events", but these are not obvious in Figure S3E. Although the fluorescence intensity changes over time, these fluctuations cannot be distinguished from measurement noise. A no-light control could help clarify this.
 
 (5) In the lower irradiance conditions in Fig. 4A, the ThT dynamics are slower overall, and it looks like the ThT intensity is beginning to rise at the end of the measurement. The authors write that no second peak is observed below an irradiance threshold of 15.99 µW/mm2. However, could a more prominent second peak be observed in these cases if the measurement time was extended? Additionally, the end of these curves looks similar to the curve in Fig. S4B, in which the authors write that the slow rise is evidence of the presence of a second peak, in contrast to their interpretation here.
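 
 For the comparison of time to first peak raised in point (2) above, one possible way to test both the means and the spreads is a permutation test. The sketch below is only one option among several standard tests; the function names and the toy numbers are hypothetical, and the spread test assumes that centring each group on its own median is an adequate way to remove a difference in location.

```python
import random
import statistics


def permutation_p_value(a, b, statistic, n_perm=2000, seed=0):
    """Two-sided permutation p-value for |statistic(a) - statistic(b)|."""
    rng = random.Random(seed)
    observed = abs(statistic(a) - statistic(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(statistic(perm_a) - statistic(perm_b)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def compare_time_to_first_peak(single_cells, microcolonies):
    """Return (p_mean, p_spread) for two samples of time to first peak."""
    p_mean = permutation_p_value(single_cells, microcolonies,
                                 statistics.mean)
    # Centre each group on its median so that the spread comparison is not
    # driven by a shift in location between the groups.
    med_a = statistics.median(single_cells)
    med_b = statistics.median(microcolonies)
    centred_a = [x - med_a for x in single_cells]
    centred_b = [x - med_b for x in microcolonies]
    p_spread = permutation_p_value(centred_a, centred_b, statistics.stdev)
    return p_mean, p_spread


if __name__ == "__main__":
    # Toy values only; real inputs would be per-cell times to first peak.
    tight = [float(i % 5) for i in range(40)]
    broad = [10.0 + 5.0 * (i % 5) for i in range(40)]
    print(compare_time_to_first_peak(tight, broad))
```

 Reporting both p-values next to the histograms would let readers judge the claimed differences in mean and in variability separately.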
 
 Additional considerations:
 
 (1) The analysis and interpretation of the first peak, and particularly of the time-to-fire data, is challenging throughout the manuscript because the time resolution of the data set is quite limited. It seems that a large proportion of cells have already fired after a single acquisition frame. It would be ideal to increase the time resolution on this measurement to improve precision. This could be done by imaging more quickly, but that would perhaps necessitate more blue light exposure; an alternative is to do this experiment under lower blue light irradiance where the first spike time is increased (Figure 4A).
 
 (2) The authors suggest in the manuscript that "E. coli biofilms use electrical signalling to coordinate long-range responses to light stress." In addition to the technical caveats discussed above, I am missing a discussion about what these responses might be. What constitutes a long-range response to light stress, and are there known examples of such responses in bacteria?
 
 (3) The presence of long-range blue light responses can also be interrogated experimentally, for example, by repeating the Live/Dead experiment in planktonic culture or the single-cell condition. If the protection from blue light specifically emerges due to coordinated activity of the biofilm, the ∆kch mutant would not be expected to show a change in Live/Dead staining in non-biofilm conditions. The CFU experiment I mentioned above could also implicate coordinated long-range responses specifically, if biofilms and liquid culture experiments can be compared (although I know that recovering cells from biofilms is challenging.)
 
 (4) At the end of the results section, the authors suggest a critical biofilm size of only 4 μm for wavefront propagation (not much larger than a single cell!) The authors show responses for various biofilm sizes in Fig. 2C, but these are all substantially larger (and this figure also does not contain wavefront information.) Are there data for cell clusters above and below this size that could support this claim more directly?
 
 (5) In Fig. 4C, the overall trajectories of extracellular potassium are indeed similar, but the kinetics of the second peak of potassium are different than those observed by ThT (it rises minutes earlier)- is this consistent with the idea that Kch is responsible for that peak? Additionally, the potassium dynamics also include the first ThT peak- is this surprising given that the Kch channel has no effect on this peak according to the model?
 
 Detailed comments:
 
 Why are Fig. 2A and Video S2 called a microcluster, whereas Video S3, which is smaller, is called a biofilm?
 
 "We observed a spontaneous rapid rise in spikes within cells in the center of the biofilm" (Line 140): What does "spontaneous" mean here?
 
 "This demonstrates that the ion-channel mediated membrane potential dynamics is a light stress relief process.", "E. coli cells employ ion-channel mediated dynamics to manage ROS-induced stress linked to light irradiation." (Line 268 and the second sentence of the Fig. 4F legend): This claim is not well-supported. There are several possible interpretations of the catalase experiment (which should be discussed); this experiment perhaps suggests that ROS impacts membrane potential but does not indicate that these membrane potential fluctuations help the cells respond to blue light stress. The loss of viability in the ∆kch mutant might indicate a link between these membrane potential experiments and viability, but it is hard to interpret without the no light controls I mention above.
 
 "The model also predicts... the external light stress" (Lines 338-341): Please clarify this section. Where does this prediction arise from in the modeling work? Second, I am not sure what is meant by "modulates the light stress" or "keeps the cell dynamics robust to the intensity of external light stress" (especially since the dynamics clearly vary with irradiance, as seen in Figure 4A).
 
 "We hypothesized that E. coli not only modulates the light-induced stress but also handles the increase of the ROS by adjusting the profile of the membrane potential dynamics" (Line 347): I am not sure what "handles the ROS by adjusting the profile of the membrane potential dynamics" means. What is meant by "handling" ROS? Is the hypothesis that membrane potential dynamics themselves are protective against ROS, or that they induce a ROS-protective response downstream, or something else? Later the authors write that changes in the response to ROS in the model agree with the hypothesis, but just showing that ROS impacts the membrane potential does not seem to demonstrate that this has a protective effect against ROS.
 
 "Mechanosensitive ion channels (MS) are vital for the first hyperpolarization event in E. coli." (Line 391): This is misleading- mechanosensitive ion channels totally ablate membrane potential dynamics, they don't have a specific effect on the first hyperpolarization event. The claim that mechanosensitive ion channels are specifically involved in the first event also appears in the abstract.
 
 Also, the apparent membrane potential is much lower even at the start of the experiment in these mutants (Fig. 6C-D)- is this expected? This seems to imply that these ion channels also have a blue light-independent effect.
 
 Throughout the paper, there are claims that the initial ThT spike is involved in "registering the presence of the light stress" and similar. What is the evidence for this claim?
 
 "We have presented much better quantitative agreement of our model with the propagating wavefronts in E. coli biofilms..." (Line 619): It is not evident to me that the agreement between model and prediction is "much better" in this work than in the cited work (reference 57, Hennes et al. 2023). The model in Figure 4 of ref. 57 seems to capture the key features of their data.
 
 In methods, "Only cells that are hyperpolarized were counted in the experiment as live" (Line 745): what percentage of cells did not hyperpolarize in these experiments?
 
 Some indication of standard deviation (error bars or shading) should be added to all figures where mean traces are plotted.
 
 Video S8 is very confusing- why does the video play first forwards and then backwards? It is easy to misinterpret this as a rise in the intensity at the end of the experiment.
 
 Review 3
Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2022.11.09.515771v3
www.biorxiv.org www.biorxiv.org

Single-cell profiling of trabecular meshwork identifies mitochondrial dysfunction in a glaucoma model that is protected by vitamin B3 treatment

4
1. Public_Reviews 08 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This is a fundamental study that provides a detailed single-cell transcriptomic and epigenomic map of the mouse trabecular meshwork, identifying three distinct trabecular meshwork subtypes with specific functional roles. It links the glaucoma-associated transcription factor LMX1B to mitochondrial regulation in TM3 cells and demonstrates that nicotinamide treatment prevents IOP elevation in Lmx1bV265D/+ mutant mice, highlighting a potential metabolic therapeutic strategy for glaucoma. This convincing work would be further supported by data that link the transcriptional data with mitochondrial functional assays.
 
 Summary
2. Public_Reviews 08 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This study provides a comprehensive single-cell and multiomic characterization of trabecular meshwork (TM) cells in the mouse eye, a structure critical to intraocular pressure (IOP) regulation and glaucoma pathogenesis. Using scRNA-seq, snATAC-seq, immunofluorescence, and in situ hybridization, the authors identify three transcriptionally and spatially distinct TM cell subtypes. The study further demonstrates that mitochondrial dysfunction, specifically in one subtype (TM3), contributes to elevated IOP in a genetic mouse model of glaucoma carrying a mutation in the transcription factor Lmx1b. Importantly, treatment with nicotinamide (vitamin B3), known to support mitochondrial health, prevents IOP elevation in this model. The authors also link their findings to human datasets, suggesting the existence of analogous TM3-like cells with potential relevance to human glaucoma.
 
 Strengths:
 
 The study is methodologically rigorous, integrating single-cell transcriptomic and chromatin accessibility profiling with spatial validation and in vivo functional testing. The identification of TM subtypes is consistent across mouse strains and institutions, providing robust evidence of conserved TM cell heterogeneity. The use of a glaucoma model to show subtype-specific vulnerability, combined with a therapeutic intervention, gives the study strong mechanistic and translational significance. The inclusion of chromatin accessibility data adds further depth by implicating active transcription factors such as LMX1B, a gene known to be associated with glaucoma risk. The integration with human single-cell datasets enhances the potential relevance of the findings to human disease.
 
 Weaknesses:
 
 Although the LMX1B transcription factor is implicated as a key regulator in TM3 cells, its role in directly controlling mitochondrial gene expression is not fully explored. Additional analysis of motif accessibility or binding enrichment near relevant target genes could substantiate this mechanistic link. The therapeutic effect of vitamin B3 is clearly demonstrated phenotypically, but the underlying cellular and molecular mechanisms remain somewhat underdeveloped - for instance, changes in mitochondrial function, oxidative stress markers, or NAD+ levels are not directly measured. While the human relevance of TM3 cells is suggested through marker overlap, more quantitative approaches, such as cell identity mapping or gene signature scoring in human datasets, would strengthen the translational connection.
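 
 As an illustration of what gene signature scoring could look like, the sketch below scores a cell by the mean expression of a signature gene set minus the mean of a background set. This is a deliberately simplified stand-in for the module-scoring routines in standard single-cell toolkits; the function name, gene labels and expression values are hypothetical placeholders, not markers or data from the study.

```python
import statistics


def signature_score(cell_expression, signature_genes, background_genes):
    """Mean expression of the signature genes minus the mean expression of
    the background genes, for a single cell.

    cell_expression maps gene name -> normalised expression value.
    Genes absent from the cell are skipped.
    """
    signature = [cell_expression[g] for g in signature_genes
                 if g in cell_expression]
    background = [cell_expression[g] for g in background_genes
                  if g in cell_expression]
    if not signature or not background:
        raise ValueError("no overlap between gene sets and measured genes")
    return statistics.fmean(signature) - statistics.fmean(background)


if __name__ == "__main__":
    # Placeholder gene labels and values, for illustration only.
    cell = {"GENE_A": 5.0, "GENE_B": 3.0, "GENE_C": 1.0, "GENE_D": 1.0}
    print(signature_score(cell, ["GENE_A", "GENE_B"], ["GENE_C", "GENE_D"]))
```

 Applying such a score to every cell in a human dataset and comparing its distribution across annotated clusters would give a quantitative version of the marker-overlap argument.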
 
 Overall, this is a compelling and carefully executed study that offers significant advances in our understanding of TM cell biology and its role in glaucoma. The integration of multimodal data, disease modeling, and therapeutic testing represents a valuable contribution to the field. With additional mechanistic depth, the study has the potential to become a foundational resource for future research into IOP regulation and glaucoma treatment.
 
 Review 1
3. Public_Reviews 08 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public review):
 
 Summary:
 
 This elegant study by Tolman and colleagues provides fundamental findings that substantially advance our knowledge of the major cell types within the limbus of the mouse eye, focusing on the aqueous humor outflow pathway. The authors used single-cell and single-nuclei RNAseq to very clearly identify 3 subtypes of the trabecular meshwork (TM) cells in the mouse eye, with each subtype having unique markers and proposed functions. The U. Columbia results are strengthened by an independent replication in a different mouse strain at a separate laboratory (Duke). Bioinformatics analyses of these expression data were used to identify cellular compartments, molecular functions, and biological processes. Although there were some common pathways among the 3 subtypes of TM cells (e.g., ECM metabolism), there also were distinct functions. For example:
 
 • TM1 cell expression supports heavy engagement in ECM metabolism and structure, as well as TGFβ2 signaling.
 
 • TM2 cells were enriched in laminin and pathways involved in phagocytosis, lysosomal function, and antigen expression, as well as End3/VEGF/angiopoietin signaling.
 
 • TM3 cells were enriched in actin binding and mitochondrial metabolism.
 
 They used high-resolution immunostaining and in situ hybridization to show that these 3 TM subtypes express distinct markers and occupy distinct locations within the TM tissue. The authors compared their expression data with other published scRNAseq studies of the mouse as well as the human aqueous outflow pathway. They used ATAC-seq to map open chromatin regions in order to predict transcription factor binding sites. Their results were also evaluated in the context of human IOP and glaucoma risk alleles from published GWAS data, with interesting and meaningful correlations. Although not discussed in their manuscript, their expression data support other signaling pathways/ proteins/ genes that have been implicated in glaucoma, including: TGFβ2, BMP signaling (including involvement of ID proteins), MYOC, actin cytoskeleton (CLANs), WNT signaling, etc.
 
 In addition to these very impressive data, the authors used scRNAseq to examine changes in TM cell gene expression in the mouse glaucoma model of mutant Lmx1b-induced ocular hypertension. In man, LMX1B is associated with Nail-Patella syndrome, which can include the development of glaucoma, demonstrating the clinical relevance of this mouse model. Among the gene expression changes detected, TM3 cells had altered expression of genes associated with mitochondrial metabolism. The authors used their previous experience using nicotinamide to metabolically protect DBA2/J mice from glaucomatous damage, and they hypothesized that nicotinamide supplementation of mutant Lmx1b mice would help restore normal mitochondrial metabolism in the TM and prevent Lmx1b-mediated ocular hypertension. Adding nicotinamide to the drinking water significantly prevented Lmx1b mutant mice from developing high intraocular pressure. This is a laudable example of dissecting the molecular pathogenic mechanisms responsible for a disease (glaucoma) and then discovering and testing a potential therapy that directly intervenes in the disease process and thereby protects from the disease.
 
 Strengths:
 
 There are numerous strengths in this comprehensive study, including:
 
 • Deep scRNA sequencing that was confirmed by an independent dataset in another mouse strain at another university.
 
 • Identification and validation of molecular markers for each mouse TM cell subset along with localization of these subsets within the mouse aqueous outflow pathway.
 
 • Rigorous bioinformatics analysis of these data as well as comparison of the current data with previously published mouse and human scRNAseq data.
 
 • Correlating their current data with GWAS glaucoma and IOP "hits".
 
 • Discovering gene expression changes in the 3 TM subgroups in the mouse mutant Lmx1b model of glaucoma.
 
 • Further pursuing the indication of dysfunctional mitochondrial metabolism in TM3 cells from Lmx1b mutant mice to test the efficacy of dietary supplementation with nicotinamide. The authors nicely demonstrate the disease modifying efficacy of nicotinamide in preventing IOP elevation in these Lmx1b mutant mice, preventing the development of glaucoma. These results have clinical implications for new glaucoma therapies.
 
 Weaknesses:
 
 • Occasional over-interpretation of data. The authors have used changes in gene expression (RNAseq) to implicate functions and signaling pathways. For example, they have not directly measured "changes in metabolism", "mitochondrial dysfunction" or "activity of Lmx1b".
 
 • In their very thorough data set, there is enrichment of or changes in gene expression that support other pathways that have been previously reported to be associated with glaucoma (such as TGFβ2, BMP signaling, actin cytoskeletal organization (CLANs), WNT signaling, ossification, etc.). Not discussing these appears to be a lost opportunity to further enhance the significance of this work.
 
 Review 2
4. Public_Reviews 08 Oct 2025
 
 in eLife
 
 Reviewer #3 (Public review):
 
 Summary: In this study, the authors perform multimodal single-cell transcriptomic and epigenomic profiling of 9,394 mouse TM cells, identifying three transcriptionally distinct TM subtypes with validated molecular signatures. TM1 cells are enriched for extracellular matrix genes, TM2 for secreted ligands supporting Schlemm's canal, and TM3 for contractile and mitochondrial/metabolic functions. The transcription factor LMX1B, previously linked to glaucoma, shows the highest expression in TM3 cells and appears to regulate mitochondrial pathways. In Lmx1bV265D mutant mice, TM3 cells exhibit transcriptional signs of mitochondrial dysfunction associated with elevated IOP. Notably, vitamin B3 treatment significantly mitigates IOP elevation, suggesting a potential therapeutic avenue.
 
 This is an excellent collaborative study involving investigators from two institutions, offering the most detailed single-cell transcriptomic and epigenetic profiling of the mouse limbal tissues, including both TM and Schlemm's canal (SC), from wild-type and Lmx1bV265D mutant mice. The study defines three TM subtypes and characterizes their distinct molecular signatures, associated pathways, and transcriptional regulators. The authors also compare their dataset with previously published murine and human studies, including those by Van Zyl et al., providing valuable cross-species insights.
 
 Strengths:
 
 (1) Comprehensive dataset with high single-cell resolution (2) Use of multiple bioinformatic and cross-comparative approaches (3) Integration of 3D imaging of TM and SC for anatomical context (4) Convincing identification and validation of three TM subtypes using molecular markers.
 
 Weaknesses:
 
 (1) Insufficient evidence linking mitochondrial dysfunction to TM3 cells in Lmx1bV265D mice: While the identification of TM3 cells as metabolically specialized and Lmx1b-enriched is compelling, the proposed link between Lmx1b mutation and mitochondrial dysfunction remains underdeveloped. It is unclear whether mitochondrial defects are a primary consequence of Lmx1b-mediated transcriptional dysregulation or a secondary response to elevated IOP. Additional evidence is needed to clarify whether Lmx1b directly regulates mitochondrial genes (e.g., via ChIP-seq, motif analysis, or ATAC-seq), or whether mitochondrial changes are downstream effects. Furthermore, the protective effects of nicotinamide (NAM) are interpreted as evidence of mitochondrial involvement, but no direct mitochondrial measurements (e.g., immunostaining, electron microscopy, OCR assays) are provided. It is essential to validate mitochondrial dysfunction in TM3 cells using in vivo functional assays to support the central conclusion of the paper. Without this, the claim that mitochondrial dysfunction drives IOP elevation in Lmx1bV265D mice remains speculative. Alternatively, authors should consider revising their claims that mitochondrial dysfunction in these mice is a central driver of TM dysfunction.
 
 (2) Mechanism of NAM-mediated protection is unclear: The manuscript states that NAM treatment prevents IOP elevation in Lmx1bV265D mice via metabolic support, yet no data are shown to confirm that NAM specifically rescues mitochondrial function. Do NAM-treated TM3 cells show improved mitochondrial integrity? Are reactive oxygen species (ROS) reduced? Does NAM also protect RGCs from glaucomatous damage? Addressing these points would clarify whether the therapeutic effects of NAM are indeed mitochondrial.
 
 (3) Lack of direct evidence that LMX1B regulates mitochondrial genes: While transcriptomic and motif accessibility analyses suggest that LMX1B is enriched in TM3 cells and may influence mitochondrial function, no mechanistic data are provided to demonstrate direct regulation of mitochondrial genes. Including ChIP-seq data, motif enrichment at mitochondrial gene loci, or perturbation studies (e.g., Lmx1b knockout or overexpression in TM3 cells) would greatly strengthen this central claim.
 
 (4) Focus on LMX1B in Fig. 5F lacks broader context: Figure 5F shows that several transcription factors (TFs), including Tcf21, Foxs1, Arid3b, Myc, Gli2, Patz1, Plag1, Npas2, Nr1h4, and Nfatc2, exhibit stronger positive correlations or motif accessibility changes than LMX1B. Yet the manuscript focuses almost exclusively on LMX1B. The rationale for this focus should be clarified, especially given LMX1B's relatively lower ranking in the correlation analysis. Were the functions of these other highly ranked TFs examined or considered in the context of TM biology or glaucoma? Discussing their potential roles would enhance the interpretation of the transcriptional regulatory landscape and demonstrate the broader relevance of the findings.
 
 Other weaknesses:
 
 (1) The Abstract reports 9,394 wild-type TM cell transcriptomes, but the number of Lmx1bV265D/+ TM cell transcriptomes analyzed is not provided. This information is essential for evaluating the comparative analysis and should be clearly stated in the Abstract and again in the main text (e.g., lines 121-123). Including both wild-type and mutant cell counts will help readers assess the balance and robustness of the dataset.
 
 (2) Did the authors monitor mouse weight or other health parameters to assess potential systemic effects of treatment? It is known that the taste of compounds in drinking water can alter fluid or food intake, which may influence general health. Also, do Lmx1bV265D/+ mice exhibit non-ocular phenotypes, and if so, does nicotinamide confer protection in those tissues as well? Additionally, given that nicotinamide dosing started at postnatal day 2, the authors should state how long the mice were treated with water containing nicotinamide, after how many days or weeks IOP was reduced, and how long the decrease in IOP was sustained.
 
 (3) While the IOP reduction observed in NAM-treated Lmx1bV265D/+ mice appears statistically significant, it is unclear whether this reflects meaningful biological protection. Several untreated mice exhibit very high IOP values, which may skew the analysis. The authors should report the mean values for IOP in both untreated and NAM-treated groups to clarify the magnitude and variability of the response.
 
 (4) Additionally, since NAM has been shown to directly protect RGCs in other glaucoma models, the authors should assess whether RGCs are preserved in NAM-treated Lmx1bV265D/+ mice. Demonstrating RGC protection would support a synergistic effect of NAM through both IOP reduction and direct neuroprotection, strengthening the translational relevance of the treatment.
 
 (5) Can the authors add any other functional validation studies to explore the pathways enriched in each of the TM1, TM2, and TM3 subtypes, in addition to the IHC/IF/RNAscope validation?
 
 (6) The authors should include a representative image of the limbal dissection. While Figure S1 provides a schematic, mouse eyes are very small, and dissecting unfixed limbal tissue is technically challenging. It is also difficult to reconcile the claim that the majority of cells in the limbal region are TM and endothelium with Figure S6, where DAPI staining suggests a much higher abundance of scleral cells compared to TM cells within the limbal strip. Additional clarification or visual evidence would help validate the dissection strategy and cellular composition of the captured region.
 
 Review 3

Tags

Summary

Review 1

Review 3

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.11.01.621152v1
www.biorxiv.org www.biorxiv.org

Heterogeneity of Genetic Sequence within Quasi-species of Influenza Virus Revealed by Single-Molecule Sequencing

3
1. Public_Reviews 08 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This is a valuable methodological contribution towards accurate characterization of viral genetic diversity using long-read sequencing and unique molecular identifiers (UMIs). However, the methods are currently incomplete and the sensitivity is not rigorously demonstrated. Addressing these gaps would strengthen the manuscript and make it a key addition to the field.
  
  Summary
2. Public_Reviews 08 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Tamao et al. aimed to quantify the diversity and mutation rate of influenza virus (PR8 strain) in order to establish a high-resolution method for studying intra-host viral evolution. To achieve this, the authors combined RNA sequencing with single-molecule unique molecular identifiers (UMIs) to minimize errors introduced during technical processing. They proposed an in vitro infection model with a single viral particle to represent biological genetic diversity, alongside a control model using in vitro transcribed RNA for two viral genes, PB2 and HA.
  
  Through this approach, the authors demonstrated that UMIs reduced technical errors by approximately tenfold. By analyzing four viral populations and comparing them to in vitro transcribed RNA controls, they estimated that ~98.1% of observed mutations originated from viral replication rather than technical artifacts. Their results further showed that most mutations were synonymous and introduced randomly. However, the distribution of mutations suggested selective pressures that favored certain variants. Additionally, comparison with a closely related influenza strain (A/Alaska/1935) revealed two positively selected mutations, though these were absent in the strain responsible for the most recent pandemic (CA01).
  
  Overall, the study is well-designed, and the interpretations are strongly supported by the data. However, the following clarifications are recommended:
  
  (1) The methods section is overly brief. Even if techniques are cited, more experimental details should be included. For example, since the study focuses heavily on methodology, details such as the number of PCR cycles in RT-PCR or the rationale for choosing HA and PB2 as representative in vitro transcripts should be provided.
  
  (2) Information on library preparation and sequencing metrics should be included, for example the total number of reads, any filtering steps, and the quality score distributions and cutoffs for the analyzed reads.
  
  (3) In the Results section (line 115, "Quantification of error rate caused by RT"), the mutation rate attributed to viral replication is calculated. However, in line 138, it is unclear whether the reported value reflects PB2, HA, or both, and whether the comparison is based on the error rate of the same viral RNA or the mean of multiple values (as shown in Figure 3A). Please clarify whether this number applies universally to all influenza RNAs or provide the observed range.
  
  (4) Since errors introduced by T7 polymerase affect only the in vitro transcription control, how were these accounted for when comparing mutation rates between transcribed RNA and cell-culture-derived virus?
  
  (5) Figure 2 shows that a UMI group size of 4 has an error rate of zero, but this group size is not mentioned in the text. Please clarify.
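
  Points (3) and (4) both concern how the share of mutations attributed to viral replication is derived. A minimal sketch of one plausible calculation follows, assuming the share is obtained by subtracting the error rate of the in vitro transcribed control from the rate observed in the virus sample. This is an assumption for illustration, not the authors' documented method. The function name and both rates are hypothetical placeholders, not values from the manuscript.

```python
def replication_fraction(sample_rate: float, control_rate: float) -> float:
    """Share of the observed mutation rate not explained by the technical control.

    sample_rate  -- per-base mutation rate measured in the virus sample
    control_rate -- per-base error rate measured in the in vitro transcribed control
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    # Clamp at zero: a control noisier than the sample explains everything.
    return max(0.0, (sample_rate - control_rate) / sample_rate)


# Placeholder rates chosen for illustration only (NOT taken from the manuscript).
viral_rate = 1.0e-4
ivt_control_rate = 1.9e-6
print(f"{replication_fraction(viral_rate, ivt_control_rate):.3f}")
```

  Under this assumed scheme the control rate also contains T7 polymerase errors, which are absent from the virus sample. The subtraction would therefore slightly underestimate the replication-derived share unless T7 errors are removed from the control first. A per-segment report (PB2 and HA separately) would only require calling the function once per segment.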
  
  Review 1
3. Public_Reviews 08 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  This manuscript presents a technically oriented application of UMI-based long-read sequencing to study intra-host diversity in influenza virus populations. The authors aim to minimize sequencing artifacts and improve the detection of rare variants, proposing that this approach may inform predictive models of viral evolution. While the methodology appears robust and successfully reduces sequencing error rates, key experimental and analytical details are missing, and the biological insight is modest. The study includes only four samples, with no independent biological replicates or controls, which limits the generalizability of the findings. Claims related to rare variant detection and evolutionary selection are not fully supported by the data presented.
  
  Strengths:
  
  The study addresses an important technical challenge in viral genomics by implementing a UMI-based long-read sequencing approach to reduce amplification and sequencing errors. The methodological focus is well presented, and the work contributes to improving the resolution of low-frequency variant detection in complex viral populations.
  
  Weaknesses:
  
  The application of UMI-based error correction to viral population sequencing has been established in previous studies (e.g., in HIV), and this manuscript does not introduce a substantial methodological or conceptual advance beyond its use in the context of influenza.
  
  The study lacks independent biological replicates or additional viral systems that would strengthen the generalizability of the conclusions. Potential sources of technical error are not explored or explicitly controlled. Key methodological details are missing, including the number of PCR cycles, the input number of molecules, and UMI family size distributions. These are essential to support the claimed sensitivity of the method.
  
  The assertion that variants at ≥0.1% frequency can be reliably detected is based on total read count rather than the number of unique input molecules. Without information on UMI diversity and family sizes, the detection limit cannot be reliably assessed.
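
  To illustrate why the number of unique input molecules matters here, the following is a minimal sketch under a simple binomial sampling assumption. It is not the authors' analysis. The function name, the two-molecule support threshold, and the molecule counts are hypothetical choices for illustration.

```python
from math import comb


def detection_probability(n_molecules: int, variant_freq: float, min_support: int = 2) -> float:
    """Probability that at least `min_support` of `n_molecules` uniquely tagged
    input molecules carry a variant present at `variant_freq` (binomial model)."""
    # Probability of seeing fewer than `min_support` variant-bearing molecules.
    p_below = sum(
        comb(n_molecules, k) * variant_freq**k * (1 - variant_freq) ** (n_molecules - k)
        for k in range(min_support)
    )
    return 1.0 - p_below


# Hypothetical numbers of unique input molecules (UMI families), variant at 0.1%.
for n in (500, 5_000, 50_000):
    print(n, round(detection_probability(n, 0.001), 3))
```

  Sequencing the same few hundred tagged molecules more deeply raises the read count but leaves `n_molecules` unchanged, so under this model the detection probability does not improve. This is why UMI diversity and family size distributions are needed to support a stated detection limit.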
  
  Although genetic variation is described, the functional relevance of observed mutations in HA and NA is not addressed or discussed in the context of known antigenic or evolutionary features of influenza. The manuscript is largely focused on technical performance, with limited exploration of the biological implications or mechanistic insights into influenza virus evolution.
  
  The experimental scale is small, with only four viral populations derived from single particles analyzed. This limited sample size restricts the ability to draw broader conclusions about quasispecies dynamics or evolutionary pressures.
  
  Review 2

Tags

Summary

Review 1

Review 2

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.06.19.658006v2
www.biorxiv.org www.biorxiv.org

Elevated Ubiquitin Phosphorylation by PINK1 Contributes to Proteasomal Impairment and Promotes Neurodegeneration

2
1. Public_Reviews 07 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This study provides important insights into the role of polyUbiquitination in neurodegenerative diseases, elucidating how pUb promotes neurodegeneration by affecting proteasomal function. The findings not only offer a new perspective on the pathophysiology of neurodegenerative diseases but also provide potential targets for developing new therapeutic strategies. The experiments in the revised submission provide solid evidence to support the conclusions.
  
  Summary
2. Public_Reviews 07 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The manuscript discusses the role of ubiquitin phosphorylated by PINK1 kinase (pUb) in neurodegenerative diseases. It reveals that elevated levels of pUb are observed in aged human brains and those affected by Parkinson's disease (PD), as well as in Alzheimer's disease (AD), aging, and ischemic injury. The study shows that increased pUb impairs proteasomal degradation, leading to protein aggregation and neurodegeneration. The authors also demonstrate that PINK1 knockout can mitigate protein aggregation in aging and ischemic mouse brains, as well as in cells treated with a proteasome inhibitor. While this study provides some interesting data, several important points should be addressed before further consideration.
  
  Strengths:
  
  (1) Reveals a novel pathological mechanism of neurodegeneration mediated by pUb, providing a new perspective on understanding neurodegenerative diseases.
  
  (2) The study covers not only a single disease model but also various neurodegenerative diseases such as Alzheimer's disease, aging, and ischemic injury, enhancing the breadth and applicability of the research findings.
  
  Comments on revisions:
  
  This study, through a systematic experimental design, reveals the crucial role of pUb in forming a positive feedback loop by inhibiting proteasome activity in neurodegenerative diseases. The data are comprehensive and highly innovative. However, some of the results are not entirely convincing, particularly the staining results in Figure 1.
  
  In Figure 1A, the density of DAPI staining differs significantly between the control patient and the AD patient, making it difficult to conclusively demonstrate a clear increase in PINK1 in AD patients. Quantitative analysis is needed. In Fig 1C, the PINK1 staining in the mouse brain appears to resemble non-specific staining.
  
  Review 1

Tags

Summary

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2024.10.18.619025v2
www.biorxiv.org www.biorxiv.org

Life-cycle-related gene expression patterns in the brown algae

2
1. Public_Reviews 07 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This manuscript presents an in-depth analysis of gene expression across multiple brown algal species with differing life histories, providing convincing evidence for the conservation of life cycle-specific gene expression. While largely descriptive, the study is an important step forward in understanding the core cellular processes that differ between life cycle phases, and its findings will be of broad interest to developmental and evolutionary biologists.
  
  Summary
2. Public_Reviews 07 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  The manuscript by Ratchinski et al presents a comprehensive analysis of developmental and life history gene expression patterns in brown algal species. The manuscript shows that the degree of generation bias or generation-specific gene expression correlates with the degree of dimorphism. It also reports conservation of life cycle features within generations and marked changes in gene expression patterns in Ectocarpus in the transition between gamete and early sporophyte. The manuscript also reports considerable conservation of gene expression modules between two representative species, particularly in genes associated with conserved functional characteristics.
  
  Strengths:
  
  The manuscript represents a considerable "tour de force" dataset and analytical effort. While the data presented is largely descriptive, it is likely to provide a very useful resource for studies of brown algal development and for comparative studies with other developmental and life cycle systems.
  
  Comments on revisions
  
  The authors have provided in their response (point 1) a good clarification of their rationale for excluding fucoid algae from the study, based on the diploid nature of the fucoid life cycle. Similarly, they have noted (point 2) that "the relationship between changes in gene expression during very early sporophyte development and during alternation of life cycle generations could be investigated further using a highly dimorphic kelp model system such as Saccharina latissima." For the benefit of readers who may not be familiar with the different life cycles in brown algae, I would recommend that these clarifications be included in the Discussion.
  
  Otherwise the authors have addressed my previous comments adequately.
  
  Review 1

Tags

Summary

Review 1

Annotators

Public_Reviews

URL

biorxiv.org/content/10.1101/2025.04.25.649966v3
osf.io osf.io

Nocebo effects are stronger and more persistent than placebo effects in healthy individuals

4
1. Public_Reviews 07 Oct 2025
 
 in eLife (unscoped)
 
 eLife Assessment
 
 In this preregistered study, Kunkel and colleagues set out to compare the magnitude and duration of placebo versus nocebo effects in healthy volunteers, and also to examine the different factors contributing to these effects. The authors follow a rigorous methodology in a within-subjects design, taking into consideration standard conventions for manipulation of expectations, and using an appropriate sham condition. They present compelling evidence of long-lasting placebo and nocebo effects, with nocebo responses demonstrating consistently greater strength. These valuable results have the potential for a great impact in the field of experimental and clinical pain.
 
 Summary
2. Public_Reviews 07 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #1 (Public review):
 
 Summary:
 
 The study aimed to: (1) assess the magnitude of placebo and nocebo effects immediately after induction through verbal instructions and conditioning, (2) examine the persistence of these effects one week later, and (3) identify predictors of sustained placebo and nocebo responses over time.
 
 Strengths:
 
 An innovation was to use sham TENS stimulation as the expectation manipulation. This expectation manipulation was reinforced not only by the change in pain stimulus intensity, but also by delivery of non-painful electrical stimulation, labelled as TENS stimulation.
 
 Questionnaire-based treatment expectation ratings were collected before conditioning and after conditioning, and after the test session, which provided an explicit measure of participants' expectations about the manipulation.
 
 The finding that placebo and nocebo effects are influenced by recent experience provides a novel insight into a potential moderator of individual placebo effects.
 
 Weaknesses:
 
 There are a limited number of trials per test condition (10) which means that the trajectory of responses to the manipulation may not be explored, which would be an interesting future study.
 
 The differences between the nocebo and control condition in pain ratings during conditioning could be explained by differing physiological effects of the different stimulus intensities, so it is difficult to make any claims about the expectation effects here. A randomisation error meant that 25 participants received an unbalanced number of trials per condition (i.e., 10 x VAS 40, 14 x VAS 60, 12 x VAS 80), although the authors accounted for this during analysis so it is not of major concern.
 
 This manuscript presents a study on expectation manipulation to induce placebo and nocebo effects in healthy participants. The study follows standard placebo experiment conventions with use of TENS stimulation as the placebo manipulation. The authors were able to achieve their aims. A key finding is that placebo and nocebo effects were predicted by recent experience, which is a novel contribution to the literature. The findings provide insights into the differences between placebo and nocebo effects and the potential moderators of these effects.
 
 Comments on revisions:
 
 I am satisfied with the authors' revisions to the manuscript and have no further comments.
 
 Review 1
3. Public_Reviews 07 Oct 2025
 
 in eLife (unscoped)
 
 Reviewer #2 (Public review):
 
 Summary:
 
 Kunkel et al. aim to answer a fundamental question: Do placebo and nocebo effects differ in magnitude or longevity? To address this question, they used a powerful within-participants design, with a very large sample size (n=104), in which they compared placebo and nocebo effects (within the same individuals) across verbal expectations, conditioning, testing phase, and a 1-week follow-up. With elegant analyses, they establish that different mechanisms underlie the learning of placebo vs nocebo effects, with the latter being acquired faster and extinguished slower. This is an important finding for both the basic understanding of learning mechanisms in humans and for potential clinical applications to improve human health.
 
 Strengths:
 
 Beyond the above, the paper is well-written and very clear. It lays out nicely the need for the current investigation and what implications it holds. The design is elegant, and the analyses are rich, thoughtful, and interesting. The sample size is large, which is highly appreciated considering the longitudinal, in-lab study design. The question is super important and well-investigated, and the entire manuscript is very thoughtful with analyses closely examining the underlying mechanisms of placebo versus nocebo effects.
 
 Comments on revisions:
 
 The authors have addressed all of my concerns and comments. One point for them to verify is that analyses that were not preregistered are indeed flagged as such. The provided pre-registration link doesn't specify much about the analysis plans and specific tests used.
 
 Review 2
4. Public_Reviews 07 Oct 2025
 
 in eLife (unscoped)
 
 Author response:
 
 The following is the authors’ response to the original reviews
 
 Public Reviews:
 
 Reviewer #1 (Public review):
 
 Summary:
 
 This manuscript presents a study on expectation manipulation to induce placebo and nocebo effects in healthy participants. The study follows standard placebo experiment conventions with the use of TENS stimulation as the placebo manipulation. The authors were able to achieve their aims. A key finding is that placebo and nocebo effects were predicted by recent experience, which is a novel contribution to the literature. The findings provide insights into the differences between placebo and nocebo effects and the potential moderators of these effects.
 
 Specifically, the study aimed to:
 
 (1) assess the magnitude of placebo and nocebo effects immediately after induction through verbal instructions and conditioning
 
 (2) examine the persistence of these effects one week later, and
 
 (3) identify predictors of sustained placebo and nocebo responses over time.
 
 Strengths:
 
 An innovation was to use sham TENS stimulation as the expectation manipulation. This expectation manipulation was reinforced not only by the change in pain stimulus intensity, but also by delivery of non-painful electrical stimulation, labelled as TENS stimulation.
 
 Questionnaire-based treatment expectation ratings were collected before conditioning and after conditioning, and after the test session, which provided an explicit measure of participants' expectations about the manipulation.
 
 The finding that placebo and nocebo effects are influenced by recent experience provides a novel insight into a potential moderator of individual placebo effects.
 
 We thank the reviewer for their thorough evaluation of our manuscript and for highlighting the novelty and originality of our study.
 
 Weaknesses:
 
 There are a limited number of trials per test condition (10), which means that the trajectory of responses to the manipulation may not be adequately explored.
 
 We appreciate the reviewer’s comment regarding the number of trials in the test phase. The trial number was chosen to ensure comparability with previous studies addressing similar research questions with similar designs (e.g. Colloca et al., 2010). Our primary objective was to directly compare placebo and nocebo effects within a within-subject design and to examine their persistence one week after the first test session. While we did not specifically aim to investigate the trajectory of responses within a single testing session, we fully agree that a comprehensive analysis of the trajectories of expectation effects on pain would be a valuable extension of our work. We have now acknowledged this limitation and future direction in the revised manuscript.
 
 The paragraph reads as follows: “It is important to note that our study was designed in alignment with previous studies addressing similar questions (e.g., Colloca et al., 2010). Our primary aim was to directly compare placebo and nocebo effects in a within-subject design and assess the persistence of these effects one week following the first test session. One limitation of our approach is the relatively short duration of each session, which may have limited our ability to examine the trajectory of responses within a single session. Future studies could address this limitation by increasing the number of trials for a more comprehensive analysis.”
 
 On day 8, one stimulus per stimulation intensity (i.e., VAS 40, 60, and 80) was applied before the start of the test session to re-familiarise participants with the thermal stimulation. There is a potential risk of revealing the manipulation to participants during the re-familiarization process, as they were not previously briefed to expect the painful stimulus intensity to vary without the application of sham TENS stimulation.
 
 We thank the reviewer for the opportunity to clarify this point. Participants were informed at the beginning of the experiment that we would use different stimulation intensities to re-familiarize them with the stimuli before the second test session. We are therefore confident that participants perceived this step as part of a recalibration rather than associating it with the experimental manipulation. We have added this information to the revised version of the manuscript.
 
 The paragraph now reads as follows: “On day 8, one stimulus per stimulation intensity (i.e., VAS 40, 60 and 80) was applied before the start of the test session to re-familiarise participants with the thermal stimulation. Note that participants were informed that these pre-test stimuli were part of the recalibration and refamiliarization procedure conducted prior to the second test session.”
 
 The differences between the nocebo and control conditions in pain ratings during conditioning could be explained by the differing physiological effects of the different stimulus intensities, so it is difficult to make any claims about expectation effects here.
 
 We appreciate the reviewer’s comment and agree that, despite the careful calibration of the three pain stimuli, we cannot entirely rule out the possibility that temporal dynamics during the conditioning session were influenced by differential physiological effects of the varying stimulus intensities (e.g., intensity-dependent habituation or sensitization). We have addressed this in the revision of the manuscript, but we would like to emphasize that the stronger nocebo effects during the test phase are statistically controlled for any differences in the conditioning session.
 
 The paragraph now reads: “This asymmetry is noteworthy in and of itself because it occurred despite the equidistant stimulus calibration relative to the control condition prior to conditioning. It may be the result of different physiological effects of the stimuli over time or amplified learning in the nocebo condition, consistent with its heightened biological relevance, but it could also be a stronger effect of the verbal instructions in this condition.”
 
 A randomisation error meant that 25 participants received an unbalanced number of trials per condition (i.e., 10 x VAS 40, 14 x VAS 60, 12 x VAS 80).
 
 We agree that this is indeed unfortunate. However, we would like to point out that all analyses reported in the manuscript have been controlled for the VAS ratings in the conditioning session, i.e., potential effects of the conditioned placebo and nocebo stimuli. Moreover, we have now conducted additional analyses, presented here in our response to the reviewers, to demonstrate that this imbalance did not systematically bias the results. Importantly, the key findings observed during the test phase remain robust despite this issue.
 
 Specifically, when excluding these 25 participants from the analyses, the reported stronger nocebo compared to placebo effects in the test session on day 1 remain unchanged. Likewise, the comparison of placebo and nocebo effects between days 1 and 8 shows the same pattern when excluding the participants in question. The only exception is the interaction between effect (placebo vs nocebo) x session (day 1 vs day 8), which changed from a borderline significant result (p = .049) to non-significant (p = .24). However, post hoc tests continued to show the same pattern as originally reported: a significant reduction in the nocebo effect from day 1 to day 8 and no significant change in the placebo effect.
 
 Reviewer #2 (Public review):
 
 Summary:
 
 Kunkel et al aim to answer a fundamental question: Do placebo and nocebo effects differ in magnitude or longevity? To address this question, they used a powerful within-participants design, with a very large sample size (n=104), in which they compared placebo and nocebo effects - within the same individuals - across verbal expectations, conditioning, testing phase, and a 1-week follow-up. With elegant analyses, they establish that different mechanisms underlie the learning of placebo vs nocebo effects, with the latter being acquired faster and extinguished slower. This is an important finding for both the basic understanding of learning mechanisms in humans and for potential clinical applications to improve human health.
 
 Strengths:
 
 Beyond the above - the paper is well-written and very clear. It lays out nicely the need for the current investigation and what implications it holds. The design is elegant, and the analyses are rich, thoughtful, and interesting. The sample size is large which is highly appreciated, considering the longitudinal, in-lab study design. The question is super important and well-investigated, and the entire manuscript is very thoughtful with analyses closely examining the underlying mechanisms of placebo versus nocebo effects.
 
 We thank the reviewer for their positive evaluation of our manuscript and for acknowledging the methodological rigor and the significant implications for clinical applications and the broader research field.
 
 Weaknesses:
 
 There were two highly addressable weaknesses in my opinion:
 
 (1) I could not find the preregistration - this is crucial to verify what analyses the authors have committed to prior to writing the manuscript. Please provide a link leading directly to the preregistration - searching for the specified number in the suggested website yielded no results.
 
 We thank the reviewer for pointing this out. We included a link to the preregistration in the revised manuscript. This study was pre-registered with the German Clinical Trial Register (registration number: DRKS00029228; https://drks.de/search/de/trial/DRKS00029228).
 
 (2) There is a recurring issue which is easy to address: because the Methods are located after the Results, many of the constructs used, analyses conducted, and even the main placebo and nocebo inductions are unclear, making it hard to appreciate the results in full. I recommend finding a way to detail at the beginning of the results section how placebo and nocebo effects have been induced. While my background means I am familiar with these methods, other readers will lack that knowledge. Even a short paragraph or a figure (like Figure 4) could help clarify the results substantially. For example, a significant portion of the results is devoted to the conditioning part of the experiment, while it is unknown which part was involved (e.g., were temperatures lowered/increased in all trials or only in the beginning).
 
 We thank the reviewer for their helpful comment and agree that the Results section requires additional information that would typically be provided by the Methods section if it directly followed the Introduction. In response, we have moved the former Figure 4 from the Methods section to the beginning of the Results section as a new Figure 1, to improve clarity. Further, we have revised the Methods section to explicitly state that all trials during the conditioning phase were manipulated in the same way.
 
 Recommendations for the Authors:
 
 Reviewer #1 (Recommendations for the authors):
 
 (1) Given that the authors are claiming (correctly) that there is only limited work comparing placebo/nocebo effects, there are some papers missing from their citations:
 
 Nocebo responses are stronger than placebo responses after subliminal pain conditioning - Jensen, K., Kirsch, I., Odmalm, S., Kaptchuk, T. J. & Ingvar, M. Classical conditioning of analgesic and hyperalgesic pain responses without conscious awareness. Proc. Natl. Acad. Sci. USA 112, 7863-7 (2015)
 
 We thank the reviewer and have now included this relevant publication in the introduction of the revised manuscript.
 
 Hird, E.J., Charalambous, C., El-Deredy, W. et al. Boundary effects of expectation in human pain perception. Sci Rep 9, 9443 (2019). https://doi.org/10.1038/s41598-019-45811-x
 
 We thank the reviewer for suggesting this relevant publication. We have now included it in the discussion of the revised manuscript by adding the following paragraph:
 
 “Recent work using a predictive coding framework further suggests that nocebo effects may be less susceptible to prediction error than placebo effects (Hird et al., 2019), which could contribute to their greater persistence and strength in our study.”
 
 (2) The trial-by-trial pain ratings could have been usefully modelled with a computational model, such as a Bayesian model (this is especially pertinent given the reference to Bayesian processing in the discussion). A multilevel model could also be used to increase the power of the analysis. This is a tentative suggestion, as I appreciate it would require a significant investment of time and work - alternatively, the authors could acknowledge it in the Discussion as a useful future avenue for investigation, if this is preferred.
 
 We thank the reviewer for this thoughtful suggestion. While we agree that computational modelling approaches could provide valuable insights into individual learning, our study was not designed with this in mind and the relatively small number of trials per condition and the absence of trial-by-trial expectancy ratings limit the applicability of such models. We have therefore chosen not to pursue such analysis but highlight it in the discussion as a promising direction for future research.
 
 “Notably, the most recent experience was the most predictive in all three analyses; for instance, the placebo effect on day 8 was predicted by the placebo effect on day 1, not by the initial conditioning. This finding supports the Bayesian inference framework, where recent experiences are weighted more heavily in the process of model updating because they are more likely to reflect the current state of the environment, providing the most relevant and immediate information needed to guide future actions and predictions24. Interestingly, while a change in pain predicted subsequent nocebo effects, it seemed less influential than for placebo effects. This aligns with findings that longer conditioning enhanced placebo effects, while it did not affect nocebo responses10 and the conclusion that nocebo instruction may be sufficient to trigger nocebo responses. Using Bayesian modeling, future studies could identify individual differences in the development of placebo and nocebo effects by integrating prior experiences and sensory inputs, providing a probabilistic framework for understanding the underlying mechanisms.”
 
 (3) The paper is missing any justification of sample size, i.e. power analysis - please include this.
 
 We apologize for the missing information on our a priori power analysis. As there is a lack of prior studies investigating within-subjects comparisons of placebo and nocebo effects that could inform precise effect size estimates for our research question, we based our calculation on the ability to detect small effects. Specifically, the study was powered to detect effect sizes in the range of d = 0.2 - 0.25 with α = .05 and power = .9, yielding a required sample size of N = 83-129. We have now added this information to the methods section of the revised manuscript.
 
 (4) "On day 8, one stimulus per stimulation intensity (i.e., VAS 40, 60 and 80) was applied before the start of the test session to re-familiarise participants with the thermal stimulation."
 
 What were the instructions about this? Was it before the electrode was applied? This runs the risk of unblinding participants, as they only expect to feel changes in stimulus intensity due to the TENS stimulation.
 
 We thank the reviewer for pointing out the potential risk of unblinding participants due to the re-familiarization process prior to the second test session. We would like to clarify that we followed specific procedures to prevent participants from associating this process with the experimental manipulation. The re-familiarisation with the thermal stimuli was conducted after the electrode had been applied and re-tested to ensure that both stimulus modalities were re-introduced in a consistent and neutral context. Participants were explicitly informed that both procedures were standard checks prior to the actual test session (“We will check both once again before we begin the actual measurement.”). For the thermal stimuli, we informed participants that they would experience three different intensities to allow the skin to acclimate (e.g., “...we will test the heat stimuli in 3 trials with different temperatures, allowing your skin to acclimate to the stimuli. …”), without implying any connection to the experimental conditions.
 
 Importantly, this re-familiarization procedure mirrored what participants had already experienced during the initial calibration session on day 1. We therefore assume that participants interpreted it as a routine technical step rather than part of the experimental manipulation. We have now clarified this procedure in the methods section of the revised manuscript.
 
 (5) "For a comparison of pain intensity ratings between time-points, an ANOVA with the within-subject factors Condition (placebo, nocebo, control) and Session (day 1, day 8) was carried out. For the comparison of placebo and nocebo effects between the two test days, an ANOVA with the within-subject factors Effect (placebo effect, nocebo effect) and Session (day 1, day 8) was used."
 
 It seems that one ANOVA is looking at raw pain scores and one is looking at difference scores, but this is a bit confusing - please rephrase/clarify this, and explain why it is useful to include both.
 
 We thank the reviewer for highlighting this point. Our primary analyses focus on placebo and nocebo effects, which we define as the difference in pain intensity ratings between the control and the placebo condition (placebo effect) and the nocebo and the control condition (nocebo effect), respectively.
 
 To examine whether condition effects were present at each time-point, we first conducted two separate repeated measures ANOVAs - one for day 1 and one for day 8 - with the within-subject factor CONDITION (placebo, nocebo, control).
 
 To compare the magnitude and persistence of placebo and nocebo effects over time, we then calculated the above-mentioned difference scores and submitted these to a second ANOVA with within-subject factors EFFECT (placebo vs. nocebo effect) and SESSION (day 1 vs. day 8). We have now clarified this approach on page 19 of the revised manuscript. To avoid confusion, the Condition x Session ANOVA has been removed from the manuscript.
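
 The difference scores described above can be sketched in a few lines. This is only an illustration of the sign convention (control minus placebo, nocebo minus control); the ratings and the names `ratings`, `placebo_effect` and `nocebo_effect` are hypothetical and not taken from the study data or analysis scripts.

```python
from statistics import mean

# Hypothetical per-participant mean VAS ratings (0-100) for one test session.
ratings = [
    {"placebo": 50.0, "control": 58.0, "nocebo": 70.0},
    {"placebo": 45.0, "control": 49.0, "nocebo": 63.0},
    {"placebo": 61.0, "control": 66.0, "nocebo": 74.0},
]


def placebo_effect(r):
    # Control minus placebo: positive values mean less pain under placebo.
    return r["control"] - r["placebo"]


def nocebo_effect(r):
    # Nocebo minus control: positive values mean more pain under nocebo.
    return r["nocebo"] - r["control"]


placebo_effects = [placebo_effect(r) for r in ratings]
nocebo_effects = [nocebo_effect(r) for r in ratings]

# Both effects are coded so that a larger positive number is a stronger effect,
# which is what allows them to enter one EFFECT x SESSION analysis.
print(mean(placebo_effects), mean(nocebo_effects))
```

 Coding both effects as positive differences puts them on a common scale, so a main effect of EFFECT can be read directly as one effect being larger than the other.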
 
 (6) Please can the authors provide a figure illustrating trial-by-trial ratings during test trials as well as during conditioning trials?
 
 In response to the reviewer’s point, we now provide the trial-by-trial ratings of the test phases on days 1 and 8 as an additional figure in the Supplement (Figure S1) and would like to clarify that trial-by-trial pain intensity ratings of the conditioning phase are displayed in Figure 2C of the manuscript.
 
 (7) "Separate multiple linear regression analyses were performed to examine the influence of expectations (GEEE ratings) and experienced effects (VAS ratings) on subsequent placebo and nocebo effects. For day 1, the placebo effect was entered as the dependent variable and the following variables as potential predictors: (i) expected improvement with placebo before conditioning, (ii) placebo effect during conditioning and (iii) the expected improvement with placebo before the test session at day 1"
 
 The term "placebo effect during conditioning" is a bit confusing - I believe this is just the effect of varying stimulus intensities - please could the authors be more explicit on the terminology they use to describe this? NB changes in pain rating during the conditioning trials do not count as a placebo/nocebo effect, as most of the change in rating will reflect differences in stimulation intensity.
 
 We agree with the reviewer that the cited paragraph refers to the actual application of lower or higher pain stimuli during the conditioning session, rather than genuinely induced placebo or nocebo effects. We thank the reviewer for this helpful observation and have revised the terminology accordingly, now referring to these as “pain relief during conditioning” and “pain worsening during conditioning”.
 
 (8) Supplementary materials: "The three temperature levels were perceived as significantly different (VAS ratings; placebo condition: M= 32.90, SD= 16.17; nocebo condition: M= 56.62, SD= 17.09; control condition: M= 80.84, SD= 12.18"
 
 This suggests that the VAS rating for the control condition was higher than for the nocebo condition. Please could the authors clarify/correct this?
 
 We thank the reviewer for spotting this error. The values for the control and the nocebo condition had accidentally been swapped. This has now been corrected in the manuscript: control condition: M= 56.62, SD= 17.09; nocebo condition: M= 80.84, SD= 12.18.
 
 (9) "To predict placebo responses a week later (VAScontrol - VASplacebo at day 8), the same independent variables were entered as for day 1 but with the following additional variables (i) the placebo effect at day 1 and (ii) the expected improvement with placebo before the test session at day 8."
 
 Here it would be much clearer to say "pain ratings during test trials at day 1".
 
 We agree with the reviewer and have revised the manuscript as suggested.
 
 (10) For completeness, please present the pain intensity ratings during conditioning as well as calibration/test trials in the figure.
 
 Please see our answer to comment (6).
 
 (11) In Figure 1a, it looks like some participants had rated the control condition as zero by day 8. If so, it's inappropriate to include these participants in the analysis if they are not responding to the stimulus. Were these the participants who were excluded due to pain insensitivity?
 
 On day 8, the lowest pain intensity ratings observed were VAS 3 in the placebo condition and VAS 2 in the control condition, both from the same participant. All other participants reported minimum values of VAS 11 or higher (all on a scale from 0-100). Thus, no participant provided a pain rating of VAS 0, and all ratings indicated some level of pain perception in response to the stimulus. We did not define an exclusion criterion based on day 8 pain ratings in our preregistration, and we did not observe any technical issues with the stimulation procedure. To avoid post-hoc exclusions and maintain consistency with our preregistered analysis plan, we therefore decided to include all participants in the analysis.
 
 (12) "Comparison of day 1 and day 8. A direct comparison of placebo and nocebo effects on day 1 and day 8 pain intensity ratings showed a main effect of Effect with a stronger nocebo effect (F(1,97)= 53.93, p< .001, η2= .36) but no main effect of Day (F(1,97)= 2.94, p= .089, η2 = .029). The significant Effect x Session interaction indicated that the placebo effect and the nocebo effect developed differently over time (F(1,97)= 3.98, p= .049, η2 = .039)"
 
 This is confusing as it talks about a main effect of "day" and then interaction with "session" - are they two different models? The authors need to clarify.
 
 We thank the reviewer for pointing this out. In our analysis, “Session” is the correct term for the experimental factor, which has two factor levels, “day 1” and “day 8”. This has now been corrected in the revised manuscript.
 
 Reviewer #2 (Recommendations for the authors):
 
 (1) More information on how "size of the effect" in Figures 1b and 2b was calculated is needed; this can be in the legend. If these are differences between control and each condition, then they were reversed for one condition (nocebo?), which is ok - but this should be clearly explained.
 
 We agree with the reviewer and have now revised the figure legends to improve clarity. The legends now read:
 
 1b: “Figure 1. Pain intensity ratings and placebo and nocebo effects during calibration and test sessions. (A) Mean pain intensity ratings in the placebo, nocebo and control condition during calibration, and during the test sessions at day 1 and day 8. (B) Placebo effect (control condition - placebo condition, i.e., positive value of difference) and nocebo effect (nocebo condition - control condition, i.e., positive value of difference) on day 1 and day 8. Error bars indicate the standard error of the mean, circles indicate mean ratings of individual participants. ***: p < .001, **: p < .01, n.s.: non-significant.”
 
 2b: “Figure 2. Mean and trial-by-trial pain intensity ratings, placebo and nocebo effects during conditioning. (A) Mean pain intensity ratings of the placebo, nocebo and control condition during conditioning. (B) Placebo effect (control condition - placebo condition, i.e., positive value of difference) and nocebo effect (nocebo condition - control condition, i.e., positive value of difference) during conditioning. (C) Trial-by-trial pain intensity ratings (with confidence intervals) during conditioning. Error bars indicate the standard error of the mean, circles indicate mean ratings of individual participants. ***: p < .001.”
 
 (2) In the methods, I was missing a clear understanding of how many trials there were in the conditioning phase, and then how many in the other testing phases. Also, how long did the experiment last in total?
 
 We apologize that the exact number of trials in the testing phases was not clear in the original manuscript. We now indicate on page 18 of the revised manuscript that we used 10 trials per condition in the test sessions. We have also added information on the duration of each test day (i.e., three hours on day 1 and one hour on day 8) on page 15.
 
 (3) In expectancy ratings, line 186 - are improvement and worsening expectations different from expected pain relief? It is implied that these are two different constructs - it would be helpful to clarify that.
 
 We agree that this is indeed confusing and would like to clarify that both refer to the same construct. We used the Generic rating scale for previous treatment experiences, treatment expectations, and treatment effects (GEEE questionnaire, Rief et al. 2021) that discriminates between expected symptom improvement, expected symptom worsening, and expected side effects due to a treatment. We now use the terms “expected pain relief” and “expected pain worsening” throughout the whole manuscript.
 
 (4) In the last section of the Results, somatosensory amplification comes out of nowhere - and could be better introduced (see point 2 above).
 
 We agree with the reviewer that introducing the concept of somatosensory amplification and its potential link to placebo/nocebo effects only in the Methods is unhelpful, given that this section appears at the end of the manuscript. We therefore now introduce the relevant publication (Doering et al., 2015) before reporting our findings on this concept.
 
 (5) In line 169, if the authors want to specify what portion of the variance was explained by expectancy, they could conduct a hierarchical regression, where they first look at R2 without the expectancy entered, and only then enter it to obtain the R2 change.
 
 We fully agree that hierarchical regression can be a useful approach for isolating the contribution of variables. However, in our case, expectancy was assessed at different time points (e.g., before conditioning and before the test session on day 1), and there was no principled rationale for determining the order in which these different expectancy-related variables should be entered into a hierarchical model.
 
 That said, in response to the reviewer’s suggestion, we have now conducted hierarchical regression analyses in which all expectancy-related variables were entered together as a single block. These analyses largely confirmed the findings reported so far and are provided below in this response to the reviewers. Given the exploratory nature of this grouping and the lack of an a priori hierarchy, we feel that the standard multiple regression models remain the most appropriate for addressing our research question because they allow us to evaluate the total contribution of expectancy-related predictors while also examining the individual contribution of each variable within the block. We would therefore prefer to retain these as the primary analyses in the manuscript.
 
 Results of the hierarchical regression analyses:
 
 Day 1 - Placebo response: In step 1, we entered the difference in pain intensity ratings between the control and the placebo condition during conditioning as a predictor. In step 2, we added the two variables reflecting expectations (i.e., expected improvement with placebo (i) before conditioning and (ii) before the test session on day 1). This allowed us to assess whether expectation-related variables explained additional variance beyond the effect of conditioning.
 
 The overall regression model at step 1 was significant, F(1, 102) = 13.42, p < .001, explaining 11.6% of the variance in the dependent variable (R2 = .116). Adding the expectancy-related predictors in step 2 did not lead to a significant increase in explained variance, ΔR2 = .007, F(2, 100) = 0.384, p = .682. Thus, the conditioning response significantly predicted placebo-related pain reduction on day 1, but additional information on expectations did not account for further variance.
 
 Day 1 - Nocebo response: The equivalent analysis was run for the nocebo response on day 1. In step 1, the pain intensity difference between the nocebo and the control condition was entered as a predictor before adding the two expectancy ratings (i.e., expected worsening with nocebo (i) before conditioning and (ii) before the test session on day 1).
 
 In step 1, the regression model was not statistically significant, F(1, 102) = 2.63, p = .108, and explained only 2.5% of the variance in nocebo response (R2 = .025). Adding the expectation-related predictors in Step 2 slightly increased the explained variance by ΔR2 = .027, but this change was also non-significant, F(2, 100) = 1.41, p = .250. The overall variance explained by the full model remained low (R2 = .052). These results suggest that neither conditioning nor expectation-related variables reliably predicted nocebo-related pain increases on day 1.
 
 Day 8 - Placebo response: For the prediction of the placebo effect on day 8, the following variables reflecting perceived effects were entered as predictors in step 1: the difference in pain intensity ratings between the control and the placebo condition (i) during conditioning and (ii) on day 1. In step 2, the variables reflecting expectations were added: the expected improvement with placebo (i) before conditioning, (ii) before the test session on day 1 and (iii) before the test session on day 8.
 
 In step 1, the model was statistically significant, F(3, 95) = 14.86, p < .001, explaining 23.8% of the variance in the placebo response (R2 = .238, Adjusted R2 = .222). In step 2, the addition of the expectation-related predictors resulted in a non-significant improvement in model fit, ΔR2 = .051, F(3, 92) = 2.21, p = .092. The overall variance explained by the full model increased modestly to 29.0%.
 
 Day 8 - Nocebo response: For the equivalent analyses of nocebo responses on day 8, the following variables were included in step 1: the difference in pain intensity ratings between the nocebo and the control condition (i) during conditioning and (ii) on day 1. In step 2, we entered the variables reflecting nocebo expectations including expected worsening with nocebo (i) before conditioning, (ii) before the test session on day 1 and (iii) before the test session on day 8. In step 1, the model significantly predicted the day 8 nocebo response, F(3, 95) = 6.04, p = .003, accounting for 11.3% of the variance (R2 = .113, Adjusted R2 = .094). However, the addition of expectation-related predictors in Step 2 resulted in only a negligible and non-significant improvement, ΔR2 = .006, F(3, 92) = 0.215, p = .886. The full model explained just 11.9% of the variance (R2 = .119).
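
 For readers who want to relate the ΔR2 values above to their F statistics, the standard R2-change test can be sketched as follows. The function name `f_change` is hypothetical, and the inputs are the rounded values reported in this response, so the result should only be expected to land close to the reported statistic (F(2, 100) = 1.41 for the day 1 nocebo response), not to match it exactly.

```python
def f_change(delta_r2, r2_full, df_num, df_den):
    """F statistic for the increase in R^2 when a block of predictors is added.

    delta_r2: R^2 of the full model minus R^2 of the reduced model
    r2_full:  R^2 of the full model (after adding the block)
    df_num:   number of predictors added in the block
    df_den:   residual degrees of freedom of the full model
    """
    return (delta_r2 / df_num) / ((1.0 - r2_full) / df_den)


# Day 1 nocebo response, using the rounded values reported above:
# delta R^2 = .027, full-model R^2 = .052, df = (2, 100).
f_nocebo_day1 = f_change(0.027, 0.052, 2, 100)
print(round(f_nocebo_day1, 2))
```

 Small discrepancies from the reported F values are expected here because the R2 figures are rounded to three decimals before being plugged in.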
 
 Typos:
 
 (6) Abstract - 104 heathy xxx (word missing).
 
 (7) Line 61 - reduce or decrease - I think you meant increase.
 
 Thank you, we have now corrected both sentences.
 
 References
 
 Colloca L, Petrovic P, Wager TD, Ingvar M, Benedetti F. How the number of learning trials affects placebo and nocebo responses. Pain. 2010
 
 Doering BK, Nestoriuc Y, Barsky AJ, Glaesmer H, Brähler E, Rief W. Is somatosensory amplification a risk factor for an increased report of side effects? Reference data from the German general population. J Psychosom Res. 2015
 
 AuthorResponse

URL

osf.io/preprints/psyarxiv/68wcy_v2
www.biorxiv.org

A Deep Learning Pipeline for Mapping in situ Network-level Neurovascular Coupling in Multi-photon Fluorescence Microscopy

4
1. Public_Reviews 07 Oct 2025
 
 in eLife
 
 eLife Assessment
 
 This work describes a highly complex automated algorithm for analyzing vascular imaging data from two-photon microscopy. This tool has the potential to be extremely valuable to the field and to fill gaps in knowledge of hemodynamic activity across a regional network. The solid biological application provides a demonstration of their pipeline's capabilities and suggests intriguing hypotheses around prolonged vascular tone changes, but will need to be followed up by further experiments to be conclusively demonstrated.
 
 Summary
2. Public_Reviews 07 Oct 2025
 
 in eLife
 
 Reviewer #1 (Public Review):
 
 Summary:
 
 In this manuscript, the authors describe a new pipeline to measure changes in vasculature diameter upon optogenetic stimulation of neurons.
 
 The work is interesting and the topic is quite relevant to better understand the hemodynamic response on the graph/network level.
 
 Strengths:
 
 The manuscript provides a pipeline that allows for the detection of changes in the vessel diameter as well as simultaneously allowing for the location of the neurons driven by stimulation.
 
 The resulting data could provide interesting insights into the graph-level mechanisms of regulating activity-dependent blood flow.
 
 The interesting findings include that vessel radius changes depend on depth from the cortical surface and that dilations on average happen closer to the activated neurons.
 
 Review 1
3. Public_Reviews 07 Oct 2025
 
 in eLife
 
 Reviewer #2 (Public Review):
 
 Summary:
 
 The authors develop a highly detailed pipeline to analyze hemodynamic signals from in vivo two-photon fluorescence microscopy. This includes motion correction, segmentation of the vascular network, diameter measurements across time, mapping neuronal position relative to the vascular network, and analyzing vascular network properties (interactions between different vascular segments). For the segmentation, the authors use a Convolutional Neural Network to identify vessel (or neural) and background pixels and train it using ground truth images based on semi-automated mapping followed by human correction/annotation. Considerable processing was done on the segmented images to improve accuracy, extract vessel center lines, and compute frame-by-frame diameters. The model was tested with artificial diameter increases and Gaussian noise and proved robust to these manipulations.
 
 Network-level properties include Assortativity - a measure of how similar a vessel's response is to nearby vessels - and Efficiency - the ease of flow through the network (essentially, the combined resistance of a path based on diameter and vessel length between two points).
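
 As a rough illustration of the Efficiency idea summarized above, the sketch below treats each vessel segment as a resistor and takes the reciprocal of the smallest summed resistance between two points. It is a toy example under stated assumptions, not the authors' implementation: the Poiseuille-like scaling (resistance proportional to length divided by radius to the fourth power), the three-node graph, and all function names are hypothetical.

```python
import heapq


def edge_resistance(length, radius):
    # Assumption: Poiseuille-like scaling, resistance ~ length / radius**4.
    return length / radius ** 4


def build_graph(segments):
    """Build an undirected adjacency list from (node, node, length, radius) tuples."""
    graph = {}
    for a, b, length, radius in segments:
        r = edge_resistance(length, radius)
        graph.setdefault(a, []).append((b, r))
        graph.setdefault(b, []).append((a, r))
    return graph


def least_resistance(graph, source, target):
    """Dijkstra search for the smallest summed resistance from source to target."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, resistance in graph.get(node, []):
            new_cost = cost + resistance
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return float("inf")


# Hypothetical vessel segments: (node, node, length in um, radius in um).
baseline = [("A", "B", 100.0, 2.0), ("B", "C", 100.0, 2.0), ("A", "C", 300.0, 2.0)]
dilated = [("A", "B", 100.0, 2.2), ("B", "C", 100.0, 2.2), ("A", "C", 300.0, 2.0)]

eff_baseline = 1.0 / least_resistance(build_graph(baseline), "A", "C")
eff_dilated = 1.0 / least_resistance(build_graph(dilated), "A", "C")
print(eff_baseline, eff_dilated)
```

 In this toy network, dilating the two short segments lowers the resistance of the A-B-C route and therefore raises the efficiency between A and C, which is the kind of change a network-level measure is meant to capture.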
 
 Strengths:
 
 This is a very powerful tool for cerebral vascular biologists as many of these tasks are labor intensive, prone to subjectivity, and often not performed due to the complexity of collecting and managing volumes of vascular signals. Modelling is not my specialty so I cannot speak too specifically, but the model appears to be well-designed and robust to perturbations. It has many clever features for processing the data.
 
 The authors rightly point out that there is a real lack in the field of knowledge of vascular network activity at single-vessel resolution. Network anatomy has been studied, but hemodynamics are typically studied either with coarse resolution or in only one or a few vessels at a time. This pipeline has the potential to change that.
 
 [Editors' note: this work has been through three rounds of revisions, and most recently the authors have added caveats to the discussion. This version of the paper has been assessed by the editors and the weaknesses identified previously remain with earlier versions of the work.]
 
 Review 2
4. Public_Reviews 07 Oct 2025
 
 in eLife
 
 Author response:
 
 The following is the authors’ response to the previous reviews
 
 Reviewer #1 (Public review):
 
 Summary:
 
 In the manuscript the authors describe a new pipeline to measure changes in vasculature diameter upon optogenetic stimulation of neurons. The work is useful to better understand the hemodynamic response on a network/graph level.
 
 Strengths:
 
 The manuscript provides a pipeline that allows for the detection of changes in the vessel diameter as well as the simultaneous localization of the neurons driven by stimulation.
 
 The resulting data could provide interesting insights into the graph level mechanisms of regulating activity dependent blood flow.
 
 Weaknesses:
 
 (1) The manuscript contains (new) wrong statements and (still) wrong mathematical formulas.
 
 The symbols in these formulas have been updated to disambiguate them, and the accompanying statements have been adjusted for clarity.
 
 (2) The manuscript does not compare results to existing pipelines for vasculature segmentation (open-source or commercial). Comparing performance of the pipeline to a random forest classifier (ilastik) on images that are not preprocessed (i.e. corrected for background etc.) seems not a particularly useful comparison.
 
 We’ve now included comparisons to Imaris (commercial software) for segmentation and VesselVio (open-source software) for graph extraction.
 
 For the ilastik comparison, the images were preprocessed prior to ilastik segmentation, specifically by doing intensity normalization.
 
 Example segmentations utilizing Imaris have now been included. Imaris leaves gaps and discontinuities in the segmentation masks, as shown in Supplementary Figure 10. The Imaris segmentation masks also tend to be more circular in cross-section despite irregularities on the surface of the vessels observable in the raw data and identified in manual segmentation. This approach also requires days to months to generate per image stack.
 
 A comparison to VesselVio has now also been generated, and results are visualized in Supplementary Figure 11. VesselVio generates individual graphs for each time point, resulting in potential discrepancies in the structure of the graphs from different time points. Furthermore, VesselVio uses the distance transform to estimate the vascular radius, which renders the vessel radius estimates highly susceptible to variation in the user-selected methodology used to obtain segmentation results, whereas our approach uses intensity gradient-based boundary detection from centerlines in the image, mitigating this bias. We have added the following paragraph to the Discussion section on the comparisons with the two methods:
 
 “Comparison with commercial and open-source vascular analysis pipelines
 
 To compare our results with those achievable on these data with other pipelines for segmentation and graph network extraction, we compared segmentation results qualitatively with Imaris version 9.2.1 (Bitplane) and vascular graph extraction with VesselVio [1]. For the Imaris comparison, three small volumes were annotated by hand to label vessels. Example slices of the segmentation results are shown in Supplementary Figure 10. Imaris tended to either over- or under-segment vessels, disregard fine details of the vascular boundaries, and produce jagged edges in the vascular segmentation masks. In addition to these issues with segmentation mask quality, manual segmentation of a single volume took days for a rater to annotate. To compare to VesselVio, binary segmentation masks (one before and one after photostimulation) generated with our deep learning models were loaded into VesselVio for graph extraction, as VesselVio does not have its own method for generating segmentation masks. This also facilitates a direct comparison of the benefits of our graph extraction pipeline to VesselVio. Visualizations of the two graphs are shown in Supplementary Figure 11. VesselVio produced many hairs at both time points, and the total number of segments varied considerably between the two sequential stacks: while the baseline scan resulted in 546 vessel segments, the second scan had 642 vessel segments. These discrepancies are difficult to resolve in post-processing and preclude a direct comparison of individual vessel segments across time. As the segmentation masks we used in graph extraction derive from the union of multiple time points, we could better trace the vasculature and identify more connections in our extracted graph.
Furthermore, VesselVio relies on the distance transform of the user-supplied segmentation mask to estimate vascular radii; consequently, these estimates are highly susceptible to variations in the input segmentation masks. We repeatedly saw slight variations between boundary placements of all of the models we utilized (ilastik, UNet, and UNETR) and those produced by raters. Our pipeline mitigates this segmentation method bias by using intensity gradient-based boundary detection from centerlines in the image (as opposed to using the distance transform of the segmentation mask, as in VesselVio).”
 
 (3) The manuscript does not clearly visualize performance of the segmentation pipeline (e.g. via 2d sections, highlighting also errors etc.). Thus, it is unclear how good the pipeline is, under what conditions it fails or what kind of errors to expect.
 
 In response to the reviewer’s comment, 2D slices have been added in Supplementary Figure 4.
 
 (4) The pipeline is not fully open-source due to use of MATLAB. Also, the pipeline code was not made available during review, contrary to the authors' claims (the provided link did not lead to a repository). Thus, the utility of the pipeline was difficult to judge.
 
 All code has been uploaded to GitHub and is available at the following location: https://github.com/AICONSlab/novas3d
 
 The Matlab code for skeletonization is better at preserving centerline integrity during the pruning of hairs from centerlines than the currently available open-source methods.
 
 - Generalizability: The authors addressed the point of generalizability by applying the pipeline to other data sets. This demonstrates that their pipeline can be applied to other data sets and makes it more useful. However, from the visualizations it is difficult to see the performance of the pipeline, where the pipeline fails etc. The 3D visualizations are not particularly helpful in this respect. In addition, the Dice measure seems quite low, indicating roughly 20-40% of voxels do not overlap between inferred and ground truth. I did not notice this high discrepancy earlier. A thorough discussion of the errors appearing in the segmentation pipeline would be necessary in my view to better assess the quality of the pipeline.
 
 2D slices from the additional datasets have been added in Supplementary Figure 13 to aid in visualizing the models’ ability to generalize to other datasets.
 
 The Dice range we report (0.7-0.8) is good when compared to those (0.56-0.86) of 3D segmentations of large datasets in microscopy [2], [3], [4], [5], [6]. Furthermore, we had two additional raters segment three images from the original training set. We found that the raters had a mean inter-class correlation of 0.73 [7]. Our model outperformed this Dice score on unseen data: Dice scores from our generalizability tests on C57 mice and Fischer rats were on par with or higher than this baseline.
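 For readers unfamiliar with the metric discussed above, the following is a minimal sketch of the Dice coefficient, Dice = 2|A ∩ B| / (|A| + |B|), for two binary masks. It is an illustration only and is not taken from the authors' pipeline; the function name `dice_coefficient` and the toy masks are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two equally sized binary masks (flat sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical toy masks: each has 4 foreground voxels, 3 of which coincide.
pred = [1, 1, 1, 1, 0, 0, 0, 0]
truth = [0, 1, 1, 1, 1, 0, 0, 0]
score = dice_coefficient(pred, truth)
```

 In this toy case one foreground voxel in four is displaced in each mask, which gives a Dice value within the 0.7-0.8 range discussed above.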
 
 Reviewer #2 (Public review):
 
 The authors have addressed most of my concerns sufficiently. There are still a few serious concerns I have. Primarily, the temporal resolution of the technique still makes me dubious about nearly all of the biological results. It is good that the authors have added some vessel diameter time courses generated by their model. But I still maintain that data sampling every 42 seconds - or even 21 seconds - is problematic. First, the evidence for long vascular responses is lacking. The authors cite several papers:
 
 Alarcon-Martinez et al. 2020 show and explicitly state that their responses (stimulus-evoked) returned to baseline within 30 seconds. The responses to ischemia are long lasting but this is irrelevant to the current study using activated local neurons to drive vessel signals.
 
 Mester et al. 2019 show responses that all seem to return to baseline by around 50 seconds post-stimulus.
 
 In Mester et al. 2019, diffuse stimulations with blue light showed a return to baseline around 50 seconds post-stimulus (cf. Figure 1E,2C,2D). However, focal stimulations, where the stimulation light is raster scanned over a small region focused in the field of view, show longer-lasting responses (cf. Figure 4) that have not returned to baseline by 70 seconds post-stimulus [8]. Alarcon-Martinez et al. do report that their responses return to baseline within 30 seconds; however, their physiological stimulation may lead to different neuronal and vessel response kinetics than those elicited by the optogenetic stimulations in the current work.
 
 O'Herron et al. 2022 and Hartmann et al. 2021 use opsins expressed in vessel walls (not neurons as in the current study) and directly constrict vessels with light. So this is unrelated to neuronal activity-induced vascular signals in the current study.
 
 We agree that optogenetic activation of vessel-associated cells is distinct from optogenetic activation of neurons, but we do expect the effects of such perturbations on the vasculature to have some commonalities.
 
 There are other papers including Vazquez et al 2014 (PMID: 23761666) and Uhlirova et al 2016 (PMID: 27244241) and many others showing optogenetically-evoked neural activity drives vascular responses that return back to baseline within 30 seconds. The stimulation time and the cell types labeled may be different across these studies which can make a difference. But vascular responses lasting 300 seconds or more after a stimulus of a few seconds are just not common in the literature and so are very suspect - likely at least in part due to the limitations of the algorithm.
 
 Vazquez et al. 2014 used diffuse photostimulation with a fiberoptic probe, similar to the diffuse stimulation in Mester et al. 2019, as opposed to the raster-scanned focal stimulation we used in this study and in Mester et al. 2019, where we observed focal photostimulation to elicit vascular responses lasting longer than a minute. Uhlirova et al. 2016 used photostimulation powers between 0.7 and 2.8 mW, likely lower than our 4.3 mW/mm2 photostimulation. Further, even with focal photostimulation, we do see light intensity dependence of the duration of the vascular responses. Indeed, in Supplementary Figure 2, 1.1 mW/mm2 photostimulation leads to briefer dilations/constrictions than does 4.3 mW/mm2; the 1.1 mW/mm2 responses are in line, duration-wise, with those in Uhlirova et al. 2016.
 
 Critically, as per Supplementary Figure 2, the analysis of the experimental recordings acquired at 3-second temporal resolution did likewise show responses in many vessels lasting for tens of seconds and even hundreds of seconds in some vessels.
 
 Another major issue is that the time courses provided show that the same vessel constricts at certain points and dilates later. So where in the time course the data is sampled will have a major effect on the direction and amplitude of the vascular response. In fact, I could not find how the "response" window is calculated. Is it from the first volume collected after the stimulation - or an average of some number of volumes? But clearly down-sampling the provided data to 42 or even 21 second sampling will lead to problems. If the major benefit to the field is the full volume over large regions that the model can capture and describe, there needs to be a better way to capture the vessel diameter in a meaningful way.
 
 In the main experiment (i.e. excluding the additional experiments presented in the Supplementary Figure 2 that were collected over a limited FOV at 3s per stack), we have collected one stack every 42 seconds. The first slice of the volume starts following the photostimulation, and the last slice finishes at 42 seconds. Each slice takes ~0.44 seconds to acquire. The data analysis pipeline (as demonstrated by the Supplementary Figure 2) is not in any way limited to data acquired at this temporal resolution and - provided reasonable signal-to-noise ratio (cf. Figure 5) - is applicable, as is, to data acquired at much higher sampling rates.
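 The timing figures quoted above (one stack every 42 seconds, about 0.44 seconds per slice) imply when each slice is sampled relative to the photostimulation. The sketch below works through that arithmetic under stated assumptions: slices are acquired back to back in a fixed order with no dead time between them, and the slice count is inferred from the two quoted numbers rather than stated in the response. The names `slice_window` and `n_slices` are hypothetical.

```python
STACK_PERIOD_S = 42.0  # one stack every 42 s (from the response above)
SLICE_TIME_S = 0.44    # approximate acquisition time per slice (from the response above)

# Inferred, not stated in the source: how many whole slices fit in one stack period.
n_slices = int(STACK_PERIOD_S // SLICE_TIME_S)


def slice_window(slice_index, stack_index=0):
    """Start and end time (s after photostimulation) of one slice.

    Assumes back-to-back slices with no flyback or settling time.
    """
    start = stack_index * STACK_PERIOD_S + slice_index * SLICE_TIME_S
    return start, start + SLICE_TIME_S


first = slice_window(0)              # first slice of the first stack
last = slice_window(n_slices - 1)    # last slice of the first stack
```

 Under these assumptions, the first and last slices of a stack are sampled roughly 41 seconds apart, which is the depth-dependent sampling delay at issue in the reviewer's comments.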
 
 It still seems possible that if responses are bi-phasic, then depth dependencies of constrictors vs dilators may just be due to where in the response the data are being captured - maybe the constriction phase is captured in deeper planes of the volume and the dilation phase more superficially. This may also explain why nearly a third of vessels are not consistent across trials - if the direction the volume was acquired is different across trials, different phases of the response might be captured.
 
 Alternatively, like neuronal responses to physiological stimuli, the vascular responses elicited by increases in neuronal activity may themselves be variable in both space and time.
 
 I still have concerns about other aspects of the responses but these are less strong. Particularly, these bi-phasic responses are not something typically seen and I still maintain that constrictions are not common. The authors are right that some papers do show constriction. Leaving out the direct optogenetic constriction of vessels (O'Herron 2022 & Hartmann 2021), the Alarcon-Martinez et al. 2020 paper and others such as Gonzales et al 2020 (PMID: 33051294) show different capillary branches dilating and constricting. However, these are typically found either with spontaneous fluctuations or due to highly localized application of vasoactive compounds. I am not familiar with data showing activation of a large region of tissue - as in the current study - coupled with vessel constrictions in the same region. But as the authors point out, typically only a few vessels at a time are monitored so it is possible - even if this reviewer thinks it unlikely - that this effect is real and just hasn't been seen.
 
 Uhlirova et al. 2016 (PMID: 27244241) observed biphasic responses in the same vessel with optogenetic stimulation in anesthetized and unanesthetized animals (cf Fig 1b and Fig 2, and section “OG stimulation of INs reproduces the biphasic arteriolar response”). Devor et al. (2007) and Lindvere et al. (2013) also reported on constrictions and dilations being elicited by sensory stimuli.
 
 I also have concerns about the spatial resolution of the data. It looks like the data in Figure 7 and Supplementary Figure 7 have a resolution of about 1 micron/pixel. It isn't stated so I may be wrong. But detecting changes of less than 1 micron, especially given the noise of an in vivo prep (brain movement and so on), might just be noise in the model. This could also explain constrictions as just spurious outputs in the model's diameter estimation. The high variability in adjacent vessel segments seen in Figure 6C could also be explained the same way, since these also seem biologically and even physically unlikely.
 
 Thank you for your comment. To address this important issue, we performed an additional validation experiment in which we placed a special order of fluorescent beads with a known diameter of 7.32 ± 0.27um, imaged them following our imaging protocol, and subsequently used our pipeline to estimate their diameter. Our analysis converged on the manufacturer-specified diameter, estimating it to be 7.34 ± 0.32um. The manuscript has been updated to detail this experiment, as below:
 
 Methods section insert
 
 “Second, our boundary detection algorithm was used to estimate the diameters of fluorescent beads of a known radius imaged under similar acquisition parameters. Polystyrene microspheres labelled with Flash Red (Bangs Laboratories, inc, CAT# FSFR007) with a nominal diameter of 7.32um and a specified range of 7.32 ± 0.27um as determined by the manufacturer using a Coulter counter were imaged on the same multiphoton fluorescence microscope set-up used in the experiment (identical light path, resonant scanner, objective, detector, excitation wavelength and nominal lateral and axial resolutions, with 5x averaging). The images of the beads had a higher SNR than our images of the vasculature, so Gaussian noise was added to the images to degrade the SNR to the same level of that of the blood vessels. The images of the beads were segmented with a threshold, centroids calculated for individual spheres, and planes with a random normal vector extracted from each bead and used to estimate the diameter of the beads. The same smoothing and PSF deconvolution steps were applied in this task. We then reported the mean and standard deviation of the distribution of the diameter estimates. A variety of planes were used to estimate the diameters.”
 
 Results Section Insert
 
 “Our boundary detection algorithm successfully estimated the radius of precisely specified fluorescent beads. The bead images had a signal-to-noise ratio of 6.79 ± 0.16 (about 35% higher than our in vivo images): to match their SNR to that of in vivo vessel data, following deconvolution, we added Gaussian noise with a standard deviation of 85 SU to the images, bringing the SNR down to 5.05 ± 0.15. The data processing pipeline was kept unaltered except for the bead segmentation, performed via image thresholding instead of our deep learning model (trained on vessel data). The bead boundary was computed following the same algorithm used on vessel data: i.e., by the average of the minimum intensity gradients computed along 36 radial spokes emanating from the centreline vertex in the orthogonal plane. To demonstrate an averaging-induced decrease in the uncertainty of the bead radius estimates on a scale that is finer than the nominal resolution of the imaging configuration, we tested four averaging levels in 289 beads. Three of these averaging levels were lower than that used on the vessels, and one matched that used on the vessels (36 spokes per orthogonal plane and a minimum of 10 orthogonal planes per vessel). As the amount of averaging increased, the uncertainty on the diameter of the beads decreased, and our estimate of the bead's diameter converged upon the manufacturer's Coulter counter-based specifications (7.32 ± 0.27um), as tabulated below in Table 1.”
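 The boundary computation described above can be illustrated with a simplified sketch: along each of 36 radial spokes from a centre point, find the position where the intensity falls most steeply (the minimum gradient), then average those positions to obtain a radius. This is one reading of the description, not the authors' implementation; it omits their smoothing, PSF deconvolution, noise and multi-plane averaging, and it uses a hypothetical analytic bead (`synthetic_bead`) with a blurred edge instead of image data. All function names and parameter values other than the 36 spokes are assumptions.

```python
import math


def edge_radius_along_spoke(intensity, cx, cy, angle, max_r, step):
    """Radius along one spoke at which the intensity gradient is most negative."""
    n = int(max_r / step)
    samples = [
        intensity(cx + k * step * math.cos(angle), cy + k * step * math.sin(angle))
        for k in range(n + 1)
    ]
    # Forward differences; gradient k sits midway between samples k and k + 1.
    grads = [(samples[k + 1] - samples[k]) / step for k in range(n)]
    k_min = min(range(n), key=grads.__getitem__)
    return (k_min + 0.5) * step


def estimate_diameter(intensity, cx, cy, n_spokes=36, max_r=10.0, step=0.05):
    """Average the per-spoke edge radii and return the implied diameter."""
    radii = [
        edge_radius_along_spoke(
            intensity, cx, cy, 2.0 * math.pi * i / n_spokes, max_r, step
        )
        for i in range(n_spokes)
    ]
    return 2.0 * sum(radii) / n_spokes


def synthetic_bead(radius, blur=0.3):
    """Hypothetical bright disc centred at the origin with a smooth (sigmoid) edge."""
    def intensity(x, y):
        return 1.0 / (1.0 + math.exp((math.hypot(x, y) - radius) / blur))
    return intensity


# Hypothetical bead with the nominal 7.32 um diameter quoted above.
diameter = estimate_diameter(synthetic_bead(7.32 / 2.0), 0.0, 0.0)
```

 On this noise-free synthetic bead the estimate is limited only by the sampling step along each spoke; with noisy data, averaging over spokes and planes is what reduces the uncertainty, as the quoted passage describes.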
 
 Bibliography
 
 (1) J. R. Bumgarner and R. J. Nelson, “Open-source analysis and visualization of segmented vasculature datasets with VesselVio,” Cell Rep. Methods, vol. 2, no. 4, Apr. 2022, doi: 10.1016/j.crmeth.2022.100189.
 
 (2) G. Tetteh et al., “DeepVesselNet: Vessel Segmentation, Centerline Prediction, and Bifurcation Detection in 3-D Angiographic Volumes,” Front. Neurosci., vol. 14, Dec. 2020, doi: 10.3389/fnins.2020.592352.
 
 (3) N. Holroyd, Z. Li, C. Walsh, E. Brown, R. Shipley, and S. Walker-Samuel, “tUbe net: a generalisable deep learning tool for 3D vessel segmentation,” Jul. 24, 2023, bioRxiv. doi: 10.1101/2023.07.24.550334.
 
 (4) W. Tahir et al., “Anatomical Modeling of Brain Vasculature in Two-Photon Microscopy by Generalizable Deep Learning,” BME Front., vol. 2020, p. 8620932, Dec. 2020, doi: 10.34133/2020/8620932.
 
 (5) R. Damseh, P. Delafontaine-Martel, P. Pouliot, F. Cheriet, and F. Lesage, “Laplacian Flow Dynamics on Geometric Graphs for Anatomical Modeling of Cerebrovascular Networks,” ArXiv191210003 Cs Eess Q-Bio, Dec. 2019, Accessed: Dec. 09, 2020. (Online). Available: http://arxiv.org/abs/1912.10003
 
 (6) T. Jerman, F. Pernuš, B. Likar, and Ž. Špiclin, “Enhancement of Vascular Structures in 3D and 2D Angiographic Images,” IEEE Trans. Med. Imaging, vol. 35, no. 9, pp. 2107–2118, Sep. 2016, doi: 10.1109/TMI.2016.2550102.
 
 (7) T. B. Smith and N. Smith, “Agreement and reliability statistics for shapes,” PLOS ONE, vol. 13, no. 8, p. e0202087, Aug. 2018, doi: 10.1371/journal.pone.0202087.
 
 (8) J. R. Mester et al., “In vivo neurovascular response to focused photoactivation of Channelrhodopsin-2,” NeuroImage, vol. 192, pp. 135–144, May 2019, doi: 10.1016/j.neuroimage.2019.01.036.
 
 AuthorResponse

URL

biorxiv.org/content/10.1101/2024.01.24.577045v4
www.biorxiv.org www.biorxiv.org

Early experience affects foraging behavior of wild fruit-bats more than their original behavioral predispositions

2
1. Public_Reviews 06 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This paper provides important insight into how early life experience shapes adult behavior in fruit bats. The authors raised juvenile bats either in an impoverished or enriched environment and studied their foraging behaviors. The evidence is convincing that bats raised in enriched environments are more active, bold, and exploratory. The work will be of interest to ethologists and developmental psychologists.
  
  Summary
2. Public_Reviews 06 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  The authors show that the early life experience of juvenile bats shapes their outdoor foraging behaviors. They achieve this by raising juvenile bats either in an impoverished or enriched environment. They subsequently test the behavior of bats indoors and outdoors. The authors show that behavioral measures outdoors were more reliable in delineating the effect of early life experiences, as the bats raised in enriched environments were bolder, more active, and exhibited higher exploratory tendencies.
  
  Strengths:
  
  The major strength of the study is providing a quantitative study of animal "personality" and how it is likely shaped by innate and environmental conditions. The other major strength is the ability to do reliable long-term recording of bats outdoors, giving researchers the opportunity to study bats in their natural habitat. To this point, the study also shows that the behavioral variables measured indoors do not correlate with those measured outdoors, thus providing a key insight into the importance of testing animal behaviors in their natural habitat.
  
  Weaknesses identified in the first round of review:
  
  It is not clear from the analysis presented in the paper how persistent those environmentally induced changes are: do they remain with the bats until the end of their lives?
  
  Comments on revisions:
  
  The authors have addressed those weaknesses and the paper is much stronger.
  
  Review 1

URL

biorxiv.org/content/10.1101/2024.10.10.617636v2
www.biorxiv.org www.biorxiv.org

Tonotopy is not preserved in a descending stage of auditory cortex

3
1. Public_Reviews 06 Oct 2025
  
  in eLife
  
  eLife Assessment
  
  This revised manuscript presents an important characterization of mouse auditory cortex receptive field organization, utilizing two-photon imaging of specific subpopulations. They demonstrate a degradation of tonotopic organization from the input to the output neurons. The strength of the evidence is convincing.
  
  Summary
2. Public_Reviews 06 Oct 2025
  
  in eLife
  
  Reviewer #1 (Public review):
  
  Summary:
  
  In this study, Gu et al. employed novel viral strategies, combined with in vivo two-photon imaging, to map the tone response properties of two groups of cortical neurons in A1: the thalamocortical recipient (TR) neurons and the corticothalamic (CT) neurons. They observed a clear tonotopic gradient among TR neurons but not in CT neurons. Moreover, CT neurons exhibited high heterogeneity in their frequency tuning and broader bandwidth, suggesting increased synaptic integration in these neurons. By parsing out different projection-specific neurons within A1, this study provides insight into how neurons with different connectivity can exhibit different frequency response-related topographic organization.
  
  Strengths:
  
  This study reveals the importance of studying neurons with projection specificity rather than layer specificity since neurons within the same layer have very diverse molecular, morphological, physiological, and connectional features. By utilizing a newly developed rabies virus CSN-N2c GCaMP-expressing vector, the authors can label and image specifically the neurons (CT neurons) in A1 that project to the MGB. To compare, they used an anterograde trans-synaptic tracing strategy to label and image neurons in A1 that receive input from MGB (TR neurons).
  
  Weaknesses:
  
  - Perhaps as cited in the introduction, it is well known that tonotopic gradient is well preserved across all layers within A1, but I feel if the authors want to highlight the specificity of their virus tracing strategy and the populations that they imaged in L2/3 (TR neurons) and L6 (CT neurons), they should perform control groups where they image general excitatory neurons in the two depths and compare to TR and CT neurons, respectively. This will show that it's not their imaging/analysis or behavioral paradigms that are different from other labs.
  
  - Fig 1D and G, the y-axis is Distance from pia (%). I'm not exactly sure what this means. How does % translate to real cortical thickness?
  
  - For Fig. 2G and H, is each circle a neuron or an animal? Why are they staggered on top of each other on the x-axis? If the x-axis is the distance from caudal to rostral, shouldn't each neuron have a different distance? Also, it seems that Fig. 2H has more variation, and is thus not significant, simply because it has more circles (for example, at 600 or 900um, 2G seems to have fewer circles than 2H).
  
  - Similarly, in Fig. 2J and L, why are the circles now staggered on the y-axis? And is each circle now a neuron or a trial? There seem to be many more circles than in Fig. 2G and 2H. Also, I don't think a correlation is the proper statistic for this type of plot (this point applies to Fig. 3H and 3J).
  
  - What does the inter-quartile range of BF (IQRBF, in octaves) imply? What's the interpretation of this analysis? I am confused about why TR neurons showing a higher IQR in HF areas than in LF areas would mean homogeneity among TR neurons (line 213 - 216). On the same note, how is this different from the BF variability? Isn't a higher IQR equivalent to higher variability?
  
  - Fig. 4A-B, there are no clear criteria for how the authors categorize V, I, and O shapes. The descriptions in the Methods (line 721 - 725) are also very vague.
  
  Comments on revisions:
  
  The authors have addressed all my questions in the previous round.
  
  Review 1
3. Public_Reviews 06 Oct 2025
  
  in eLife
  
  Reviewer #2 (Public review):
  
  Summary:
  
  Gu, Liang et al. investigated how auditory information is mapped and transformed as it enters and exits the auditory cortex. They use anterograde transsynaptic tracers to label and perform calcium imaging of thalamorecipient neurons in A1 and retrograde tracers to label and perform calcium imaging of corticothalamic output neurons. They demonstrate a degradation of tonotopic organization from the input to output neurons.
  
  Strengths:
  
  The experiments appear well executed, well described, and analyzed.
  
  Weaknesses:
  
  (1) Given that the CT and TR neurons were imaged at different depths, the question of whether or not these differences could otherwise be explained by layer-specific differences is still not 100% resolved. Control measurements would be needed, by recording 1) CT neurons in upper layers, 2) TR neurons in deeper layers, 3) non-CT neurons in deeper layers, and/or 4) non-TR neurons in upper layers.
  
  (2) What percent of the neurons at the depths being imaged are CT neurons? The same question applies to TR neurons.
  
  (3) V-shaped, I-shaped, or O-shaped is not an intuitively understood nomenclature; consider changing it. Further, the x/y axes for Figure 4a are not labeled, so it's not clear what the heat maps are supposed to represent.
  
  (4) Many references about projection neurons and cortical circuits are based on studies from visual or somatosensory cortex. Auditory cortex organization is not necessarily the same as other sensory areas. Auditory cortex references should be used specifically, and not sources reporting on S1, V1.
  
  Comments on revisions:
  
  The authors have fully addressed my concerns.
  
  Review 2

URL

biorxiv.org/content/10.1101/2024.05.25.595883v2

Public_Reviews

Annotations: 10,000

Joined: March 17, 2021
