  1. Dec 2024
    1. Reviewer #2 (Public Review):

      Summary:

      The goal of this paper is to present a new method, termed MINT, for decoding behavioral states from neural spiking data. MINT is a statistical method which, in addition to outputting a decoded behavioral state, also provides soft information regarding the likelihood of that behavioral state based on the neural data. The innovation in this approach is that neural states are assumed to come from sparsely distributed neural trajectories with low tangling, meaning that neural trajectories (time sequences of neural states) are sparse in the high-dimensional space of neural spiking activity and that two dissimilar neural trajectories tend to correspond to dissimilar behavioral trajectories. The authors support these assumptions through analysis of previously collected data, and then validate the performance of their method by comparing it to a suite of alternative approaches. The authors attribute the typically improved decoding performance by MINT to its assumptions being more faithfully aligned to the properties of neural spiking data relative to assumptions made by the alternatives.

      Strengths:

      The paper did an excellent job critically evaluating common assumptions made by neural analytical methods, such as neural state being low-dimensional relative to the number of recorded neurons. The authors made strong arguments, supported by evidence and literature, for potentially high-dimensional neural states and thus the need for approaches that do not rely on an assumption of low dimensionality.

      The paper was thorough in considering multiple datasets across a variety of behaviors, as well as existing decoding methods, to benchmark the MINT approach. This provided a valuable comparison to validate the method. The authors also provided nice intuition regarding why MINT may offer performance improvement in some cases and in which instances MINT may not perform as well.

      In addition to providing a philosophical discussion as to the advantages of MINT and benchmarking against alternatives, the authors also provided a detailed description of practical considerations. This included training time, amount of training data, robustness to data loss or changes in the data, and interpretability. These considerations not only provided objective evaluation of practical aspects but also provided insights to the flexibility and robustness of the method as they relate back to the underlying assumptions and construction of the approach.

      Impact:

      This work is motivated by brain-computer interfaces applications, which it will surely impact in terms of neural decoder design. However, this work is also broadly impactful for neuroscientific analysis to relate neural spiking activity to observable behavioral features. Thus, MINT will likely impact neuroscience research generally. The methods are made publicly available, and the datasets used are all in public repositories, which facilitates adoption and validation of this method within the greater scientific community.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Summary of reviewers’ comments and our revisions: 

      We thank the reviewers for their thoughtful feedback. This feedback has motivated multiple revisions and additions that, in our view, have greatly improved the manuscript. This is especially true with regard to a major goal of this study: clearly defining existing scientific perspectives and delineating their decoding implications. In addition to building on this conceptual goal, we have expanded existing analyses and have added a new analysis of generalization using a newly collected dataset. We expect the manuscript will be of very broad interest, both to those interested in BCI development and to those interested in fundamental properties of neural population activity and its relationship with behavior.

      Importantly, all reviewers were convinced that MINT provided excellent performance, when benchmarked against existing methods, across a broad range of standard tasks:

      “their method shows impressive performance compared to more traditional decoding approaches” (R1) 

      “The paper was thorough in considering multiple datasets across a variety of behaviors, as well as existing decoding methods, to benchmark the MINT approach. This provided a valuable comparison to validate the method.” (R2) 

      “The fact that performance on stereotyped tasks is high is interesting and informative…” (R3)

      This is important. It is challenging to design a decoder that performs consistently across multiple domains and across multiple situations (including both decoding and neural state estimation). MINT does so. MINT consistently outperformed existing lightweight ‘interpretable’ decoders, despite being a lightweight interpretable decoder itself. MINT was very competitive with expressive machine-learning methods, yet has advantages in flexibility and simplicity that more ‘brute force’ methods do not. We made a great many comparisons, and MINT was consistently a strong performer. Of the many comparisons we made, there was only one where MINT was at a modest disadvantage, and it was for a dataset where all methods performed poorly. No other method we tested was as consistent. For example, although the GRU and the feedforward network were often competitive with MINT (and better than MINT in the one case mentioned above), there were multiple other situations where they performed less well and a few situations where they performed poorly. Moreover, no other existing decoder naturally estimates the neural state while also readily decoding, without retraining, a broad range of behavioral variables.

      R1 and R2 were very positive about the broader impacts of the study. They stressed its impact both on decoder design, and on how our field thinks, scientifically, about the population response in motor areas: 

      “This paper presents an innovative decoding approach for brain-computer interfaces” (R1)

      “presents a substantial shift in methodology, potentially revolutionizing the way BCIs interpret and predict neural behaviour” (R1)

      “the paper's strengths, particularly its emphasis on a trajectory-centric approach and the simplicity of MINT, provide a compelling contribution to the field” (R1)

      “The authors made strong arguments, supported by evidence and literature, for potentially high-dimensional neural states and thus the need for approaches that do not rely on an assumption of low dimensionality” (R2)

      “This work is motivated by brain-computer interfaces applications, which it will surely impact in terms of neural decoder design.” (R2)

      “this work is also broadly impactful for neuroscientific analysis... Thus, MINT will likely impact neuroscience research generally.” (R2)

      We agree with these assessments, and have made multiple revisions to further play into these strengths. As one example, the addition of Figure 1b (and 6b) makes this the first study, to our knowledge, to fully and concretely illustrate this emerging scientific perspective and its decoding implications. This is important, because multiple observations convince us that the field is likely to move away from the traditional perspective in Figure 1a, and towards that in Figure 1b. We also agree with the handful of weaknesses R1 and R2 noted. The manuscript has been revised accordingly. The major weakness noted by R1 was the need to be explicit regarding when we suspect MINT would (and wouldn’t) work well in other brain areas. In non-motor areas, the structure of the data may be poorly matched with MINT’s assumptions. We agree that this is likely to be true, and thus agree with the importance of clarifying this topic for the reader. The revision now does so. R1 also wished to know whether existing methods might benefit from including trial-averaged data during training, something we now explore and document (see detailed responses below). R2 noted two weaknesses: 1) The need to better support (with expanded analysis) the statement that neural and behavioral trajectories are non-isometric, and 2) The need to more rigorously define the ‘mesh’. We agree entirely with both suggestions, and the revision has been strengthened by following them (see detailed responses below).

      R3 also saw strengths to the work, stating that:

      “This paper is well-structured and its main idea is clear.” 

      “The fact that performance on stereotyped tasks is high is interesting and informative, showing that these stereotyped tasks create stereotyped neural trajectories.” 

      “The task-specific comparisons include various measures and a variety of common decoding approaches, which is a strength.”

      However, R3 also expressed two sizable concerns. The first is that MINT might have onerous memory requirements. The manuscript now clarifies that MINT has modest memory requirements. These do not scale unfavorably as the reviewer was concerned they might. The second concern is that MINT is: 

      “essentially a table-lookup rather than a model.”

      Although we don’t agree, the concern makes sense and may be shared by many readers, especially those who take a particular scientific perspective. Pondering this concern thus gave us the opportunity to modify the manuscript in ways that support its broader impact. Our revisions had two goals: 1) clarify the ways in which MINT is far more flexible than a lookup-table, and 2) better describe the dominant scientific perspectives and their decoding implications.

      The heart of R3’s concern is the opinion that MINT is an effective but unprincipled hack suitable for situations where movements are reasonably stereotyped. Of course, many tasks involve stereotyped movements (e.g. handwriting characters), so MINT would still be useful. Nevertheless, if MINT is not principled, other decoding methods would often be preferable because they could (unlike MINT in R3’s opinion) gain flexibility by leveraging an accurate model. Most of R3’s comments flow from this fundamental concern: 

      “This is again due to MINT being a lookup table with a library of stereotyped trajectories rather than a model.”

      “MINT models task-dependent neural trajectories, so the trained decoder is very task-dependent and cannot generalize to other tasks.”

      “Unlike MINT, these works can achieve generalization because they model the neural subspace and its association to movement.”

      “given that MINT tabulates task-specific trajectories, it will not generalize to tasks that are not seen in the training data even when these tasks cover the exact same space (e.g., the same 2D computer screen and associated neural space).”

      “For proper training, the training data should explore the whole movement space and the associated neural space, but this does not mean all kinds of tasks performed in that space must be included in the training set (something MINT likely needs while modeling-based approaches do not).”

      The manuscript has been revised to clarify that MINT is considerably more flexible than a lookup table, even though a lookup table is used as a first step. Yet, on its own, this does not fully address R3’s concern. The quotes above highlight that R3 is making a standard assumption in our field: that there exists a “movement space and associated neural space”. Under this perspective, one should, as R3 argues, fully explore the movement space. This would perforce fully explore the associated neural subspace. One can then “model the neural subspace and its association to movement”. MINT does not use a model of this type, and thus (from R3’s perspective) does not appear to use a model at all. A major goal of our study is to question this traditional perspective. We have thus added a new figure to highlight the contrast between the traditional (Figure 1a) and new (Figure 1b) scientific perspectives, and to clarify their decoding implications.

      While we favor the new perspective (Figure 1b), we concede that R3 may not share our view. This is fine. Part of the reason we believe this study is timely, and will be broadly read, is that it raises a topic of emerging interest where there is definitely room for debate. If we are misguided – i.e. if Figure 1a is the correct perspective – then many of R3’s concerns would be on target: MINT could still be useful, but traditional methods that make the traditional assumptions in Figure 1a would often be preferable. However, if the emerging perspective in Figure 1b is more accurate, then MINT’s assumptions would be better aligned with the data than those of traditional methods, making it a more (not less) principled choice.

      Our study provides new evidence in support of Figure 1b, while also synthesizing existing evidence from other recent studies. In addition to Figure 2, the new analysis of generalization further supports Figure 1b. Also supporting Figure 1b is the analysis in which MINT’s decoding advantage, over a traditional decoder, disappears when simulated data approximate the traditional perspective in Figure 1a.

      That said, we agree that the present study cannot fully resolve whether Figure 1a or 1b is more accurate. Doing so will take multiple studies with different approaches (indeed we are currently preparing other manuscripts on this topic). Yet we still have an informed scientific opinion, derived from past, present and yet-to-be-published observations. Our opinion is that Figure 1b is the more accurate perspective. This possibility makes it reasonable to explore the potential virtues of a decoding method whose assumptions are well-aligned with that perspective. MINT is such a method. As expected under Figure 1b, MINT outperforms traditional interpretable decoders in every single case we studied. 

      As noted above, we have added a new generalization-focused analysis (Figure 6) based on a newly collected dataset. We did so because R3’s comments highlight a deep point: which scientific perspective one takes has strong implications regarding decoder generalization. These implications are now illustrated in the new Figure 6a and 6b. Under Figure 6a, it is possible, as R3 suggests, to explore “the whole movement space and associated neural space” during training. However, under Figure 6b, expectations are very different. Generalization will be ‘easy’ when new trajectories are near the training-set trajectories. In this case, MINT should generalize well as should other methods. In contrast, generalization will be ‘hard’ when new neural trajectories have novel shapes and occupy previously unseen regions / dimensions. In this case, all current methods, including MINT, are likely to fail. R3 points out that traditional decoders have sometimes generalized well to new tasks (e.g. from center-out to ‘pinball’) when cursor movements occur in the same physical workspace. These findings could be taken to support Figure 6a, but are equally consistent with ‘easy’ generalization in Figure 6b. To explore this topic, the new analysis in Figure 6c-g considers conditions that are intended to span the range from easy to hard. Results are consistent with the predictions of Figure 6b. 

      We believe the manuscript has been significantly improved by these additions. The revisions help the manuscript achieve its twin goals: 1) introduce a novel class of decoder that performs very well despite being very simple, and 2) describe properties of motor-cortex activity that will matter for decoders of all varieties.

      Reviewer #1: 

      Summary: 

      This paper presents an innovative decoding approach for brain-computer interfaces (BCIs), introducing a new method named MINT. The authors develop a trajectory-centric approach to decode behaviors across several different datasets, including eight empirical datasets from the Neural Latents Benchmark. Overall, the paper is well written and their method shows impressive performance compared to more traditional decoding approaches that use a simpler approach. While there are some concerns (see below), the paper's strengths, particularly its emphasis on a trajectory-centric approach and the simplicity of MINT, provide a compelling contribution to the field. 

      We thank the reviewer for these comments. We share their enthusiasm for the trajectory-centric approach, and we are in complete agreement that this perspective has both scientific and decoding implications. The revision expands upon these strengths.

      Strengths: 

      The adoption of a trajectory-centric approach that utilizes statistical constraints presents a substantial shift in methodology, potentially revolutionizing the way BCIs interpret and predict neural behaviour. This is one of the strongest aspects of the paper. 

      Again, thank you. We also expect the trajectory-centric perspective to have a broad impact, given its relevance to both decoding and to thinking about manifolds.

      The thorough evaluation of the method across various datasets serves as an assurance that the superior performance of MINT is not a result of overfitting. The comparative simplicity of the method in contrast to many neural network approaches is refreshing and should facilitate broader applicability. 

      Thank you. We were similarly pleased to see such a simple method perform so well. We also agree that, while neural-network approaches will always be important, it is desirable to also possess simple ‘interpretable’ alternatives.

      Weaknesses:  

      Comment 1) Scope: Despite the impressive performance of MINT across multiple datasets, it seems predominantly applicable to M1/S1 data. Only one of the eight empirical datasets comes from an area outside the motor/somatosensory cortex. It would be beneficial if the authors could expand further on how the method might perform with other brain regions that do not exhibit low tangling or do not have a clear trial structure (e.g. decoding of position or head direction from hippocampus).

      We agree entirely. Population activity in many brain areas (especially outside the motor system) presumably will often not have the properties upon which MINT’s assumptions are built. This doesn’t necessarily mean that MINT would perform badly. Using simulated data, we have found that MINT can perform surprisingly well even when some of its assumptions are violated. Yet at the same time, when MINT’s assumptions don’t apply, one would likely prefer to use other methods. This is, after all, one of the broader themes of the present study: it is beneficial to match decoding assumptions to empirical properties. We have thus added a section on this topic early in the Discussion: 

      “In contrast, MINT and the Kalman filter performed comparably on simulated data that better approximated the assumptions in Figure 1a. Thus, MINT is not a ‘better’ algorithm – simply better aligned with the empirical properties of motor cortex data. This highlights an important caveat. Although MINT performs well when decoding from motor areas, its assumptions may be a poor match in other areas (e.g. the hippocampus). MINT performed well on two non-motor-cortex datasets – Area2_Bump (S1) and DMFC_RSG (dorsomedial frontal cortex) – yet there will presumably be other brain areas and/or contexts where one would prefer a different method that makes assumptions appropriate for that area.”

      Comment 2) When comparing methods, the neural trajectories of MINT are based on averaged trials, while the comparison methods are trained on single trials. An additional analysis might help in disentangling the effect of the trial averaging. For this, the authors could average the input across trials for all decoders, establishing a baseline for averaged trials. Note that inference should still be done on single trials. Performance can then be visualized across different values of N, which denotes the number of averaged trials used for training. 

      We explored this question and found that the non-MINT decoders are harmed, not helped, by the inclusion of trial-averaged responses in the training set. This is presumably because the statistics of trial-averaged responses don’t resemble what will be observed during decoding. This statistical mismatch, between training and decoding, hurts most methods. It doesn’t hurt MINT, because MINT doesn’t ‘train’ in the normal way. It simply needs to know rates, and trial-averaging is a natural way to obtain them. To describe the new analysis, we have added the following to the text.

      “We also investigated the possibility that MINT gained its performance advantage simply by having access to trial-averaged neural trajectories during training, while all other methods were trained on single-trial data. This difference arises from the fundamental requirements of the decoder architectures: MINT needs to estimate typical trajectories while other methods don’t. Yet it might still be the case that other methods would benefit from including trial-averaged data in the training set, in addition to single-trial data. Alternatively, this might harm performance by creating a mismatch, between training and decoding, in the statistics of decoder inputs. We found that the latter was indeed the case: all non-MINT methods performed better when trained purely on single-trial data.”
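
      As a rough illustration of why trial averaging is a natural way to obtain rates, the Python sketch below averages binned spike counts across trials and divides by the bin width. The function name `trial_averaged_rates`, the array layout, the 20 ms bin width, and the synthetic Poisson data are all assumptions made for this example; none of them come from the manuscript, and the actual MINT pipeline may smooth or process rates differently.

```python
import numpy as np

def trial_averaged_rates(spike_counts, bin_width_s):
    """Estimate firing rates (spikes/s) by averaging binned counts over trials.

    spike_counts: array of shape (n_trials, n_neurons, n_bins)
    Returns an array of shape (n_neurons, n_bins).
    """
    counts = np.asarray(spike_counts, dtype=float)
    if counts.ndim != 3:
        raise ValueError("expected shape (n_trials, n_neurons, n_bins)")
    # Mean count per bin across trials, converted to spikes per second.
    return counts.mean(axis=0) / bin_width_s

# Synthetic example (hypothetical values): 2 neurons, 3 time bins.
rng = np.random.default_rng(0)
true_rates = np.array([[5.0, 20.0, 40.0],
                       [30.0, 10.0, 2.0]])
bin_width_s = 0.02
# Each single trial is a noisy Poisson draw around the underlying rates.
single_trials = rng.poisson(true_rates * bin_width_s, size=(2000, 2, 3))
estimated = trial_averaged_rates(single_trials, bin_width_s)
```

      Any single trial in this toy dataset contains mostly zeros and ones, whereas the average across trials approaches the underlying rates. This is the kind of statistical difference between single-trial and trial-averaged inputs that the response above points to.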

      Reviewer #2:

      Summary: 

      The goal of this paper is to present a new method, termed MINT, for decoding behavioral states from neural spiking data. MINT is a statistical method which, in addition to outputting a decoded behavioral state, also provides soft information regarding the likelihood of that behavioral state based on the neural data. The innovation in this approach is that neural states are assumed to come from sparsely distributed neural trajectories with low tangling, meaning that neural trajectories (time sequences of neural states) are sparse in the high-dimensional space of neural spiking activity and that two dissimilar neural trajectories tend to correspond to dissimilar behavioral trajectories. The authors support these assumptions through analysis of previously collected data, and then validate the performance of their method by comparing it to a suite of alternative approaches. The authors attribute the typically improved decoding performance by MINT to its assumptions being more faithfully aligned to the properties of neural spiking data relative to assumptions made by the alternatives. 

      We thank the reviewer for this accurate summary, and for highlighting the subtle but important fact that MINT provides information regarding likelihoods. The revision includes a new analysis (Figure 6e) illustrating one potential way to leverage knowledge of likelihoods.

      Strengths:  

      The paper did an excellent job critically evaluating common assumptions made by neural analytical methods, such as neural state being low-dimensional relative to the number of recorded neurons. The authors made strong arguments, supported by evidence and literature, for potentially high-dimensional neural states and thus the need for approaches that do not rely on an assumption of low dimensionality. 

      Thank you. We also hope that the shift in perspective is the most important contribution of the study. This shift matters both scientifically and for decoder design. The revision expands on this strength. The scientific alternatives are now more clearly and concretely illustrated (especially see Figure 1a,b and Figure 6a,b). We also further explore their decoding implications with new data (Figure 6c-g).

      The paper was thorough in considering multiple datasets across a variety of behaviors, as well as existing decoding methods, to benchmark the MINT approach. This provided a valuable comparison to validate the method. The authors also provided nice intuition regarding why MINT may offer performance improvement in some cases and in which instances MINT may not perform as well. 

      Thank you. We were pleased to be able to provide comparisons across so many datasets (we are grateful to the Neural Latents Benchmark for making this possible).

      In addition to providing a philosophical discussion as to the advantages of MINT and benchmarking against alternatives, the authors also provided a detailed description of practical considerations. This included training time, amount of training data, robustness to data loss or changes in the data, and interpretability. These considerations not only provided objective evaluation of practical aspects but also provided insights to the flexibility and robustness of the method as they relate back to the underlying assumptions and construction of the approach. 

      Thank you. We are glad that these sections were appreciated. MINT’s simplicity and interpretability are indeed helpful in multiple ways, and afford opportunities for interesting future extensions. One potential benefit of interpretability is now explored in the newly added Figure 6e. 

      Impact: 

      This work is motivated by brain-computer interfaces applications, which it will surely impact in terms of neural decoder design. However, this work is also broadly impactful for neuroscientific analysis to relate neural spiking activity to observable behavioral features. Thus, MINT will likely impact neuroscience research generally. The methods are made publicly available, and the datasets used are all in public repositories, which facilitates adoption and validation of this method within the greater scientific community. 

      Again, thank you. We have similar hopes for this study.

      Weaknesses (1 & 2 are related, and we have switched their order in addressing them): 

      Comment 2) With regards to the idea of neural and behavioral trajectories having different geometries, this is dependent on what behavioral variables are selected. In the example for Fig 2a, the behavior is reach position. The geometry of the behavioral trajectory of interest would look different if instead the behavior of interest was reach velocity. The paper would be strengthened by acknowledgement that geometries of trajectories are shaped by extrinsic choices rather than (or as much as they are) intrinsic properties of the data. 

      We agree. Indeed, we almost added a section to the original manuscript on this exact topic. We have now done so:

      “A potential concern regarding the analyses in Figure 2c,d is that they require explicit choices of behavioral variables: muscle population activity in Figure 2c and angular phase and velocity in Figure 2d. Perhaps these choices were misguided. Might neural and behavioral geometries become similar if one chooses ‘the right’ set of behavioral variables? This concern relates to the venerable search for movement parameters that are reliably encoded by motor cortex activity [69, 92–95]. If one chooses the wrong set of parameters (e.g. chooses muscle activity when one should have chosen joint angles) then of course neural and behavioral geometries will appear non-isometric. There are two reasons why this ‘wrong parameter choice’ explanation is unlikely to account for the results in Figure 2c,d. First, consider the implications of the left-hand side of Figure 2d. A small kinematic distance implies that angular position and velocity are nearly identical for the two moments being compared. Yet the corresponding pair of neural states can be quite distant. Under the concern above, this distance would be due to other encoded behavioral variables – perhaps joint angle and joint velocity – differing between those two moments. However, there are not enough degrees of freedom in this task to make this plausible. The shoulder remains at a fixed position (because the head is fixed) and the wrist has limited mobility due to the pedal design [60]. Thus, shoulder and elbow angles are almost completely determined by cycle phase. More generally, ‘external variables’ (positions, angles, and their derivatives) are unlikely to differ more than slightly when phase and angular velocity are matched. Muscle activity could be different because many muscles act on each joint, creating redundancy. However, as illustrated in Figure 2c, the key effect is just as clear when analyzing muscle activity. Thus, the above concern seems unlikely even if it can’t be ruled out entirely. 
      A broader reason to doubt the ‘wrong parameter choice’ proposition is that it provides a vague explanation for a phenomenon that already has a straightforward explanation. A lack of isometry between the neural population response and behavior is expected when neural-trajectory tangling is low and output-null factors are plentiful [55, 60]. For example, in networks that generate muscle activity, neural and muscle-activity trajectories are far from isometric [52, 58, 60]. Given this straightforward explanation, and given repeated failures over decades to find the ‘correct’ parameters (muscle activity, movement direction, etc.) that create neural-behavior isometry, it seems reasonable to conclude that no such isometry exists.”

      Comment 1) The authors posit that neural and behavioral trajectories are non-isometric. To support this point, they look at distances between neural states and distances between the corresponding behavioral states, in order to demonstrate that there are differences in these distances in each respective space. This supports the idea that neural states and behavioral states are non-isometric but does not directly address their point. In order to say the trajectories are non-isometric, it would be better to look at pairs of distances between corresponding trajectories in each space. 

      We like this idea and have added such an analysis. To be clear, we like the original analysis too: isometry predicts that neural and behavioral distances (for corresponding pairs of points) should be strongly correlated, and that small behavioral distances should not be associated with large neural distances. Neither prediction holds, providing a strong argument against isometry. The reviewer’s suggested analysis makes the same larger point, and also reveals some additional facts (e.g. it reveals that muscle-geometry is more related to neural-geometry than is kinematic-geometry). The new analysis is described in the following section:

      “We further explored the topic of isometry by considering pairs of distances. To do so, we chose two random neural states and computed their distance, yielding dneural1. We repeated this process, yielding dneural2. We then computed the corresponding pair of distances in muscle space (dmuscle1 and dmuscle2) and kinematic space (dkin1 and dkin2). We considered cases where dneural1 was meaningfully larger than (or smaller than) dneural2, and asked whether the behavioral variables had the same relationship; e.g. was dmuscle1 also larger than dmuscle2? For kinematics, this relationship was weak: across 100,000 comparisons, the sign of dkin1 − dkin2 agreed with dneural1 − dneural2 only 67.3% of the time (with 50% being chance). The relationship was much stronger for muscles: the sign of dmuscle1 − dmuscle2 agreed with dneural1 − dneural2 79.2% of the time, which is far more than expected by chance yet also far from what is expected given isometry (e.g. the sign agrees 99.7% of the time for the truly isometric control data in Figure 2e). Indeed there were multiple moments during this task when dneural1 was much larger than dneural2, yet dmuscle1 was smaller than dmuscle2. These observations are consistent with the proposal that neural trajectories resemble muscle trajectories in some dimensions, but with additional output-null dimensions that break the isometry [60].”
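
      The pairs-of-distances comparison described above can be sketched in Python as follows. This is a minimal sketch on synthetic data, not the analysis code used for the manuscript: the function name `sign_agreement`, the `min_ratio` criterion standing in for "meaningfully larger", and the Gaussian test data are all assumptions made for illustration.

```python
import numpy as np

def sign_agreement(neural, behavior, n_comparisons=100_000, min_ratio=1.5, seed=0):
    """Fraction of comparisons where behavioral distances are ordered the
    same way as the corresponding neural distances.

    neural:   (n_samples, n_neural_dims), one row per moment in time
    behavior: (n_samples, n_behavior_dims), rows aligned with `neural`
    min_ratio: hypothetical threshold for "meaningfully larger"
    """
    rng = np.random.default_rng(seed)
    n = neural.shape[0]
    # Two random pairs of states per comparison: (i, j) and (k, l).
    i, j, k, l = rng.integers(0, n, size=(4, n_comparisons))
    d_neural1 = np.linalg.norm(neural[i] - neural[j], axis=1)
    d_neural2 = np.linalg.norm(neural[k] - neural[l], axis=1)
    d_beh1 = np.linalg.norm(behavior[i] - behavior[j], axis=1)
    d_beh2 = np.linalg.norm(behavior[k] - behavior[l], axis=1)
    # Keep only comparisons where the two neural distances clearly differ.
    larger = np.maximum(d_neural1, d_neural2)
    smaller = np.minimum(d_neural1, d_neural2)
    keep = larger > min_ratio * smaller
    same_sign = np.sign(d_beh1 - d_beh2) == np.sign(d_neural1 - d_neural2)
    return same_sign[keep].mean()

# Synthetic states (hypothetical): 10 "neural" dimensions.
rng = np.random.default_rng(1)
neural = rng.normal(size=(500, 10))
# Isometric control: a rotation of the neural states preserves all distances.
rotation, _ = np.linalg.qr(rng.normal(size=(10, 10)))
isometric = sign_agreement(neural, neural @ rotation)
# Non-isometric case: "behavior" reflects only 2 of the 10 dimensions, so the
# other 8 act like output-null dimensions.
partial = sign_agreement(neural, neural[:, :2])
```

      In this toy setting the rotated control preserves every distance, so agreement should be essentially perfect, while the two-dimensional readout should land between chance and perfect agreement. The specific percentages reported in the quoted passage come from the authors' data and are not reproduced by this sketch.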

      Comment 3) The approach is built up on the idea of creating a "mesh" structure of possible states. In the body of the paper the definition of the mesh was not entirely clear and I could not find in the methods a more rigorous explicit definition. Since the mesh is integral to the approach, the paper would be improved with more description of this component. 

      This is a fair criticism. Although MINT’s actual operations were well-documented, how those operations mapped onto the term ‘mesh’ was, we agree, a bit vague. The definition of the mesh is a bit subtle because it only emerges during decoding rather than being precomputed. This is part of what gives MINT much more flexibility than a lookup table. We have added the following to the manuscript.

      “We use the term ‘mesh’ to describe the scaffolding created by the training-set trajectories and the interpolated states that arise at runtime. The term mesh is apt because, if MINT’s assumptions are correct, interpolation will almost always be local. If so, the set of decodable states will resemble a mesh, created by line segments connecting nearby training-set trajectories. However, this mesh-like structure is not enforced by MINT’s operations.

      Interpolation could, in principle, create state-distributions that depart from the assumption of a sparse manifold. For example, interpolation could fill in the center of the green tube in Figure 1b, resulting in a solid manifold rather than a mesh around its outer surface. However, this would occur only if spiking observations argued for it. As will be documented below, we find that essentially all interpolation is local”

      We have also added Figure 4d. This new analysis documents the fact that decoded states are near training-set trajectories, which is why the term ‘mesh’ is appropriate.
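      To make the notion of local interpolation concrete, the following is a minimal sketch of interpolating between two candidate library states. It is not the implementation used in the study. It assumes that the interpolated rates are a convex combination of the two candidates' rates, that spike counts are Poisson, and that a simple grid search over the mixing weight suffices. The function name `interpolate_states` and its parameters are hypothetical.

```python
import numpy as np

def interpolate_states(rates_a, rates_b, counts, bin_s=0.020, n_grid=101):
    """Find the weight alpha in [0, 1] whose interpolated rates,
    (1 - alpha) * rates_a + alpha * rates_b, best explain the spike counts.

    rates_a, rates_b: (n_neurons, n_bins) firing rates in spikes/s
    counts:           (n_neurons, n_bins) observed spike counts
    """
    alphas = np.linspace(0.0, 1.0, n_grid)
    w = alphas[:, None, None]
    # Expected counts per bin for every candidate alpha: (n_grid, n_neurons, n_bins).
    lam = ((1.0 - w) * rates_a + w * rates_b) * bin_s
    lam = np.maximum(lam, 1e-9)  # assumed floor to avoid log(0)
    # Poisson log-likelihood; the log(k!) term is constant in alpha and omitted.
    loglik = (counts * np.log(lam) - lam).sum(axis=(1, 2))
    best = int(np.argmax(loglik))
    return alphas[best], loglik[best]
```

      Because the Poisson log-likelihood is concave in alpha, a grid search could be replaced by any one-dimensional optimizer. The grid is used here only for clarity.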

      Reviewer #3:

      Summary:  

      This manuscript develops a new method termed MINT for decoding of behavior. The method is essentially a table-lookup rather than a model. Within a given stereotyped task, MINT tabulates averaged firing rate trajectories of neurons (neural states) and corresponding averaged behavioral trajectories as stereotypes to construct a library. For a test trial with a realized neural trajectory, it then finds the closest neural trajectory to it in the table and declares the associated behavior trajectory in the table as the decoded behavior. The method can also interpolate between these tabulated trajectories. The authors mention that the method is based on three key assumptions: (1) Neural states may not be embedded in a low-dimensional subspace, but rather in a high-dimensional space. (2) Neural trajectories are sparsely distributed under different behavioral conditions. (3) These neural states traverse trajectories in a stereotyped order.

      The authors conducted multiple analyses to validate MINT, demonstrating its decoding of behavioral trajectories in simulations and datasets (Figures 3, 4). The main behavior decoding comparison is shown in Figure 4. In stereotyped tasks, decoding performance is comparable (M_Cycle, MC_Maze) or better (Area 2_Bump) than other linear/nonlinear algorithms (Figure 4). However, MINT underperforms for the MC_RTT task, which is less stereotyped (Figure 4).

      This paper is well-structured and its main idea is clear. The fact that performance on stereotyped tasks is high is interesting and informative, showing that these stereotyped tasks create stereotyped neural trajectories. The task-specific comparisons include various measures and a variety of common decoding approaches, which is a strength. However, I have several major concerns. I believe several of the conclusions in the paper, which are also emphasized in the abstract, are not accurate or supported, especially about generalization, computational scalability, and utility for BCIs. MINT is essentially a table-lookup algorithm based on stereotyped task-dependent trajectories and involves the tabulation of extensive data to build a vast library without modeling. These aspects will limit MINT's utility for real-world BCIs and tasks. These properties will also limit MINT's generalizability from task to task, which is important for BCIs and thus is commonly demonstrated in BCI experiments with other decoders without any retraining. Furthermore, MINT's computational and memory requirements can be prohibitive it seems. Finally, as MINT is based on tabulating data without learning models of data, I am unclear how it will be useful in basic investigations of neural computations. I expand on these concerns below.  

      We thank the reviewer for pointing out weaknesses in our framing and presentation. The comments above made us realize that we needed to 1) better document the ways in which MINT is far more flexible than a lookup-table, and 2) better explain the competing scientific perspectives at play. R3’s comments also motivated us to add an additional analysis of generalization. In our view the manuscript is greatly improved by these additions. Specifically, these additions directly support the broader impact that we hope the study will have.

      For simplicity and readability, we first group and summarize R3’s main concerns in order to better address them. (These main concerns are all raised above, in addition to recurring in the specific comments below. Responses to each individual specific comment are provided after these summaries.)

      (1) R3 raises concerns about ‘computational scalability.’ The concern is that “MINT's computational and memory requirements can be prohibitive.” This point was expanded upon in a specific comment, reproduced below:

      I also find the statement in the abstract and paper that "computations are simple, scalable" to be inaccurate. The authors state that MINT's computational cost is O(NC) only, but it seems this is achieved at a high memory cost as well as computational cost in training. The process is described in section "Lookup table of log-likelihoods" on line [978-990]. The idea is to precompute the log-likelihoods for any combination of all neurons with discretization x all delay/history segments x all conditions and to build a large lookup table for decoding. Basically, the computational cost of precomputing this table is O(V^{Nτ} x TC) and the table requires a memory of O(V^{Nτ}), where V is the number of discretization points for the neural firing rates, N is the number of neurons, τ is the history length, T is the trial length, and C is the number of conditions. This is a very large burden, especially the V^{Nτ} term. This cost is currently not mentioned in the manuscript and should be clarified in the main text. Accordingly, computation claims should be modified including in the abstract.

      The revised manuscript clarifies that our statement (that computations are simple and scalable) is absolutely accurate. There is no need to compute, or store, a massive lookup table. There are three tables: two of modest size and one that is tiny. This is now better explained:

      “Thus, the log-likelihood of , for a particular current neural state, is simply the sum of many individual log-likelihoods (one per neuron and time-bin). Each individual log-likelihood depends on only two numbers: the firing rate at that moment and the spike count in that bin. To simplify online computation, one can precompute the log-likelihood, under a Poisson model, for every plausible combination of rate and spike-count. For example, a lookup table of size 2001 × 21 is sufficient when considering rates that span 0-200 spikes/s in increments of 0.1 spikes/s, and considering 20 ms bins that contain at most 20 spikes (only one lookup table is ever needed, so long as its firing-rate range exceeds that of the most-active neuron at the most active moment in Ω). Now suppose we are observing a population of 200 neurons, with a 200 ms history divided into ten 20 ms bins. For each library state, the log-likelihood of the observed spike-counts is simply the sum of 200 × 10 = 2000 individual log-likelihoods, each retrieved from the lookup table. In practice, computation is even simpler because many terms can be reused from the last time bin using a recursive solution (Methods). This procedure is lightweight and amenable to real-time applications.”
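      The lookup described in the quoted passage can be sketched as follows. This is an illustrative sketch rather than the study's code: the names `LOGLIK` and `state_loglik` are hypothetical, the small floor applied to the zero-rate row is an assumption made here to avoid log(0), and the recursive reuse of terms mentioned in the passage is not shown.

```python
import numpy as np
from math import lgamma

BIN_S = 0.020      # 20 ms bins
RATE_STEP = 0.1    # rate quantization, spikes/s

rates = np.arange(0, 2001) * RATE_STEP   # 0 to 200 spikes/s: 2001 values
counts = np.arange(0, 21)                # 0 to 20 spikes per bin: 21 values

# Poisson log-likelihood: k * log(lam) - lam - log(k!), with lam = rate * bin width.
lam = np.maximum(rates * BIN_S, 1e-6)[:, None]   # assumed floor for the zero-rate row
log_fact = np.array([lgamma(k + 1) for k in counts])
LOGLIK = counts[None, :] * np.log(lam) - lam - log_fact[None, :]   # shape (2001, 21)

def state_loglik(rate_hist, count_hist):
    """Log-likelihood of observed counts for one candidate library state.

    rate_hist:  (n_neurons, n_bins) rates in spikes/s for the candidate state
    count_hist: (n_neurons, n_bins) integer spike counts over the same history
    """
    r_idx = np.clip(np.rint(rate_hist / RATE_STEP).astype(int), 0, len(rates) - 1)
    k_idx = np.clip(count_hist, 0, counts[-1])
    # One table lookup per neuron and bin, then a sum.
    return LOGLIK[r_idx, k_idx].sum()
```

      For 200 neurons and ten bins, `state_loglik` performs the 2000 lookups and a single sum described above. The table itself is built once and shared across all neurons and conditions.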

      In summary, the first table simply needs to contain the firing rate of each neuron, for each condition, and each time in that condition. This table consumes relatively little memory. Assuming 100 one-second-long conditions (rates sampled every 20 ms) and 200 neurons, the table would contain 100 x 50 x 200 = 1,000,000 numbers. These numbers are typically stored as 16-bit integers (because rates are quantized), which amounts to about 2 MB. This is modest, given that most computers have (at least) tens of GB of RAM. A second table would contain the values for each behavioral variable, for each condition, and each time in that condition. This table might contain behavioral variables at a finer resolution (e.g. every millisecond) to enable decoding to update in between 20 ms bins (1 ms granularity is not needed for most BCI applications, but is the resolution used in this study). The number of behavioral variables of interest for a particular BCI application is likely to be small, often 1-2, but let’s assume for this example it is 10 (e.g. x-, y-, and z-position, velocity, and acceleration of a limb, plus one other variable). This table would thus contain 100 x 1000 x 10 = 1,000,000 floating point numbers, i.e. an 8 MB table. The third table is used to store the probability of s spikes being observed given a particular quantized firing rate (e.g. it may contain probabilities associated with firing rates ranging from 0 – 200 spikes/s in 0.1 spikes/s increments). This table is not necessary, but saves some computation time by precomputing numbers that will be used repeatedly. This is a very small table (typically ~2000 x 20, i.e. 320 KB). It does not need to be repeated for different neurons or conditions, because Poisson probabilities depend on only rate and count.
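      The memory estimates above follow from simple arithmetic, reproduced here as a check. The helper `table_bytes` is hypothetical, and the 8 bytes per entry for the second and third tables is an assumption (64-bit floats) that is consistent with the 8 MB and 320 KB figures quoted above.

```python
def table_bytes(shape, bytes_per_entry):
    """Memory footprint of a dense table with the given shape."""
    n_entries = 1
    for dim in shape:
        n_entries *= dim
    return n_entries * bytes_per_entry

# Table 1: conditions x 20 ms time bins x neurons, stored as 16-bit integers.
rate_table = table_bytes((100, 50, 200), 2)
# Table 2: conditions x 1 ms samples x behavioral variables, assumed 64-bit floats.
behavior_table = table_bytes((100, 1000, 10), 8)
# Table 3: quantized rates x spike counts, assumed 64-bit floats.
loglik_table = table_bytes((2000, 20), 8)

for name, size in [("rates", rate_table),
                   ("behavior", behavior_table),
                   ("log-likelihoods", loglik_table)]:
    print(f"{name}: {size / 1e6:.2f} MB")
```

      Together the three tables occupy roughly 10 MB in this example.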

      (2) R3 raises a concern that MINT “is essentially a table-lookup rather than a model.” R3 states that MINT

      “is essentially a table-lookup algorithm based on stereotyped task-dependent trajectories and involves the tabulation of extensive data to build a vast library without modeling.”

      and that,

      “as MINT is based on tabulating data without learning models of data, I am unclear how it will be useful in basic investigations of neural computations.”

      This concern is central to most subsequent concerns. The manuscript has been heavily revised to address it. The revisions clarify that MINT is much more flexible than a lookup table, even though MINT uses a lookup table as its first step. Because R3’s concern is intertwined with one’s scientific assumptions, we have also added the new Figure 1 to explicitly illustrate the two key scientific perspectives and their decoding implications. 

      Under the perspective in Figure 1a, R3 would be correct in saying that there exist traditional interpretable decoders (e.g. a Kalman filter) whose assumptions better model the data. Under this perspective, MINT might still be an excellent choice in many cases, but other methods would be expected to gain the advantage when situations demand more flexibility. This is R3’s central concern, and essentially all other concerns flow from it. It makes sense that R3 has this concern, because their comments repeatedly stress a foundational assumption of the perspective in Figure 1a: the assumption of a fixed low-dimensional neural subspace where activity has a reliable relationship to behavior that can be modeled and leveraged during decoding. The phrases below accord with that view:

      “Unlike MINT, these works can achieve generalization because they model the neural subspace and its association to movement.”

      “it will not generalize… even when these tasks cover the exact same space (e.g., the same 2D computer screen and associated neural space).”

      “For proper training, the training data should explore the whole movement space and the associated neural space”

      “I also believe the authors should clarify the logic behind developing MINT better. From a scientific standpoint, we seek to gain insights into neural computations by making various assumptions and building models that parsimoniously describe the vast amount of neural data rather than simply tabulating the data. For instance, low-dimensional assumptions have led to the development of numerous dimensionality reduction algorithms and these models have led to important interpretations about the underlying dynamics”

      Thus, R3 prefers a model that 1) assumes a low-dimensional subspace that is fixed across tasks and 2) assumes a consistent ‘association’ between neural activity and kinematics. Because R3 believes this is the correct model of the data, they believe that decoders should leverage it. Traditional interpretable methods do, and MINT doesn’t, which is why they find MINT to be unprincipled. This is a reasonable view, but it is not our view. We have heavily revised the manuscript to clarify that a major goal of our study is to explore the implications of a different, less-traditional scientific perspective.

      The new Figure 1a illustrates the traditional perspective. Under this perspective, one would agree with R3’s claim that other methods have the opportunity to model the data better. For example, suppose there exists a consistent neural subspace – conserved across tasks – where three neural dimensions encode 3D hand position and three additional neural dimensions encode 3D hand velocity. A traditional method such as a Kalman filter would be a very appropriate choice to model these aspects of the data.

      Figure 1b illustrates the alternative scientific perspective. This perspective arises from recent, present, and to-be-published observations. MINT’s assumptions are well-aligned with this perspective. In contrast, the assumptions of traditional methods (e.g. the Kalman filter) are not well-aligned with the properties of the data under this perspective. This does not mean traditional methods are not useful. Yet under Figure 1b, it is traditional methods, such as the Kalman filter, that lack an accurate model of the data. Of course, the reviewer may disagree with our scientific perspective. We would certainly concede that there is room for debate. However, we find the evidence for Figure 1b to be sufficiently strong that it is worth exploring the utility of methods that align with this scientific perspective. MINT is such a method. As we document, it performs very well.

      Thus, in our view, MINT is quite principled because its assumptions are well aligned with the data. It is true that the features of the data that MINT models are a bit different from those that are traditionally modeled. For example, R3 is quite correct that MINT does not attempt to use a biomimetic model of the true transformation from neural activity, to muscle activity, and thence to kinematics. We see this as a strength, and the manuscript has been revised accordingly (see paragraph beginning with “We leveraged this simulated data to compare MINT with a biomimetic decoder”).

      (3) R3 raises concerns that MINT cannot generalize. This was a major concern of R3 and is intimately related to concern #2 above. The concern is that, if MINT is “essentially a lookup table” that simply selects pre-defined trajectories, then MINT will not be able to generalize. R3 is quite correct that MINT generalizes rather differently than existing methods. Whether this is good or bad depends on one’s scientific perspective. Under Figure 1a, MINT’s generalization would indeed be limiting because other methods could achieve greater flexibility. Under Figure 1b, all methods will have serious limits regarding generalization. Thus, MINT’s method for generalizing may approximate the best one can presently do. To address this concern, we have made three major changes, numbered i-iii below:

      i) Large sections of the manuscript have been restructured to underscore the ways in which MINT can generalize. A major goal was to counter the impression, stated by R3 above, that: 

      “for a test trial with a realized neural trajectory, [MINT] then finds the closest neural trajectory to it in the table and declares the associated behavior trajectory in the table as the decoded behavior”.

      This description is a reasonable way to initially understand how MINT works, and we concede that we may have over-used this intuition. Unfortunately, it can leave the misimpression that MINT decodes by selecting whole trajectories, each corresponding to ‘a behavior’. This can happen, but it needn’t and typically doesn’t. As an example, consider the cycling task. Suppose that the library consists of stereotyped trajectories, each four cycles long, at five fixed speeds from 0.5-2.5 Hz. If the spiking observations argued for it, MINT could decode something close to one of these five stereotyped trajectories. Yet it needn’t. Decoded trajectories will typically resemble library trajectories locally, but may be very different globally. For example, a decoded trajectory could be thirty cycles long (or two, or five hundred) perhaps speeding up and slowing down multiple times across those cycles.

      Thus, the library of trajectories shouldn’t be thought of as specifying a limited set of whole movements that can be ‘selected from’. Rather, trajectories define a scaffolding that outlines where the neural state is likely to live and how it is likely to be changing over time. When we introduce the idea of library trajectories, we are now careful to stress that they don’t function as a set from which one trajectory is ‘declared’ to be the right one:

      “We thus designed MINT to approximate that manifold using the trajectories themselves, rather than their covariance matrix or corresponding subspace. Unlike a covariance matrix, neural trajectories indicate not only which states are likely, but also which state-derivatives are likely. If a neural state is near previously observed states, it should be moving in a similar direction. MINT leverages this directionality.

      Training-set trajectories can take various forms, depending on what is convenient to collect. Most simply, training data might include one trajectory per condition, with each condition corresponding to a discrete movement. Alternatively, one might instead employ one long trajectory spanning many movements. Another option is to employ many sub-trajectories, each briefer than a whole movement. The goal is simply for training-set trajectories to act as a scaffolding, outlining the manifold that might be occupied during decoding and the directions in which decoded trajectories are likely to be traveling.”

      Later in that same section we stress that decoded trajectories can move along the ‘mesh’ in non-stereotyped ways:

      “Although the mesh is formed of stereotyped trajectories, decoded trajectories can move along the mesh in non-stereotyped ways as long as they generally obey the flow-field implied by the training data. This flexibility supports many types of generalization, including generalization that is compositional in nature. Other types of generalization – e.g. from the green trajectories to the orange trajectories in Figure 1b – are unavailable when using MINT and are expected to be challenging for any method (as will be documented in a later section).”

      The section “Training and decoding using MINT” has been revised to clarify the ways in which interpolation is flexible, allowing decoded movements to be globally very different from any library trajectory.

      “To decode stereotyped trajectories, one could simply obtain the maximum-likelihood neural state from the library, then render a behavioral decode based on the behavioral state with the same values of c and k. This would be appropriate for applications in which conditions are categorical, such as typing or handwriting. Yet in most cases we wish for the trajectory library to serve not as an exhaustive set of possible states, but as a scaffolding for the mesh of possible states. MINT’s operations are thus designed to estimate any neural trajectory – and any corresponding behavioral trajectory – that moves along the mesh in a manner generally consistent with the trajectories in Ω.”

      “…interpolation allows considerable flexibility. Not only is one not ‘stuck’ on a trajectory from Φ, one is also not stuck on trajectories created by weighted averaging of trajectories in Φ. For example, if cycling speed increases, the decoded neural state could move steadily up a scaffolding like that illustrated in Figure 1b (green). In such cases, the decoded trajectory might be very different in duration from any of the library trajectories. Thus, one should not think of the library as a set of possible trajectories that are selected from, but rather as providing a mesh-like scaffolding that defines where future neural states are likely to live and the likely direction of their local motion. The decoded trajectory may differ considerably from any trajectory within Ω.”

      This flexibility is indeed used during movement. One empirical example is described in detail:

      “During movement… angular phase was decoded with effectively no net drift over time. This is noteworthy because angular velocity on test trials never perfectly matched any of the trajectories in Φ. Thus, if decoding were restricted to a library trajectory, one would expect growing phase discrepancies. Yet decoded trajectories only need to locally (and approximately) follow the flow-field defined by the library trajectories. Based on incoming spiking observations, decoded trajectories speed up or slow down (within limits).

      This decoding flexibility presumably relates to the fact that the decoded neural state is allowed to differ from the nearest state in Ω. To explore… [the text goes on to describe the new analysis in Figure 4d, which shows that the decoded state is typically not on any trajectory, though it is typically close to a trajectory].”

      Thus, MINT’s operations allow considerable flexibility, including generalization that is compositional in nature. Yet R3 is still correct that there are other forms of generalization that are unavailable to MINT. This is now stressed at multiple points in the revision. However, under the perspective in Figure 1b, these forms of generalization are unavailable to any current method. Hence we made a second major change in response to this concern…

      ii) We explicitly illustrate how the structure of the data determines when generalization is or isn’t possible. The new Figure 1a,b introduces the two perspectives, and the new Figure 6a,b lays out their implications for generalization. Under the perspective in Figure 6a, the reviewer is quite right: other methods can generalize in ways that MINT cannot. Under the perspective in Figure 6b, expectations are very different. Those expectations make testable predictions. Hence the third major change…

      iii) We have added an analysis of generalization, using a newly collected dataset. This dataset was collected using Neuropixels Probes during our Pac-Man force-tracking task. This dataset was chosen because it is unusually well-suited to distinguishing the predictions in Figure 6a versus Figure 6b. Finding a dataset that can do so is not simple. Consider R3’s point that training data should “explore the whole movement space and the associated neural space”. The physical simplicity of the Pac-Man task makes it unusually easy to confirm that the behavioral workspace has been fully explored. Importantly, under Figure 6b, this does not mean that the neural workspace has been fully explored, which is exactly what we wish to test when testing generalization. We do so, and compare MINT with a Wiener filter. A Wiener filter is an ideal comparison because it is simple, performs very well on this task, and should be able to generalize well under Figure 1a. Additionally, the Wiener filter (unlike the Kalman filter) doesn’t leverage the assumption that neural activity reflects the derivative of force. This matters because we find that neural activity does not reflect dforce/dt in this task. The Wiener filter is thus the most natural choice of the interpretable methods whose assumptions match Figure 1a.
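      For readers less familiar with the comparison method, a Wiener filter decoder amounts to linear regression from a short history of binned spike counts onto the behavioral variable. The sketch below is a generic illustration, not the configuration used in the study: the function names, the number of lags, and the unregularized least-squares fit are all assumptions.

```python
import numpy as np

def lagged_design(spikes, n_lags):
    """Stack the most recent n_lags bins of every neuron into one row per time step.

    spikes: (T, N) binned spike counts
    returns: (T - n_lags + 1, N * n_lags + 1) design matrix with a bias column
    """
    T, _ = spikes.shape
    rows = [spikes[t - n_lags + 1 : t + 1].ravel() for t in range(n_lags - 1, T)]
    X = np.asarray(rows, dtype=float)
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_wiener(spikes, behavior, n_lags=10):
    """Least-squares weights mapping lagged spike counts to behavior (T, D)."""
    X = lagged_design(spikes, n_lags)
    Y = behavior[n_lags - 1:]          # align targets with the end of each history window
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def predict_wiener(W, spikes, n_lags=10):
    """Decode behavior for every time step that has a full history window."""
    return lagged_design(spikes, n_lags) @ W
```

      Note that nothing in this model ties the decoded variable to its own derivative or to a state-transition model, which is the property that distinguishes it from the Kalman filter in the discussion above.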

      The new analysis is described in Figure 6c-g and accompanying text. Results are consistent with the predictions of Figure 6b. We are pleased to have been motivated to add this analysis for two reasons. First, it provides an additional way of evaluating the predictions of the two competing scientific perspectives that are at the heart of our study. Second, this analysis illustrates an underappreciated way in which generalization is likely to be challenging for any decoding method. It can be tempting to think that the main challenge regarding generalization is to fully explore the relevant behavioral space. This makes sense if a behavioral space has “an associated neural space”. However, we are increasingly of the opinion that it doesn’t. Different tasks often involve different neural subspaces, even when behavioral subspaces overlap. We have even seen situations where motor output is identical but neural subspaces are quite different. These facts are relevant to any decoder, something highlighted in the revised Introduction:

      “MINT’s performance confirms that there are gains to be made by building decoders whose assumptions match a different, possibly more accurate view of population activity. At the same time, our results suggest fundamental limits on decoder generalization. Under the assumptions in Figure 1b, it will sometimes be difficult or impossible for decoders to generalize to not-yet-seen tasks. We found that this was true regardless of whether one uses MINT or a more traditional method. This finding has implications regarding when and how generalization should be attempted.”

      We have also added an analysis (Figure 6e) illustrating how MINT’s ability to compute likelihoods can be useful in detecting situations that may strain generalization (for any method). MINT is unusual in being able to compute and use likelihoods in this way.

      Detailed responses to R3: we reproduce each of R3’s specific concerns below, but concentrate our responses on issues not already covered above.

      Main comments: 

      Comment 1. MINT does not generalize to different tasks, which is a main limitation for BCI utility compared with prior BCI decoders that have shown this generalizability as I review below. Specifically, given that MINT tabulates task-specific trajectories, it will not generalize to tasks that are not seen in the training data even when these tasks cover the exact same space (e.g., the same 2D computer screen and associated neural space). 

      First, the authors provide a section on generalization, which is inaccurate because it mixes up two fundamentally different concepts: 1) collecting informative training data and 2) generalizing from task to task. The former is critical for any algorithm, but it does not imply the latter. For example, removing one direction of cycling from the training set as the authors do here is an example of generating poor training data because the two behavioral (and neural) directions are non-overlapping and/or orthogonal while being in the same space. As such, it is fully expected that all methods will fail. For proper training, the training data should explore the whole movement space and the associated neural space, but this does not mean all kinds of tasks performed in that space must be included in the training set (something MINT likely needs while modeling-based approaches do not). Many BCI studies have indeed shown this generalization ability using a model. For example, in Weiss et al. 2019, center-out reaching tasks are used for training and then the same trained decoder is used for typing on a keyboard or drawing on the 2D screen. In Gilja et al. 2012, training is on a center-out task but the same trained decoder generalizes to a completely different pinball task (hit four consecutive targets) and tasks requiring the avoidance of obstacles and curved movements. There are many more BCI studies, such as Jarosiewicz et al. 2015 that also show generalization to complex real-world tasks not included in the training set. Unlike MINT, these works can achieve generalization because they model the neural subspace and its association to movement. On the contrary, MINT models task-dependent neural trajectories, so the trained decoder is very task-dependent and cannot generalize to other tasks. So, unlike these prior BCI methods, MINT will likely actually need to include every task in its library, which is not practical.

      I suggest the authors remove claims of generalization and modify their arguments throughout the text and abstract. The generalization section needs to be substantially edited to clarify the above points. Please also provide the BCI citations and discuss the above limitation of MINT for BCIs. 

      As discussed above, R3’s concerns are accurate under the view in Figure 1a (and the corresponding Figure 6a). Under this view, a method such as that in Gilja et al. or Jarosiewicz et al. can find the correct subspace, model the correct neuron-behavior correlations, and generalize to any task that uses “the same 2D computer screen and associated neural space”, just as the reviewer argues. Under Figure 1b things are quite different.

      This topic – and the changes we have made to address it – is covered at length above. Here we simply want to highlight an empirical finding: sometimes two tasks use the same neural subspace and sometimes they don’t. We have seen both in recent data, and it can be very non-obvious which will occur based just on behavior. It does not simply relate to whether one is using the same physical workspace. We have even seen situations where the patterns of muscle activity in two tasks are nearly identical, but the neural subspaces are fairly different. When a new task uses a new subspace, neither of the methods noted above (Gilja et al. or Jarosiewicz et al.) will generalize (nor will MINT). Generalizing to a new subspace is basically impossible without some yet-to-be-invented approach. On the other hand, there are many other pairs of tasks (center-out-reaching versus some other 2D cursor control) where subspaces are likely to be similar, especially if the frequency content of the behavior is similar (in our recent experience this is often critical). When subspaces are shared, most methods will generalize, and that is presumably why generalization worked well in the studies noted above.

      Although MINT can also generalize in such circumstances, R3 is correct that, under the perspective in Figure 1a, MINT will be more limited than other methods. This is now carefully illustrated in Figure 6a. In this traditional perspective, MINT will fail to generalize in cases where new trajectories are near previously observed states, yet move in very different ways from library trajectories. The reason we don’t view this as a shortcoming is that we expect it to occur rarely (else tangling would be high). We thus anticipate the scenario in Figure 6b.

      This is worth stressing because R3 states that our discussion of generalization “is inaccurate because it mixes up two fundamentally different concepts: 1) collecting informative training data and 2) generalizing from task to task.” We have heavily revised this section and improved it. However, it was never inaccurate. Under Figure 6b, these two concepts absolutely are mixed up. If different tasks use different neural subspaces, then this requires collecting different “informative training data” for each. One cannot simply count on having explored the physical workspace.

      Comment 2. MINT is shown to achieve competitive/high performance in highly stereotyped datasets with structured trials, but worse performance on MC_RTT, which is not based on repeated trials and is less stereotyped. This shows that MINT is valuable for decoding in repetitive stereotyped use-cases. However, it also highlights a limitation of MINT for BCIs, which is that MINT may not work well for real-world and/or less-constrained setups such as typing, moving a robotic arm in 3D space, etc. This is again due to MINT being a lookup table with a library of stereotyped trajectories rather than a model. Indeed, the authors acknowledge that the lower performance on MC_RTT (Figure 4) may be caused by the lack of repeated trials of the same type. However, real-world BCI decoding scenarios will also not have such stereotyped trial structure and will be less/un-constrained, in which MINT underperforms. Thus, the claim in the abstract or lines 480-481 that MINT is an "excellent" candidate for clinical BCI applications is not accurate and needs to be qualified. The authors should revise their statements accordingly and discuss this issue. They should also make the use-case of MINT on BCI decoding clearer and more convincing.

      We discussed, above, multiple changes and additions to the revision that were made to address these concerns. Here we briefly expand on the comment that MINT achieves “worse performance on MC_RTT, which is not based on repeated trials and is less stereotyped”. All decoders performed poorly on this task. MINT still outperformed the two traditional methods, but this was the only dataset where MINT did not also perform better (overall) than the expressive GRU and feedforward network. There are probably multiple reasons why. We agree with R3 that one likely reason is that this dataset is straining generalization, and MINT may have felt this strain more than the two machine-learning-based methods. Another potential reason is the structure of the training data, which made it more challenging to obtain library trajectories in the first place. Importantly, these observations do not support the view in Figure 1a. MINT still outperformed the Kalman and Wiener filters (whose assumptions align with Fig. 1a). To make these points we have added the following:

      “Decoding was acceptable, but noticeably worse, for the MC_RTT dataset… As will be discussed below, every decode method achieved its worst estimates of velocity for the MC_RTT dataset. In addition to the impact of slower reaches, MINT was likely impacted by training data that made it challenging to accurately estimate library trajectories. Due to the lack of repeated trials, MINT used AutoLFADS to estimate the neural state during training. In principle this should work well. In practice AutoLFADS may have been limited by having only 10 minutes of training data. Because the random-target task involved more variable reaches, it may also have stressed the ability of all methods to generalize, perhaps for the reasons illustrated in Figure 1b.

      The only dataset where MINT did not perform the best overall was the MC_RTT dataset, where it was outperformed by the feedforward network and GRU. As noted above, this may relate to the need for MINT to learn neural trajectories from training data that lacked repeated trials of the same movement (a design choice one might wish to avoid). Alternatively, the less-structured MC_RTT dataset may strain the capacity to generalize; all methods experienced a drop in velocity-decoding R^2 for this dataset compared to the others. MINT generalizes somewhat differently than other methods, and may have been at a modest disadvantage for this dataset. A strong version of this possibility is that perhaps the perspective in Figure 1a is correct, in which case MINT might struggle because it cannot use forms of generalization that are available to other methods (e.g. generalization based on neuron-velocity correlations). This strong version seems unlikely; MINT continued to significantly outperform the Wiener and Kalman filters, which make assumptions aligned with Figure 1a.”

      Comment 3. Related to 2, it may also be that MINT achieves competitive performance in offline and trial-based stereotyped decoding by overfitting to the trial structure in a given task, and thus may not generalize well to online performance due to overfitting. For example, a recent work showed that offline decoding performance may be overfitted to the task structure and may not represent online performance (Deo et al. 2023). Please discuss. 

      We agree that a limitation of our study is that we do not test online performance. There are sensible reasons for this decision:

      “By necessity and desire, all comparisons were made offline, enabling benchmarked performance across a variety of tasks and decoded variables, where each decoder had access to the exact same data and recording conditions.”

      We recently reported excellent online performance in the cycling task with a different algorithm (Schroeder et al. 2022). In the course of that study, we consistently found that improvements in our offline decoding translated to improvements in our online decoding. We thus believe that MINT (which improves on the offline performance of our older algorithm) is a good candidate to work very well online. Yet we agree this still remains to be seen. We have added the following to the Discussion:

      “With that goal in mind, there exist three important practical considerations. First, some decode algorithms experience a performance drop when used online. One presumed reason is that, when decoding is imperfect, the participant alters their strategy which in turn alters the neural responses upon which decoding is based. Because MINT produces particularly accurate decoding, this effect may be minimized, but this cannot be known in advance. If a performance drop does indeed occur, one could adapt the known solution of retraining using data collected during online decoding [13]. Another presumed reason (for a gap between offline and online decoding) is that offline decoders can overfit the temporal structure in training data [107]. This concern is somewhat mitigated by MINT’s use of a short spike-count history, but MINT may nevertheless benefit from data augmentation strategies such as including time-dilated versions of learned trajectories in the libraries.”

      Comment 4. Related to 2, since MINT requires firing rates to generate the library and simple averaging does not work for this purpose in the MC_RTT dataset (that does not have repeated trials), the authors needed to use AutoLFADS to infer the underlying firing rates. The fact that MINT requires the usage of another model to be constructed first and that this model can be computationally complex, will also be a limiting factor and should be clarified. 

      This concern relates to the computational complexity of computing firing-rate trajectories during training. Usually, rates are estimated via trial-averaging, which makes MINT very fast to train. This was quite noticeable during the Neural Latents Benchmark competition. As one example, for the “MC_Scaling 5 ms Phase”, MINT took 28 seconds to train while GPFA took 30 minutes, the transformer baseline (NDT) took 3.5 hours, and the switching nonlinear dynamical system took 4.5 hours.

      However, the reviewer is quite correct that MINT’s efficiency depends on the method used to construct the library of trajectories. As we note, “MINT is a method for leveraging a trajectory library, not a method for constructing it”. One can use trial-averaging, which is very fast. One can also use fancier, slower methods to compute the trajectories. We don’t view this as a negative – it simply provides options. Usually one would choose trial-averaging, but one does not have to. In the case of MC_RTT, one has a choice between LFADS and grouping into pseudo-conditions and averaging (which is fast). LFADS produces higher performance at the cost of being slower. The operator can choose which they prefer. This is discussed in the following section:

      “For MINT, ‘training’ simply means computation of standard quantities (e.g. firing rates) rather than parameter optimization. MINT is thus typically very fast to train (Table 1), on the order of seconds using generic hardware (no GPUs). This speed reflects the simple operations involved in constructing the library of neural-state trajectories: filtering of spikes and averaging across trials. At the same time we stress that MINT is a method for leveraging a trajectory library, not a method for constructing it. One may sometimes wish to use alternatives to trial-averaging, either of necessity or because they improve trajectory estimates. For example, for the MC_RTT task we used AutoLFADS to infer the library. Training was consequently much slower (hours rather than seconds) because of the time taken to estimate rates. Training time could be reduced back to seconds using a different approach – grouping into pseudo-conditions and averaging – but performance was reduced. Thus, training will typically be very fast, but one may choose time-consuming methods when appropriate.”

      Comment 5. I also find the statement in the abstract and paper that "computations are simple, scalable" to be inaccurate. The authors state that MINT's computational cost is O(NC) only, but it seems this is achieved at a high memory cost as well as computational cost in training. The process is described in section "Lookup table of log-likelihoods" on line [978-990]. The idea is to precompute the log-likelihoods for any combination of all neurons with discretization x all delay/history segments x all conditions and to build a large lookup table for decoding. Basically, the computational cost of precomputing this table is O(V^{Nτ} x TC) and the table requires a memory of O(V^{Nτ}), where V is the number of discretization points for the neural firing rates, N is the number of neurons, τ is the history length, T is the trial length, and C is the number of conditions. This is a very large burden, especially the V^{Nτ} term. This cost is currently not mentioned in the manuscript and should be clarified in the main text. Accordingly, computation claims should be modified including in the abstract. 

      As discussed above, the manuscript has been revised to clarify that our statement was accurate.

      Comment 6. In addition to the above technical concerns, I also believe the authors should clarify the logic behind developing MINT better. From a scientific standpoint, we seek to gain insights into neural computations by making various assumptions and building models that parsimoniously describe the vast amount of neural data rather than simply tabulating the data. For instance, low-dimensional assumptions have led to the development of numerous dimensionality reduction algorithms and these models have led to important interpretations about the underlying dynamics (e.g., fixed points/limit cycles). While it is of course valid and even insightful to propose different assumptions from existing models as the authors do here, they do not actually translate these assumptions into a new model. Without a model and by just tabulating the data, I don't believe we can provide interpretation or advance the understanding of the fundamentals behind neural computations. As such, I am not clear as to how this library building approach can advance neuroscience or how these assumptions are useful. I think the authors should clarify and discuss this point. 

      As requested, a major goal of the revision has been to clarify the scientific motivations underlying MINT’s design. In addition to many textual changes, we have added figures (Figures 1a,b and 6a,b) to outline the two competing scientific perspectives that presently exist. This topic is also addressed by extensions of existing analyses and by new analyses (e.g. Figure 6c-g). 

      In our view these additions have dramatically improved the manuscript. This is especially true because we think R3’s concerns, expressed above, are reasonable. If the perspective in Figure 1a is correct, then R3 is right and MINT is essentially a hack that fails to model the data. MINT would still be effective in many circumstances (as we show), but it would be unprincipled. This would create limitations, just as the reviewer argues. On the other hand, if the perspective in Figure 1b is correct, then MINT is quite principled relative to traditional approaches. Traditional approaches make assumptions (a fixed subspace, consistent neuron-kinematic correlations) that are not correct under Figure 1b.

      We don’t expect R3 to agree with our scientific perspective at this time (though we hope to eventually convince them). To us, the key is that we agree with R3 that the manuscript needs to lay out the different perspectives and their implications, so that readers have a good sense of the possibilities they should be considering. The revised manuscript is greatly improved in this regard.

      Comment 7. Related to 6, there seems to be a logical inconsistency between the operations of MINT and one of its three assumptions, namely, sparsity. The authors state that neural states are sparsely distributed in some neural dimensions (Figure 1a, bottom). If this is the case, then why does MINT extend its decoding scope by interpolating known neural states (and behavior) in the training library? This interpolation suggests that the neural states are dense on the manifold rather than sparse, thus being contradictory to the assumption made. If interpolation-based dense meshes/manifolds underlie the data, then why not model the neural states through the subspace or manifold representations? I think the authors should address this logical inconsistency in MINT, especially since this sparsity assumption also questions the low-dimensional subspace/manifold assumption that is commonly made. 

      We agree this is an important issue, and have added an analysis on this topic (Figure 4d). The key question is simple and empirical: during decoding, does interpolation cause MINT to violate the assumption of sparsity? R3 is quite right that in principle it could. If spiking observations argue for it, MINT’s interpolation could create a dense manifold during decoding rather than a sparse one. The short answer is that empirically this does not happen, in agreement with expectations under Figure 1b. Rather than interpolating between distant states and filling in large ‘voids’, interpolation is consistently local. This is a feature of the data, not of the decoder (MINT doesn’t insist upon sparsity, even though it is designed to work best in situations where the manifold is sparse).

      In addition to adding Figure 4d, we added the following (in an earlier section):

      “The term mesh is apt because, if MINT’s assumptions are correct, interpolation will almost always be local. If so, the set of decodable states will resemble a mesh, created by line segments connecting nearby training-set trajectories. However, this mesh-like structure is not enforced by MINT’s operations. Interpolation could, in principle, create state-distributions that depart from the assumption of a sparse manifold. For example, interpolation could fill in the center of the green tube in Figure 1b, resulting in a solid manifold rather than a mesh around its outer surface. However, this would occur only if spiking observations argued for it. As will be documented below, we find that essentially all interpolation is local.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I appreciate the detailed methods section, however, more specifics should be integrated into the main text. For example on Line 238, it should additionally be stated how many minutes were used for training and metrics like the MAE which is used later should be reported here.

      Thank you for this suggestion. We now report the duration of training data in the main text:

      “Decoding R^2 was .968 over ~7.1 minutes of test trials based on ~4.4 minutes of training data.”

      We have also added similar specifics throughout the manuscript, e.g. in the Fig. 5 legend:

      “Results are based on the following numbers of training / test trials: MC_Cycle (174 train, 99 test), MC_Maze (1721 train, 574 test), Area2_Bump (272 train, 92 test), MC_RTT (810 train, 268 test).”

      Similar additions were made to the legends for Fig. 6 and 8. Regarding the request to add MAE for the multitask network, we did not do so for the simple reason that the decoded variable (muscle activity) has arbitrary units. The raw MAE is thus not meaningful. We could of course have normalized, but at this point the MAE is largely redundant with the correlation. In contrast, the MAE is useful when comparing across the MC_Maze, Area2_Bump, and MC_RTT datasets, because they all involve the same scale (cm/s).

      Regarding the MC_RTT task, AutoLFADS was used to obtain robust spike rates, as reported in the methods. However, the rationale for splitting the neural trajectories after AutoLFADS is unclear. If the trajectories were split based on random recording gaps, this might lead to suboptimal performance? It might be advantageous to split them based on a common behavioural state? 

      When learning neural trajectories via AutoLFADS, spiking data is broken into short (but overlapping) segments, rates are estimated for each segment via AutoLFADS, and these rates are then stitched together across segments into long neural trajectories. If there had been no recording gaps, these rates could have been stitched into a single neural trajectory for this dataset. However, the presence of recording gaps left us no choice but to stitch together these rates into more than one trajectory. Fortunately, recording gaps were rare: for the decoding analysis of MC_RTT there were only two recording gaps and therefore three neural trajectories, each ~2.7 minutes in duration.

      We agree that in general it is desirable to learn neural trajectories that begin and end at behaviorally relevant moments (e.g. in between movements). However, having these trajectories potentially end mid-movement is not an issue in and of itself. During decoding, MINT is never stuck on a trajectory. Thus, if MINT were decoding states near the end of a trajectory that was cut short due to a training gap, it would simply begin decoding states from other trajectories or elsewhere along the same trajectory in subsequent moments. We could have further trimmed the three neural trajectories to begin and end at behaviorally relevant moments, but chose not to as this would have only removed a handful of potentially useful states from the library.

      We now describe this in the Methods:

      “Although one might prefer trajectory boundaries to begin and end at behaviorally relevant moments (e.g. a stationary state), rather than at recording gaps, the exact boundary points are unlikely to be consequential for trajectories of this length that span multiple movements. If MINT estimates a state near the end of a long trajectory, its estimate will simply jump to another likely state on a different trajectory (or earlier along the same trajectory) in subsequent moments. Clipping the end of each trajectory to an earlier behaviorally-relevant moment would only remove potentially useful states from the libraries.”

      Are the training and execution times in Table 1 based on pure Matlab functions or Mex files? If it's Mex files as suggested by the code, it would be good to mention this in the Table caption.

      They are based on a combination of MATLAB and MEX files. This is now clarified in the table caption:

      “Timing measurements taken on a Macbook Pro (on CPU) with 32GB RAM and a 2.3 GHz 8-Core Intel Core i9 processor. Training and execution code used for measurements was written in MATLAB (with the core recursion implemented as a MEX file).”

      As the method most closely resembles a Bayesian decoder it would be good to compare performance against a Naive Bayes decoder. 

      We agree and have now done so. The following has been added to the text:

      “A natural question is thus whether a simpler Bayesian decoder would have yielded similar results. We explored this possibility by testing a Naïve Bayes regression decoder [85] using the MC_Maze dataset. This decoder performed poorly, especially when decoding velocity (R^2 = .688 and .093 for hand position and velocity, respectively), indicating that the specific modeling assumptions that differentiate MINT from a naive Bayesian decoder are important drivers of MINT’s performance.”

      Line 199 Typo: The assumption of stereotypy trajectory also enables neural states (and decoded behaviors) to be updated in between time bins. 

      Fixed

      Table 3: It's unclear why the Gaussian binning varies significantly across different datasets. Could the authors explain why this is the case and what its implications might be? 

      We have added the following description in the “Filtering, extracting, and warping data on each trial” subsection of the Methods to discuss how 𝜎 may vary due to the number of trials available for training and how noisy the neural data for those trials is:

      “First, spiking activity for each neuron on each trial was temporally filtered with a Gaussian to yield single-trial rates. Table 3 reports the Gaussian standard deviations σ (in milliseconds) used for each dataset. Larger values of σ utilize broader windows of spiking activity when estimating rates and therefore reduce variability in those rate estimates. However, large σ values also yield neural trajectories with less fine-grained temporal structure. Thus, the optimal σ for a dataset depends on how variable the rate estimates otherwise are.”
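
      The trade-off described in this passage can be illustrated with a minimal sketch of Gaussian filtering. This is not the authors' MATLAB code: the function names, the 1 ms bin width, the four-sigma kernel truncation, and the zero padding at trial edges are all assumptions made for illustration, and only the Python standard library is used.

```python
import math

def gaussian_kernel(sigma_ms, bin_ms=1.0, n_sigmas=4):
    """Return a unit-area Gaussian kernel sampled at the bin width (assumed truncation: 4 sigma)."""
    sigma_bins = sigma_ms / bin_ms
    half = int(math.ceil(n_sigmas * sigma_bins))
    weights = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_spike_counts(counts, sigma_ms, bin_ms=1.0):
    """Filter one trial's binned spike counts with a Gaussian; returns rates in spikes/s.

    Bins beyond the trial edges are treated as empty (zero padding), an assumption
    that biases rates downward near the edges.
    """
    kernel = gaussian_kernel(sigma_ms, bin_ms)
    half = len(kernel) // 2
    n = len(counts)
    rates = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + j - half
            if 0 <= idx < n:
                acc += w * counts[idx]
        rates.append(acc * 1000.0 / bin_ms)  # counts per bin -> spikes per second
    return rates
```

      With a single spike in the middle of a trial, a larger σ spreads that spike over a wider window and lowers the peak rate, while the integrated spike count is unchanged. Averaging such single-trial rates across trials of the same condition would then give a trial-averaged rate estimate.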

      An implementation of the method in an open-source programming language could further enhance the widespread use of the tool. 

      We agree this would be useful, but have not yet implemented the method in any other programming language. Implementation in Python is still a future goal.

      Reviewer #2 (Recommendations For The Authors): 

      - Figures 4 and 5 should show the error bars on the horizontal axis rather than portraying them vertically. 

      [Note that these are now Figures 5 and 6]

      The figure legend of Figure 5 now clarifies that the vertical ticks are simply to aid visibility when symbols have very similar means and thus overlap visually. We don’t include error bars (for this analysis) because they are very small and would mostly be smaller than the symbol sizes. Instead, to indicate certainty regarding MINT’s performance measurements, the revised text now gives error ranges for the correlations and MAE values in the context of Figure 4c. These error ranges were computed as the standard deviation of the sampling distribution (computed via resampling of trials) and are thus equivalent to SEMs. The error ranges are all very small; e.g. for the MC_Maze dataset the MAE for x-velocity is 4.5 +/- 0.1 cm/s (error bars on the correlations are smaller still).

      Thus, for a given dataset, we can be quite certain of how well MINT performs (within ~2% in the above case). This is reassuring, but we also don’t want to overemphasize this accuracy. The main sources of variability one should be concerned about are: 1) different methods can perform differentially well for different brain areas and tasks, 2) methods can decode some behavioral variables better than others, and 3) performance depends on factors like neuron-count and the number of training trials, in ways that can differ across decode methods. For this reason, the study examines multiple datasets, across tasks and brain areas, and measures performance for a range of decoded variables. We also examine the impact of training-set-size (Figure 8a) and population size (solid traces in Fig. 8b, see R2’s next comment below). 

      There is one other source of variance one might be concerned about, but it is specific to the neural-network approaches: different weight initializations might result in different performance. For this reason, each neural-network approach was trained ten times, with the average performance computed. The variability around this average was very small, and this is now stated in the Methods.

      “For the neural networks, the training/testing procedure was repeated 10 times with different random seeds. For most behavioral variables, there was very little variability in performance across repetitions. However, there were a few outliers for which variability was larger. Reported performance for each behavioral group is the average performance across the 10 repetitions to ensure results were not sensitive to any specific random initialization of each network.”

      - For Figure 6, it is unclear whether the neuron-dropping process was repeated multiple times. If not, it should be since the results will be sensitive to which particular subsets of neurons were "dropped". In this case, the results presented in Figure 6 should include error bars to describe the variability in the model performance for each decoder considered. 

      A good point. The results in Figure 8 (previously Figure 6) were computed by averaging over the removal of different random subsets of neurons (50 subsets per neuron count), just as the reviewer requests. The figure has been modified to include the standard deviation of performance across these 50 subsets. The legend clarifies how this was done.
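
      The resampling procedure described here (random subsets per neuron count, with the mean and standard deviation reported across subsets) can be sketched as follows. The function name and the `evaluate` callback are hypothetical stand-ins; in practice `evaluate` would train and test a decoder on the kept neurons and return a performance score.

```python
import random
import statistics

def neuron_dropping_curve(n_neurons, keep_counts, evaluate, n_subsets=50, seed=0):
    """Score random neuron subsets for each population size and summarize the scores.

    evaluate: hypothetical callable taking a sorted list of kept neuron indices
              and returning a performance score (e.g. R^2).
    Returns {neuron_count: (mean score, standard deviation across subsets)}.
    """
    rng = random.Random(seed)
    summary = {}
    for k in keep_counts:
        scores = [
            evaluate(sorted(rng.sample(range(n_neurons), k)))
            for _ in range(n_subsets)
        ]
        summary[k] = (statistics.mean(scores), statistics.stdev(scores))
    return summary
```

      The default of 50 subsets per neuron count follows the description above; the seed is fixed only so that the sketch is reproducible.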

      Reviewer #3 (Recommendations For The Authors): 

      Other comments: 

      (1) [Line 185-188] The authors argue that in a 100-dimensional space with 10 possible discretized values, 10^100 potential neural states need to be computed. But I am not clear on this. This argument seems to hold only in the absence of a model (as in MINT). For a model, e.g., Kalman filter or AutoLFADS, information is encoded in the latent state. For example, a simple Kalman filter for a linear model can be used for efficient inference. This 10^100 computation isn't a general problem but seems MINT-specific, please clarify. 

      We agree this section was potentially confusing. It has been rewritten. We were simply attempting to illustrate why maximum likelihood computations are challenging without constraints. MINT simplifies this problem by adding constraints, which is why it can readily provide data likelihoods (and can do so using a Poisson model). The rewritten section is below:

      “Even with 1000 samples for each of the neural trajectories in Figure 3, there are only 4000 possible neural states for which log-likelihoods must be computed (in practice it is fewer still, see Methods). This is far fewer than if one were to naively consider all possible neural states in a typical rate- or factor-based subspace. It thus becomes tractable to compute log-likelihoods using a Poisson observation model. A Poisson observation model is usually considered desirable, yet can pose tractability challenges for methods that utilize a continuous model of neural states. For example, when using a Kalman filter, one is often restricted to assuming a Gaussian observation model to maintain computational tractability.”
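
      To make the tractability argument concrete, the following shows a generic Poisson log-likelihood for one candidate state. It assumes conditionally independent neurons and a single counting window, and it is not the manuscript's exact expression, which operates on a short spike-count history. The symbols are introduced here for illustration: s_n is the spike count of neuron n, λ_n(x) is that neuron's rate at candidate state x, Δ is the window length, and N is the number of neurons.

```latex
\log p(\mathbf{s} \mid \mathbf{x})
  = \sum_{n=1}^{N} \Big[ s_n \log\!\big(\lambda_n(\mathbf{x})\,\Delta\big)
  - \lambda_n(\mathbf{x})\,\Delta - \log(s_n!) \Big]
```

      Under these assumptions, evaluating this sum for each of a few thousand candidate states costs on the order of (number of candidates) × N operations per time step, rather than requiring inference over a continuous state space.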

      (2) [Figure 6b] Why do the authors set the dropped neurons to zero in the "zeroed" results of the robustness analysis? Why not disregard the dropped neurons during the decoding process? 

      We agree the terminology we had used in this section was confusing. We have altered the figure and rewritten the text. The following, now at the beginning of that section, addresses the reviewer’s query: 

      “It is desirable for a decoder to be robust to the unexpected loss of the ability to detect spikes from some neurons. Such loss might occur while decoding, without being immediately detected. Additionally, one desires robustness to a known loss of neurons / recording channels. For example, there may have been channels that were active one morning but are no longer active that afternoon. At least in principle, MINT makes it very easy to handle this second situation: there is no need to retrain the decoder, one simply ignores the lost neurons when computing likelihoods. This is in contrast to nearly all other methods, which require retraining because the loss of one neuron alters the optimal parameters associated with every other neuron.”

      The figure has been relabeled accordingly; instead of the label ‘zeroed’, we use the label ‘undetected neuron loss’.
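
      The distinction between undetected and known neuron loss can be illustrated with a toy Poisson decoder over a two-state library. This is a sketch under stated assumptions rather than MINT itself: the function names, the `active` mask, and the example rates are hypothetical, and the likelihood uses a single counting window with independent neurons.

```python
import math

def log_likelihood(counts, rates, dt, active=None):
    """Poisson log-likelihood summed over neurons; neurons with active[i] False are skipped."""
    total = 0.0
    for i, (n, r) in enumerate(zip(counts, rates)):
        if active is not None and not active[i]:
            continue  # known loss: this neuron contributes no evidence
        lam = max(r * dt, 1e-9)  # floor avoids log(0) for silent candidate rates
        total += n * math.log(lam) - lam - math.lgamma(n + 1)
    return total

def decode(counts, library, dt, active=None):
    """Index of the library state (a list of per-neuron rates in spikes/s) with the highest log-likelihood."""
    return max(range(len(library)),
               key=lambda s: log_likelihood(counts, library[s], dt, active))

# Toy library: state 0 drives neuron 0 strongly, state 1 does not.
library = [[80.0, 30.0, 5.0], [5.0, 10.0, 20.0]]
# Counts generated under state 0, except neuron 0 has been lost and reports zero spikes.
counts = [0, 3, 0]
undetected = decode(counts, library, dt=0.1)                           # zero count treated as evidence
known = decode(counts, library, dt=0.1, active=[False, True, True])   # lost neuron ignored
```

      In this toy case the undetected loss pulls the estimate toward state 1, because the spurious zero count argues against state 0, whereas ignoring the lost neuron recovers state 0 without retraining anything.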

      (3) Authors should provide statistical significance on their results, which they already did for Fig. S3a,b,c but missing on some other figures/places. 

      We have added error bars in some key places, including in the text when quantifying MINT’s performance in the context of Figure 4. Importantly, error bars are only as meaningful as the source of error they assess, and there are reasons to be careful given this. The standard method for putting error bars on performance is to resample trials, which is indeed what we now report. These error bars are very small. For example, when decoding horizontal velocity for the MC_Maze dataset, the correlation between MINT’s decode and the true velocity had a mean and SD of the sampling distribution of 0.963 +/- 0.001. This means that, for a given dataset and target variable, we have enough trials/data that we can be quite certain of how well MINT performs. However, we want to be careful not to overstate this certainty. What one really wants to know is how well MINT performs across a variety of datasets, brain areas, target variables, neuron counts, etc. It is for this reason that we make multiple such comparisons, which provides a more valuable view of performance variability.
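
      The trial-resampling error estimate described above can be sketched as a bootstrap over trials. The details here are assumptions for illustration (function names, concatenating resampled trials before computing the correlation, and 1000 resamples); the manuscript's exact procedure may differ.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_correlation(true_trials, decoded_trials, n_resamples=1000, seed=0):
    """Resample trials with replacement; return (mean, SD) of the correlation across resamples.

    Each trial is a list of samples. Resampled trials are concatenated before the
    correlation is computed, and the SD of the resulting sampling distribution
    plays the role of a standard error.
    """
    rng = random.Random(seed)
    n = len(true_trials)
    stats = []
    for _ in range(n_resamples):
        picks = [rng.randrange(n) for _ in range(n)]
        x = [v for i in picks for v in true_trials[i]]
        y = [v for i in picks for v in decoded_trials[i]]
        stats.append(pearson(x, y))
    return statistics.fmean(stats), statistics.stdev(stats)
```

      Because whole trials are resampled, the resulting spread reflects only trial-to-trial variability within one dataset, which is why it can be very small even when performance differs substantially across datasets or decoded variables.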

      For Figure 7, error bars are unavailable. Because this was a benchmark, there was exactly one test-set that was never seen before. This is thus not something that could be resampled many times (that would have revealed the test data and thus invalidated the benchmark, not to mention that some of these methods take days to train). We could, in principle, have added resampling to Figure 5. In our view it would not be helpful and could be misleading for the reasons noted above. If we computed standard errors using different train/test partitions, they would be very tight (mostly smaller than the symbol sizes), which would give the impression that one can be quite certain of a given R^2 value. Yet variability in the train/test partition is not the variability one is concerned about in practice. In practice, one is concerned about whether one would get a similar R^2 for a different dataset, or brain area, or task, or choice of decoded variable. Our analysis thus concentrated on showing results across a broad range of situations. In our view this is a far more relevant way of illustrating the degree of meaningful variability (which is quite large) than resampling, which produces reassuringly small but (mostly) irrelevant standard errors.

      Error bars are supplied in Figure 8b. These error bars give a sense of variability across re-samplings of the neural population. While this is not typically the source of variability one is most concerned about, for this analysis it becomes appropriate to show resampling-based standard errors because a natural concern is that results may depend on which neurons were dropped. So here it is both straightforward, and desirable, to compute standard errors. (The fact that MINT and the Wiener filter can be retrained many times swiftly was also key – this isn’t true of the more expressive methods). Figure S1 also uses resampling-based confidence intervals for similar reasons.

      (4) [Line 431-437] Authors state that MINT outperforms other methods with the PSTH R^2 metric (trial-averaged smoothed spikes for each condition). However, I think this measure may not provide a fair comparison and is confounded because MINT's library is built using PSTH (i.e., averaged firing rate) but other methods do not use the PSTH. The author should clarify this. 

      The PSTH R^2 metric was not created by us; it was part of the Neural Latents Benchmark. They chose it because it ensures that a method cannot ‘cheat’ (on the Bits/Spike measure) by reproducing fine features of spiking while estimating rates badly. We agree with the reviewer’s point: MINT’s design does give it a potential advantage in this particular performance metric. This isn’t a confound though, just a feature. Importantly, MINT will score well on this metric only if MINT’s neural state estimate is accurate (including accuracy in time). Without accurate estimation of the neural state at each time, it wouldn’t matter that the library trajectory is based on PSTHs. This is now explicitly stated:

      “This is in some ways unsurprising: MINT estimates neural states that tend to resemble (at least locally) trajectories ‘built’ from training-set-derived rates, which presumably resemble test-set rates. Yet strong performance is not a trivial consequence of MINT’s design. MINT does not ‘select’ whole library trajectories; PSTH R2 will be high only if condition (c), index (k), and the interpolation parameter (α) are accurately estimated for most moments.”
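      As a rough illustration of what an R^2 score between estimated rates and a PSTH measures, the following minimal sketch computes a plain coefficient of determination for a single neuron and condition. The function name and the numbers are hypothetical, and the benchmark's exact definition (for example, how it aggregates over neurons and conditions) may differ.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual variance / total variance."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical PSTH (spikes/s) and a hypothetical rate estimate for one neuron.
psth = [2.0, 4.0, 6.0, 8.0]
estimated_rates = [2.5, 3.5, 6.5, 7.5]

score = r_squared(psth, estimated_rates)
```

      A score near 1 requires the estimate to track the PSTH at each time point, so an estimate that is shifted in time would be penalized even if its overall shape is right.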

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      In the presented manuscript, the authors investigate how neural networks can learn to replay presented sequences of activity. Their focus lies on stochastic replay according to learned transition probabilities. They show that, based on error-based excitatory and balance-based inhibitory plasticity, networks can self-organize towards this goal. Finally, they demonstrate that these learning rules can recover experimental observations from songbird song-learning experiments. 

      Overall, the study appears well-executed and coherent, and the presentation is very clear and helpful. However, it remains somewhat vague regarding the novelty. The authors could elaborate on the experimental and theoretical impact of the study, and also discuss how their results relate to those of Kappel et al. and others (e.g., Kappel et al., doi.org/10.1371/journal.pcbi.1003511). 

      We agree with the reviewer that our previous manuscript lacked comparison with previously published similar works. While Kappel et al. demonstrated that STDP in winner-take-all circuits can approximate online learning of hidden Markov models (HMMs), a key distinction from our model is that their neural representations acquire deterministic sequential activations, rather than exhibiting stochastic transitions governing Markovian dynamics. Specifically, in their model, the neural representation of state B would be different in the sequences ABC and CBA, resulting in distinct deterministic representations like ABC and C'B'A', where ‘A’ and ‘A'’ are represented by different neural states (e.g., activations of different cell assemblies). In contrast, our network learns to generate stochastically transitioning cell assemblies that replay Markovian trajectories of spontaneous activity obeying the learned transition probabilities between neural representations of states. For example, starting from a reactivation of assembly ‘A’, there may be an 80% probability of transitioning to assembly ‘B’ and 20% to ‘C’. Although Kappel et al.'s model successfully solves HMMs, their neural representations do not themselves stochastically transition between states according to the learned model. Similarly, while the models proposed in Barber (2002) and Barber and Agakov (2002) learn Markovian statistics, they learn only static spatiotemporal input patterns, and how assemblies of neurons show stochastic transitions in spontaneous activity has remained unclear. In contrast with these models, our model captures probabilistic neural state trajectories, allowing spontaneous replay of experienced sequences with stochastic dynamics matching the learned environmental statistics.

      We have included new sentences to explain these points in ll. 509-533 in the revised manuscript.
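      To make concrete what "replay obeying the learned transition probabilities" means, here is a minimal sketch of a first-order Markov chain over abstract states, together with an estimate of its transition probabilities from a sampled sequence. It is not the spiking network model. Only the A-to-B (80%) and A-to-C (20%) values come from the response above; the other rows, the function names, and the sequence length are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical transition probabilities between three abstract states.
# Only the row for "A" is taken from the text; the rest is made up.
TRANSITIONS = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"A": 0.5, "C": 0.5},
    "C": {"A": 1.0},
}

def sample_sequence(transitions, start, n_steps, rng):
    """Sample a sequence in which each step depends only on the current state."""
    state, sequence = start, [start]
    for _ in range(n_steps):
        targets = list(transitions[state])
        weights = [transitions[state][t] for t in targets]
        state = rng.choices(targets, weights=weights, k=1)[0]
        sequence.append(state)
    return sequence

def empirical_transitions(sequence):
    """Estimate P(next | current) from consecutive pairs in a sequence."""
    pair_counts = Counter(zip(sequence[:-1], sequence[1:]))
    source_counts = Counter(sequence[:-1])
    return {
        (src, dst): count / source_counts[src]
        for (src, dst), count in pair_counts.items()
    }

rng = random.Random(0)  # fixed seed so the sketch is reproducible
sequence = sample_sequence(TRANSITIONS, "A", 20000, rng)
estimated = empirical_transitions(sequence)
```

      In this picture, the same state ‘A’ is followed by ‘B’ on some visits and by ‘C’ on others, and the estimated probabilities approach the generating ones as the sequence grows. Comparing replayed and true transition probabilities in this way is the kind of check the response describes.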

      Overall, the work could benefit if there was either (A) a formal analysis or derivation of the plasticity rules involved and a formal justification of the usefulness of the resulting (learned) neural dynamics; 

      We have included a derivation of our plasticity rules in ll. 630-670 in the revised manuscript. Consistent with our claim that excitatory plasticity updates the excitatory synapse to predict output firing rates, we have shown that the corresponding cost function measures the discrepancy between the recurrent prediction and the output firing rate. Similarly, for inhibitory plasticity, we defined the cost function that evaluates the difference between the excitatory and inhibitory potential within each neuron. We showed that the resulting inhibitory plasticity rule updates the inhibitory synapses to maintain the excitation-inhibition balance.

      and/or (B) a clear connection of the employed plasticity rules to biological plasticity and clear testable experimental predictions. Thus, overall, this is a good work with some room for improvement. 

      Our proposed plasticity mechanism could be implemented through somatodendritic interactions. Analogous to previous computational works (Urbanczik and Senn, 2014; Asabuki and Fukai, 2020; Asabuki et al., 2022), our model suggests that somatic responses may encode the stimulus-evoked neural activity states, while dendrites encode predictions based on recurrent dynamics that aim to minimize the discrepancy between somatic and dendritic activity. To directly test this hypothesis, future experimental studies could simultaneously record from both somatic and dendritic compartments to investigate how they encode evoked responses and predictive signals during learning (Francioni et al., 2022).

      We have included new sentences to explain these points in ll. 476-484 in the revised manuscript.

      Reviewer #2 (Public Review): 

      Summary: 

      This work proposes a synaptic plasticity rule that explains the generation of learned stochastic dynamics during spontaneous activity. The proposed plasticity rule assumes that excitatory synapses seek to minimize the difference between the internal predicted activity and stimulus-evoked activity, and inhibitory synapses try to maintain the E-I balance by matching the excitatory activity. By implementing this plasticity rule in a spiking recurrent neural network, the authors show that the state-transition statistics of spontaneous excitatory activity agree with that of the learned stimulus patterns, which are reflected in the learned excitatory synaptic weights. The authors further demonstrate that inhibitory connections contribute to well-defined state transitions matching the transition patterns evoked by the stimulus. Finally, they show that this mechanism can be expanded to more complex state-transition structures including songbird neural data. 

      Strengths: 

      This study makes an important contribution to computational neuroscience, by proposing a possible synaptic plasticity mechanism underlying spontaneous generations of learned stochastic state-switching dynamics that are experimentally observed in the visual cortex and hippocampus. This work is also very clearly presented and well-written, and the authors conducted comprehensive simulations testing multiple hypotheses. Overall, I believe this is a well-conducted study providing interesting and novel aspects of the capacity of recurrent spiking neural networks with local synaptic plasticity. 

      Weaknesses: 

      This study is very well-thought-out and theoretically valuable to the neuroscience community, and I think the main weaknesses are in regard to how much biological realism is taken into account. For example, the proposed model assumes that only synapses targeting excitatory neurons are plastic, and uses an equal number of excitatory and inhibitory neurons. 

      We agree with the reviewer. The network shown in the previous manuscript consists of an equal number of excitatory and inhibitory neurons, which seems to lack biological plausibility. Therefore, we first tested whether a biologically plausible scenario would affect learning performance by setting the ratio of excitatory to inhibitory neurons to 80% and 20% (Supplementary Figure 7a; left). Even in such a scenario, the network still showed structured spontaneous activity (Supplementary Figure 7a; center), with transition statistics of replayed events matching the true transition probabilities (Supplementary Figure 7a; right). We then asked whether the model with our plasticity rule applied to all synapses would reproduce the corresponding stochastic transitions. We found that the network can learn transition statistics but only under certain conditions. The network showed only weak replay and failed to reproduce the appropriate transition (Supplementary Fig. 7b) if the inhibitory neurons were no longer driven by the synaptic currents reflecting the stimulus, due to a tight balance of excitatory and inhibitory currents on the inhibitory neurons. We then tested whether the network with all synapses plastic can learn transition statistics if the external inputs project to the inhibitory neurons as well. We found that, when each stimulus pattern activates a non-overlapping subset of neurons, the network does not exhibit the correct stochastic transition of assembly reactivation (Supplementary Fig. 7c). Interestingly, when each neuron's activity is triggered by multiple stimuli and has mixed selectivity, the reactivation reproduced the appropriate stochastic transitions (Supplementary Fig. 7d).

      We have included these new results as new Supplementary Figure 7 and they are explained in ll.215-230 in the revised manuscript.

      The model also assumes Markovian state dynamics while biological systems can depend more on history. This limitation, however, is acknowledged in the Discussion. 

      We have included the following sentence to provide a possible solution to this limitation: “Therefore, to learn higher-order stochastic transitions, recurrent neural networks like ours may need to integrate higher-order inputs with longer time scales.” in ll.557-559 in the revised manuscript. 

      Finally, to simulate spontaneous activity, the authors use a constant input of 0.3 throughout the study. Different amplitudes of constant input may correspond to different internal states, so it will be more convincing if the authors test the model with varying amplitudes of constant inputs. 

      We thank the reviewer for pointing this out. In the revised manuscript, we have tested constant input with three different strengths. With moderate strength, the network accurately encoded the transition statistics in its spontaneous activity, as seen in Fig. 2. We have additionally shown that weaker background input causes spontaneous activity with a lower replay rate, which in turn leads to high variance in the encoded transitions, while stronger input makes assembly replay transitions more uniform. We have included these new results as new Supplementary Figure 6 and they are explained in ll. 211-214 in the revised manuscript.

      Reviewer #3 (Public Review): 

      Summary: 

      Asabuki and Clopath study stochastic sequence learning in recurrent networks of Poisson spiking neurons that obey Dale's law. Inspired by previous modeling studies, they introduce two distinct learning rules, to adapt excitatory-to-excitatory and inhibitory-to-excitatory synaptic connections. Through a series of computer experiments, the authors demonstrate that their networks can learn to generate stochastic sequential patterns, where states correspond to non-overlapping sets of neurons (cell assemblies) and the state-transition conditional probabilities are first-order Markov, i.e., the transition to a given next state only depends on the current state. Finally, the authors use their model to reproduce certain experimental songbird data involving highly-predictable and highly-uncertain transitions between song syllables. 

      Strengths: 

      This is an easy-to-follow, well-written paper, whose results are likely easy to reproduce. The experiments are clear and well-explained. The study of songbird experimental data is a good feature of this paper; finches are classical model animals for understanding sequence learning in the brain. I also liked the study of rapid task-switching, it's a good-to-know type of result that is not very common in sequence learning papers. 

      Weaknesses: 

      While the general subject of this paper is very interesting, I missed a clear main result. The paper focuses on a simple family of sequence learning problems that are well-understood, namely first-order Markov sequences and fully visible (no-hidden-neuron) networks, studied extensively in prior work, including with spiking neurons. Thus, because the main results can be roughly summarized as examples of success, it is not entirely clear what the main point of the authors is. 

      We apologize to the reviewer that our main claim was not clear. While various computational studies have suggested possible plasticity mechanisms for embedding evoked activity patterns or their probability structures into spontaneous activity (Litwin-Kumar et al., Nat. Commun. 2014; Asabuki and Fukai, bioRxiv 2023), how transition statistics of the environment are learned in spontaneous activity is still poorly understood. Furthermore, while several network models have been proposed to learn Markovian dynamics via synaptic plasticity (Brea et al. (2013); Pfister et al. (2004); Kappel et al. (2014)), they are limited in the sense that the learned network does not show stochastic transitions in a neural state space. For instance, while Kappel et al. demonstrated that STDP in winner-take-all circuits can approximate online learning of hidden Markov models (HMMs), a key distinction from our model is that their neural representations acquire deterministic sequential activations, rather than exhibiting stochastic transitions governing Markovian dynamics. Specifically, in their model, the neural representation of state B would be different in the sequences ABC and CBA, resulting in distinct deterministic representations like ABC and C'B'A', where ‘A’ and ‘A'’ are represented by different neural states (e.g., activations of different cell assemblies). In contrast, our network learns to generate stochastically transitioning cell assemblies that replay Markovian trajectories of spontaneous activity obeying the learned transition probabilities between neural representations of states. For example, starting from a reactivation of assembly ‘A’, there may be an 80% probability of transitioning to assembly ‘B’ and 20% to ‘C’. Although Kappel et al.'s model successfully solves HMMs, their neural representations do not themselves stochastically transition between states according to the learned model. Similarly, while the models proposed in Barber (2002) and Barber and Agakov (2002) learn Markovian statistics, they learn only static spatiotemporal input patterns, and how assemblies of neurons show stochastic transitions in spontaneous activity has remained unclear. In contrast with these models, our model captures probabilistic neural state trajectories, allowing spontaneous replay of experienced sequences with stochastic dynamics matching the learned environmental statistics.

      We have explained this point in ll.509-533 in the revised manuscript.

      Going into more detail, the first major weakness I see in this paper is the heuristic choice of learning rules. The paper studies Poisson spiking neurons (I return to this point below), for which learning rules can be derived from a statistical objective, typically maximum likelihood. For fully-visible networks, these rules take a simple form, similar in many ways to the E-to-E rule introduced by the authors. This more principled route provides quite a lot of additional understanding on what is to be expected from the learning process. 

      We thank the reviewer for pointing this out. To better demonstrate the function of our plasticity rules, we have included the derivation of the rules of synaptic plasticity in ll. 630-670 in the revised manuscript. Consistent with our claim that excitatory plasticity updates the excitatory synapse to predict output firing rates, we have shown that the corresponding cost function measures the discrepancy between the recurrent prediction and the output firing rate. Similarly, for inhibitory plasticity, we defined the cost function that evaluates the difference between the excitatory and inhibitory potential within each neuron. We showed that the resulting inhibitory plasticity rule updates the inhibitory synapses to maintain the excitation-inhibition balance.
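      The balance-based idea described above can be sketched with a scalar toy example: gradient descent on half the squared difference between an excitatory and an inhibitory potential drives the inhibitory weight until the two match. This is only an illustration under stated assumptions. The cost function, variable names, and numbers below are hypothetical, and the manuscript's actual derivation (ll. 630-670) is not reproduced here.

```python
def train_inhibitory_weight(excitatory_potential, inhibitory_rate,
                            weight=0.0, learning_rate=0.05, n_steps=200):
    """Gradient descent on 0.5 * (v_E - v_I)**2, with v_I = weight * inhibitory_rate."""
    for _ in range(n_steps):
        inhibitory_potential = weight * inhibitory_rate
        imbalance = excitatory_potential - inhibitory_potential
        # The negative gradient of the cost with respect to the weight is
        # imbalance * inhibitory_rate, so the weight grows while excitation
        # exceeds inhibition and shrinks in the opposite case.
        weight += learning_rate * imbalance * inhibitory_rate
    return weight

# Hypothetical values: excitatory potential of 2.0 and inhibitory rate of 4.0.
learned_weight = train_inhibitory_weight(excitatory_potential=2.0,
                                         inhibitory_rate=4.0)
balanced_inhibition = learned_weight * 4.0
```

      In this toy setting the update stops changing the weight once the inhibitory potential equals the excitatory one, which is the sense in which such a rule maintains excitation-inhibition balance.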

      For instance, should maximum likelihood learning succeed, it is not surprising that the statistics of the training sequence distribution are reproduced. Moreover, given that the networks are fully visible, I think that the maximum likelihood objective is a convex function of the weights, which then gives hope that the learning rule does succeed. And so on. This sort of learning rule has been studied in a series of papers by David Barber and colleagues [refs. 1, 2 below], who applied them to essentially the same problem of reproducing sequence statistics in recurrent fully-visible nets. It seems to me that one key difference is that the authors consider separate E and I populations, and find the need to introduce a balancing I-to-E learning rule. 

      The reviewer’s understanding that inhibitory plasticity to maintain EI balance is one of the critical differences from previous works is correct. However, we believe that the most striking point of our study is that we have shown numerically that predictive plasticity rules enable recurrent networks to learn and replay the assembly activations whose transition statistics match those of the evoked activity. Please see our reply above.

      Because the rules here are heuristic, a number of questions come to mind. Why these rules and not others - especially, as the authors do not discuss in detail how they could be implemented through biophysical mechanisms? When does learning succeed or fail? What is the main point being conveyed, and what is the contribution on top of the work of e.g. Barber, Brea, et al. (2013), or Pfister et al. (2004)? 

      Our proposed plasticity mechanism could be implemented through somatodendritic interactions. Analogous to previous computational works (Senn, Asabuki), our model suggests that somatic responses may encode the stimulus-evoked neural activity states, while dendrites encode predictions based on recurrent dynamics that aim to minimize the discrepancy between somatic and dendritic activity. To directly test this hypothesis, future experimental studies could simultaneously record from both somatic and dendritic compartments to investigate how they encode evoked responses and predictive signals during learning.

      To address the reviewer's point, we conducted additional simulations to test where the model fails. We found that the model with our plasticity rule applied to all synapses showed only faint replay and failed to replay the appropriate transitions (Supplementary Fig. 7b). This result is reasonable because the inhibitory neurons were no longer driven by the synaptic currents reflecting the stimulus, due to a tight balance of excitatory and inhibitory currents on the inhibitory neurons. Our model predicts that mixed selectivity in the inhibitory population is crucial for learning appropriate transition statistics (Supplementary Fig. 7d). Future work should clarify the role of synaptic plasticity on inhibitory neurons, especially plasticity at I-to-I synapses. We have explained this result in the new Supplementary Figure 7 in the revised manuscript.

      The use of a Poisson spiking neuron model is the second major weakness of the study. A chief challenge in much of the cited work is to generate stochastic transitions from recurrent networks of deterministic neurons. The task the authors set out to do is much easier with stochastic neurons; it is reasonable that the network succeeds in reproducing Markovian sequences, given an appropriate learning rule. I believe that the main point comes from mapping abstract Markov states to assemblies of neurons. If I am right, I missed more analyses on this point, for instance on the impact that varying cell assembly size would have on the findings reported by the authors.

      The reviewer’s understanding is correct. Our main point comes from mapping Markov statistics to replays of cell assemblies. In the revised manuscript, we performed additional simulations to ask whether varying the size of the cell assemblies would affect learning. We ran simulations with two different configurations in the task shown in Figure 2. The first configuration used three assemblies with a size ratio of 1:1.5:2. After training, these assemblies exhibited transition statistics that closely matched those of the evoked activity (Supplementary Fig.4a,b). In contrast, the second configuration, which used a size ratio of 1:2:3, showed worse performance compared to the 1:1.5:2 case (Supplementary Fig.4c,d). These results suggest that the model can learn appropriate transition statistics as long as the size ratio of the assemblies is not drastically varied.

      Finally, it was not entirely clear to me what the main fundamental point in the HVC data section was. Can the findings be roughly explained as follows: if we map syllables to cell assemblies, for high-uncertainty syllable-to-syllable transitions, it becomes harder to predict future neural activity? In other words, is the main point that the HVC encodes syllables by cell assemblies? 

      The reviewer's understanding is correct. We wanted to show that if the HVC learns transition statistics as a replay of cell assemblies, a high-uncertainty syllable-to-syllable transition would make predicting future reactivations more difficult, since trial-averaged activities (i.e., poststimulus activities; PSAs) marginalize over all possible transitions in the transition diagram.

      (1) Learning in Spiking Neural Assemblies, David Barber, 2002. URL: https://proceedings.neurips.cc/paper/2002/file/619205da514e83f869515c782a328d3c-Paper.pdf  

      (2) Correlated sequence learning in a network of spiking neurons using maximum likelihood, David Barber, Felix Agakov, 2002. URL: http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/barber-agakovTR0149.pdf  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      In more detail: 

      A) Theoretical analysis 

      The plasticity rules in the study are introduced with a vague reference to previous theoretical studies of others. Doing this, one does not provide any formal insight as to why these plasticity rules should enable one to learn to solve the intended task, and whether they are optimal in some respect. This becomes noticeable, especially in the discussion of the importance of inhibitory balance, which does not go into any detail, but rather only states that it is required, both in the results and discussion sections. Another point that is unclear appears when error-based learning is discussed and compared to Hebbian plasticity, which, as you state, "alone is insufficient to learn transition probabilities". It is not evident how this claim is warranted, nor why error-based plasticity in comparison should be able to perform this (other than referring to the simulation results). Please either clarify formally (or at least intuitively) how plasticity rules result in the mentioned behavior, or alternatively acknowledge explicitly the (current) lack of intuition. 

      The lack of formal discussion is a relevant shortcoming compared to previous research that showed very similar results with formally more rigorous and principled approaches. In particular, Kappel et al derived explicitly how neural networks can learn to sample from HMMs using STDP and winner-take-all dynamics. Even though this study has limitations, the relation with respect to that work should be made very clear; potentially the claims of novelty of some results (sampling) should be adjusted accordingly. See also Yanping Huang, Rajesh PN Rao (NIPS 2014), and possibly other publications. While it might be difficult to formally justify the learning rules post-hoc, it would be very helpful to the field if you very clearly related your work to that of others, where learning rules have been formally justified, and elaborate on the intuition of how the employed rules operate and interact (especially for inhibition). 

      Lastly, while the importance of sampling learned transition probabilities is discussed, the discussion again remains on a vague level, characterized by the lack of references in the relevant paragraphs. Ideally, there should be a proof of concept or a formal understanding of how the learned behaviour enables to solve a problem that is not solved by deterministic networks. Please incorporate also the relation to the literature on neural sampling/planning/RL etc. and substantiate the claims with citations. 

      We have included sentences in ll. 691-696 in the revised manuscript to explain that for Poisson spiking neurons, the derived learning rule is equivalent to the one that minimizes the Kullback-Leibler divergence between the distributions of output firing and the dendritic prediction, in our case, the recurrent prediction (Asabuki and Fukai; 2020). Thus, the rule suggests that the recurrent prediction learns the statistical model of the evoked activity, which in turn allows the network to reproduce the learned transition statistics.

      We have also added a paragraph to discuss the differences between previously published similar models (e.g., Kappel et al.). Please see our response above.
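      The Kullback-Leibler argument above can be illustrated numerically. For two Poisson distributions with rates f and f_hat, the divergence has the closed form f * log(f / f_hat) - f + f_hat, and its derivative with respect to f_hat is 1 - f / f_hat, so descending it moves the prediction in proportion to the error (f - f_hat). The sketch below applies that descent directly to a scalar prediction. The variable names and values are hypothetical, and the manuscript's rule acts on synaptic weights rather than on the rate itself.

```python
import math

def poisson_kl(rate, predicted_rate):
    """KL divergence from Poisson(rate) to Poisson(predicted_rate)."""
    return rate * math.log(rate / predicted_rate) - rate + predicted_rate

def kl_gradient(rate, predicted_rate):
    """Derivative of the KL divergence with respect to the predicted rate."""
    return 1.0 - rate / predicted_rate

target_rate = 5.0     # hypothetical output firing rate
predicted_rate = 1.0  # hypothetical initial prediction
learning_rate = 0.1
initial_divergence = poisson_kl(target_rate, predicted_rate)

for _ in range(500):
    # Equivalent to adding learning_rate * (target - predicted) / predicted,
    # i.e. an error-driven update.
    predicted_rate -= learning_rate * kl_gradient(target_rate, predicted_rate)

final_divergence = poisson_kl(target_rate, predicted_rate)
```

      The divergence shrinks toward zero as the prediction approaches the target rate, which is the sense in which an error-based update can be read as fitting a statistical model of the evoked activity.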

      B) Connection to biology 

      The plasticity rules in the study are introduced with a vague reference to previous theoretical studies of others. Please discuss in more detail if these rules (especially the error-based learning rule) could be implemented biologically and how this could be achieved. Are there connections to biologically observed plasticity? For example, error-based plasticity has been discussed in the original publication by Urbanczik and Senn, and more recently by Mikulasch et al. (TINS 2023). The biological plausibility of inhibitory balance has been discussed many times before, e.g. by Vogels and others, and a citation would acknowledge that earlier work. This also leaves the question of how neurons in the songbird experiment could adapt and if the model does capture this well (i.e., do they exhibit E-I balance? etc.), which might be discussed as well. 

      Last, please provide some testable experimental predictions. By proposing an interesting experimental prediction, the model could become considerably more relevant to experimentalists. Also, are there potentially alternative models of stochastic sequence learning (e.g., Kappel et al)? How could they be distinguished? (especially, again, why not Hebbian/STDP learning?) 

      We have cited the Vogels paper to acknowledge the earlier work. We have also included additional paragraphs to discuss a possible biologically plausible implementation of our model and how our model differs from similar models proposed previously (e.g., Kappel et al.). Please see our response above.

      Other comments 

      As mentioned, a derivation of recurrent plasticity rules is missing, and parameters are chosen ad-hoc. This leaves the question of how much the results rely on the specific choice of parameters, and how robust they are to perturbations. As a robustness check, please clarify how the duration of the Markov states influences performance. It can be expected that this interacts with the timescale of recurrent connections, so having longer or shorter Markov states, as it would be in reality, should make a difference in learning that should be tested and discussed.

      We thank the reviewer for pointing this out. To address this point, we performed new simulations and asked to what extent the duration of Markov states affects performance. Interestingly, even when the network was trained with input states of half the duration, the distributions of the durations of assembly reactivations remained almost identical to those in the original case (Supplementary Figure 3a). Furthermore, the transition probabilities in the replay were still consistent with the true transition probabilities (Supplementary Figure 3b). We have also included the derivation of our plasticity rule in ll. 630-670 in the revised manuscript. 

      Similarly, inhibitory plasticity operates with the same plasticity timescale parameter as excitatory plasticity, but, as the authors discuss, lags behind excitatory plasticity in simulation as in experiment. Is this required or was the parameter chosen such that this behaviour emerges? Please clarify this in the methods section; moreover, it would be good to test if the same results appear with fast inhibitory plasticity. 

      We have performed a new simulation and showed that even when the learning rate of inhibitory plasticity was larger than that of excitatory plasticity, inhibitory plasticity still occurred on a slower timescale than excitatory plasticity. We have included this result in a new Supplementary Figure 2 in the revised manuscript.

      What is the justification (biologically and theoretically) for the memory trace h and its impact on neural spiking? Is it required for the results or can it be left away? Since this seems to be an important and unconventional component of the model, please discuss it in more detail. 

      In the model, it is assumed that each stimulus presentation drives a specific subset of network neurons with a fixed input strength, which avoids convergence to trivial solutions. Nevertheless, we choose to add this dynamic sigmoid function to facilitate stable replay by regulating neuron activity to prevent saturation. We have explained this point in ll.605-611 in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      I noticed a couple of minor typos: 

      Page 3 "underly"->"underlie" 

      Page 7 "assemblies decreased settled"->"assemblies decreased and settled"

      We have modified the text. We thank the reviewer for their careful review.

      I think Figure 1C is rather confusing and not intuitive. 

      We apologize that Figure 1C was confusing. In the revised figure, we have emphasized the flow of excitatory and inhibitory error for updating synapses.

      Reviewer #3 (Recommendations For The Authors): 

      One possible path to improve the paper would be to establish a relationship between the proposed learning rules and e.g. the ones derived by Barber. 

      When reading the paper, I was left with a number of more detailed questions I omitted from the public review: 

      (1) The authors introduce a dynamic sigmoidal function for excitatory neurons, Eq. 3. This point requires more discussion and analysis. How does this impact the results? 

      In the model, it is assumed that each stimulus presentation drives a specific subset of network neurons with a fixed input strength, which avoids convergence to trivial solutions. Nevertheless, we choose to add this dynamic sigmoid function to facilitate stable replay by regulating neuron activity to prevent saturation. We have explained this point in ll.605-611 in the revised manuscript.

      (2) For Poisson spiking neurons, it would be great to understand what cell assemblies bring (apart from biological realism, i.e., reproducing data where assemblies can be found), compared to self-connected single neurons. For example, how do the results shown in Figure 2 depend on assembly size? 

      We varied the cell assembly size ratio and show how it affects learning performance in a new Supplementary Figure 4. Please see our reply above.

      (3) The authors focus on modeling spontaneous transitions, corresponding to a highly stochastic generative model (with most transition probabilities far from 1). A complementary question is that of learning to produce a set of stereotypical sequences, with probabilities close to 1. I wondered whether the learning rules and architecture of the model (in particular under the I-to-E rule) would also work in such a scenario. 

      We thank the reviewer for pointing this out. In fact, we had the same question, so the setting in Figure 2 includes both cases: one in which the transition matrix is highly stochastic (prob=0.5) and one in which it is nearly deterministic (prob=0.9).

      (4) An analysis of what controls the time so that the network stays in a certain state would be welcome. 

      We trained the network model in two cases, one with a fast speed of plasticity and one with a slow speed of plasticity. As a result, we found that the duration of assembly becomes longer in the slow learning case than in the fast case. We have included these results as Supplementary Figure 5 in the revised manuscript.

      Regarding the presentation, given that this is a computational modeling paper, I wonder whether *all* the formulas belong in the Methods section. I found myself skipping back and forth to understand what the main text meant, mainly because I missed a few key equations. I understand that this is a style issue that is very much community-dependent, but I think readability would improve drastically if the main model and learning rule equations could be introduced in the main text, as they start being discussed. 

      We thank the reviewer for the suggestion. To cater to a wider audience, we try to explain the principle of the paper without using mathematical formulas as much as possible in the main text.

    2. eLife Assessment

      This is an important study that investigates how neural networks can learn to stochastically replay presented sequences of activity according to learned transition probabilities. The authors use error-based excitatory plasticity to minimize the difference between internally predicted activity and stimulus-driven activity, and inhibitory plasticity to maintain E-I balance. The approach is solid, but the choice of learning rules and parameters is not always justified, with some unclear aspects to the formal derivation.

    3. Reviewer #2 (Public review):

      Summary:

      This work proposes a synaptic plasticity rule which explains the generation of learned stochastic dynamics during spontaneous activity. The proposed plasticity rule assumes that excitatory synapses seek to minimize the difference between the internal predicted activity and stimulus-evoked activity, and that inhibitory synapses try to maintain the E-I balance by matching the excitatory activity. By implementing this plasticity rule in a spiking recurrent neural network, the authors show that the state-transition statistics of spontaneous excitatory activity agree with those of the learned stimulus patterns, which is reflected in the learned excitatory synaptic weights. The authors further demonstrate that inhibitory connections contribute to well-defined state-transitions matching the transition patterns evoked by the stimulus. Finally, they show that this mechanism can be expanded to more complex state-transition structures including songbird neural data.

      Strengths:

      This study makes an important contribution to computational neuroscience, by proposing a possible synaptic plasticity mechanism underlying spontaneous generations of learned stochastic state-switching dynamics that are experimentally observed in the visual cortex and hippocampus. This work is also very clearly presented and well-written, and the authors conducted comprehensive simulations testing multiple hypotheses. Overall, I believe this is a well-conducted study providing interesting and novel aspects on the capacity of recurrent spiking neural networks with local synaptic plasticity.

      Weaknesses:

      This study is very well-thought out and theoretically valuable to the neuroscience community, and I think the main weaknesses are in regard to how much biological realism is taken into account. For example, the proposed model assumes that only synapses targeting excitatory neurons are plastic, and uses an equal number of excitatory and inhibitory neurons.

      The model also assumes Markovian state dynamics while biological systems can depend more on history. This limitation, however, is acknowledged in the Discussion.

      Finally, to simulate spontaneous activity, the authors use a constant input of 0.3 throughout the study. Different amplitudes of constant input may correspond to different internal states, so it will be more convincing if the authors test the model with varying amplitudes of constant inputs.

      Comments on revisions:

      The authors have addressed all of the previously raised concerns satisfactorily, by running extra simulations with a biologically plausible composition of excitatory and inhibitory neurons, plasticity assumed for all synapses, and varied amounts of constant inputs representing internal states or background activities. While in some of these cases the stochastic dynamics during spontaneous activity change or do not replicate those of the learned stimulus patterns as well as before, these extended studies provide thorough evaluations of the strengths and limitations of the proposed plasticity rule as the underlying mechanism of stochastic dynamics during spontaneous activity. Overall, the revision has strengthened the paper significantly.

    4. Reviewer #3 (Public review):

      Summary:

      Asabuki and Clopath study stochastic sequence learning in recurrent networks of Poisson spiking neurons that obey Dale's law. Inspired by previous modeling studies, they introduce two distinct learning rules, to adapt excitatory-to-excitatory and inhibitory-to-excitatory synaptic connections. Through a series of computer experiments, the authors demonstrate that their networks can learn to generate stochastic sequential patterns, where states correspond to non-overlapping sets of neurons (cell assemblies) and the state-transition conditional probabilities are first-order Markov, i.e., the transition to a given next state only depends on the current state. Finally, the authors use their model to reproduce certain experimental songbird data involving highly-predictable and highly-uncertain transitions between song syllables. While the findings are only moderately surprising, this is a well-written and welcome detailed study that may be of interest to experts of plasticity and learning in recurrent neural networks that respect Dale's law.

      Strengths:

      This is an easy-to-follow, well-written paper, whose results are likely easy to reproduce. The experiments are clear and well-explained. In particular, the study of the interplay between excitation and inhibition (and their different plasticity rules) is a highlight of the study. The study of songbird experimental data is another good feature of this paper; finches are classical model animals for understanding sequence learning in the brain. I also liked the study of rapid task-switching; it is a good-to-know type of result that is not very common in sequence learning papers.

      Weaknesses:

      One weakness I see in this paper is the derivation of the learning rules, which is semi-heuristic. The paper studies Poisson spiking neurons, for which learning rules can be derived from a statistical objective, typically maximum likelihood, as previously done in the cited literature. The authors provide a brief section connecting the learning rules to gradient descent on objective functions, but the link is only heuristic or at least not entirely presented. The reason is that the neural network state is not fully determined by (or "clamped to") the target during learning (for instance, inhibitory neurons do not even have a target assigned). So, the (total) gradient should take into account the recurrent contributions from other neurons, and equation 13 does not appear to be complete/correct to me. Moreover, the target firing rate is a mixture of external currents with currents arising from other neurons in the recurrent network. The authors ideally should start from an actual distribution matching objective (e.g., KL divergence, and not such a squared error), so that their main claims immediately follow from the mathematical derivations. Along the same line, it would be excellent to get some additional insights on the interaction of the two distinct plasticity rules, one of the highlights of the study. This could be naturally achieved by relating their distinct rules to a common principled objective.

      The other major weakness (albeit one that is clearly discussed by the authors) is that the study assumes that every excitatory neuron is directly given its target state when learning. In machine learning language, there are no 'hidden' excitatory neurons. While this assumption greatly simplifies the derivation of efficient and biologically-plausible learning rules that can be mapped to synaptic plasticity, it also limits considerably the distributions that can be learned by the network, more precisely to those that satisfy the Markov property.

    1. eLife Assessment

      This fundamental study combines Global Positioning System tracking and the analysis of social interactions among feral pigs to provide insights into the likelihood of disease transmission based on contact rates both within and between sounders. The method used for data collection is compelling, but the varying sample sizes across populations could be a potential source of bias. With a strengthened treatment of the potential biases from varying sample sizes, this paper would be of interest to the fields of Veterinary Medicine, Public Health, and Epidemiology.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to quantify feral pig interactions in eastern Australia to inform disease transmission networks. They used GPS tracking data from 146 feral pigs across multiple locations to construct proximity-based social networks and analyze contact rates within and between pig social units.

      Strengths:

      (1) Addresses a critical knowledge gap in feral pig social dynamics in Australia.

      (2) Uses robust methodology combining GPS tracking and network analysis.

      (3) Provides valuable insights into sex-based and seasonal variations in contact rates.

      (4) Effectively contextualizes findings for disease transmission modeling and management.

      (5) Includes comprehensive ethical approval for animal research.

      (6) Utilizes data from multiple locations across eastern Australia, enhancing generalizability.

      Weaknesses:

      (1) Limited discussion of potential biases from varying sample sizes across populations.

      (2) Some key figures are in supplementary materials rather than the main text.

      (3) Economic impact figures are from the US rather than Australia-specific data.

      (4) Rationale for spatial and temporal thresholds for defining contacts could be clearer.

      (5) Limited discussion of ethical considerations beyond basic animal ethics approval.

      The authors largely achieved their aims, with the results supporting their conclusions about the importance of sex and seasonality in feral pig contact networks. This work is likely to have a significant impact on feral pig management and disease control strategies in Australia, providing crucial data for refining disease transmission models.

    3. Reviewer #2 (Public review):

      Summary:

      The paper attempts to elucidate how feral (wild) pigs cause distortion of the environment in over 54 countries of the world, particularly Australia.

      The paper displays proof that over $120 billion worth of facilities were destroyed annually in the United States of America.

      The authors have tried to infer that the findings of their work were important and possess a convincing strength of evidence.

      Strengths:

      (1) Clearly stating feral (wild) pigs as a problem in the environment.

      (2) Stating how 54 countries were affected by the feral pigs.

      (3) Mentioning how $120 billion was lost in the US, annually, as a result of the activities of the feral pigs.

      (4) Amplifying the fact that 14 species of animals were being driven into extinction by the feral pigs.

      (5) Feral pigs possessing zoonotic abilities.

      (6) Feral pigs acting as reservoirs for endemic diseases like brucellosis and leptospirosis.

      (7) Understanding disease patterns by the social dynamics of feral pig interactions.

      (8) The use of 146 GPS-monitored feral pigs to establish their social interaction among themselves.

      Weaknesses:

      (1) Unclear explanation of the association of either the female or male feral pigs with each other, seasonally.

      (2) The "abstract paragraph" was not justified.

      (3) Typographical errors in the abstract.

    4. Reviewer #3 (Public review):

      Summary:

      The authors sought to understand social interactions both within and between groups of feral pigs, with the intent of applying their findings to models of disease transmission. The authors analyzed GPS tracking data from across various populations to determine patterns of contact that could support the transmission of a range of zoonotic and livestock diseases. The analysis then focused on the effects of sex, group dynamics, and seasonal changes on contact rates that could be used to base targeted disease control strategies that would prioritize the removal of adult males for reducing intergroup disease transmission.

      Strengths:

      It utilized GPS tracking data from 146 feral pigs over several years, effectively capturing seasonal and spatial variation in the social behaviors of interest. Using proximity-based social network analysis, this work provides a highly resolved snapshot of contact rates and interactions both within and between groups, substantially improving research in wildlife disease transmission. Results were highly useful and provided practical guidance for disease management, showing that control targeted at adult males could reduce intergroup disease transmission, hence providing an approach for the control of zoonotic and livestock diseases.

      Weaknesses:

      Despite their reliability, populations can be skewed by small sample sizes and limited generalizability due to specific environmental and demographic characteristics. Further validation is needed to account for additional environmental factors influencing social dynamics and contact rates.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors aimed to quantify feral pig interactions in eastern Australia to inform disease transmission networks. They used GPS tracking data from 146 feral pigs across multiple locations to construct proximity-based social networks and analyze contact rates within and between pig social units.

      Strengths:

      (1) Addresses a critical knowledge gap in feral pig social dynamics in Australia.

      (2) Uses robust methodology combining GPS tracking and network analysis.

      (3) Provides valuable insights into sex-based and seasonal variations in contact rates.

      (4) Effectively contextualizes findings for disease transmission modeling and management.

      (5) Includes comprehensive ethical approval for animal research.

      (6) Utilizes data from multiple locations across eastern Australia, enhancing generalizability.

      Weaknesses:

      (1) Limited discussion of potential biases from varying sample sizes across populations.

      This is a really good comment, and we will address this in the discussion as one of the limitations of the study.

      (2) Some key figures are in supplementary materials rather than the main text.

      We will move some of our supplementary material to the main text as suggested.

      (3) Economic impact figures are from the US rather than Australia-specific data.

      We included the impact figures that are available for Australia (for FMD), and we will include the estimated impact of ASF in Australia in the introduction.

      (4) Rationale for spatial and temporal thresholds for defining contacts could be clearer.

      We will improve the explanation of why we chose the spatial and temporal thresholds based on literature, the size of animals and GPS errors.

      (5) Limited discussion of ethical considerations beyond basic animal ethics approval.

      This research was conducted under an ethics committee's approval for collaring the feral pigs. This research is part of an ongoing pest management activity, and all the ethics approvals have been highlighted in the main manuscript.

      The authors largely achieved their aims, with the results supporting their conclusions about the importance of sex and seasonality in feral pig contact networks. This work is likely to have a significant impact on feral pig management and disease control strategies in Australia, providing crucial data for refining disease transmission models.

      Reviewer #2 (Public review):

      Summary:

      The paper attempts to elucidate how feral (wild) pigs cause distortion of the environment in over 54 countries of the world, particularly Australia.

      The paper displays proof that over $120 billion worth of facilities were destroyed annually in the United States of America.

      The authors have tried to infer that the findings of their work were important and possess a convincing strength of evidence.

      Strengths:

      (1) Clearly stating feral (wild) pigs as a problem in the environment.

      (2) Stating how 54 countries were affected by the feral pigs.

      (3) Mentioning how $120 billion was lost in the US, annually, as a result of the activities of the feral pigs.

      (4) Amplifying the fact that 14 species of animals were being driven into extinction by the feral pigs.

      (5) Feral pigs possessing zoonotic abilities.

      (6) Feral pigs acting as reservoirs for endemic diseases like brucellosis and leptospirosis.

      (7) Understanding disease patterns by the social dynamics of feral pig interactions.

      (8) The use of 146 GPS-monitored feral pigs to establish their social interaction among themselves.

      Weaknesses:

      (1) Unclear explanation of the association of either the female or male feral pigs with each other, seasonally.

      This will be better explained in the methods.

      (2) The "abstract paragraph" was not justified.

      We have justified the abstract paragraph as requested by the reviewer.

      (3) Typographical errors in the abstract.

      Typographical errors have been corrected in the Abstract.

      Reviewer #3 (Public review):

      Summary:

      The authors sought to understand social interactions both within and between groups of feral pigs, with the intent of applying their findings to models of disease transmission. The authors analyzed GPS tracking data from across various populations to determine patterns of contact that could support the transmission of a range of zoonotic and livestock diseases. The analysis then focused on the effects of sex, group dynamics, and seasonal changes on contact rates that could be used to base targeted disease control strategies that would prioritize the removal of adult males for reducing intergroup disease transmission.

      Strengths:

      It utilized GPS tracking data from 146 feral pigs over several years, effectively capturing seasonal and spatial variation in the social behaviors of interest. Using proximity-based social network analysis, this work provides a highly resolved snapshot of contact rates and interactions both within and between groups, substantially improving research in wildlife disease transmission. Results were highly useful and provided practical guidance for disease management, showing that control targeted at adult males could reduce intergroup disease transmission, hence providing an approach for the control of zoonotic and livestock diseases.

      Weaknesses:

      Despite their reliability, populations can be skewed by small sample sizes and limited generalizability due to specific environmental and demographic characteristics. Further validation is needed to account for additional environmental factors influencing social dynamics and contact rates.

      This is a good point, and we thank the reviewer for pointing out this issue. We will discuss the potential biases due to sample size in our discussion. We agree that environmental factors need to be incorporated and tested for their influence on social dynamics. This will be added to the discussion, as we plan to expand this research and conduct the analysis to determine whether environmental factors influence social dynamics.

    1. eLife Assessment

      This valuable study uses extensive comparative analysis to examine the relationship between plasma glucose levels, albumin glycation levels, and diet and life history, within the framework of the "pace of life syndrome" hypothesis. The evidence that glucose and glycation levels are broadly correlated is convincing. However, concerns about the consistency of the data quality across species and some aspects of data analysis make the key conclusion about higher glycation resistance in species with higher glucose levels currently incomplete. Still, as the first extensive comparative analysis of glycation rates, life history, and glucose levels in birds, the study has potential to be of interest to evolutionary ecologists and the aging research community more broadly.

    2. Reviewer #1 (Public review):

      The paper explored cross-species variance in albumin glycation and blood glucose levels as a function of various life-history traits. Their results show that:

      (1) blood glucose levels predict albumin glycation rates;

      (2) larger species have lower blood glucose levels;

      (3) lifespan positively correlates with blood glucose levels; and

      (4) diet predicts albumin glycation rates.

      The data presented is interesting, especially due to the relevance of glycation to the ageing process and the interesting life-history and physiological traits of birds. Most importantly, the results suggest that some mechanisms might exist that limit the level of glycation in species with the highest blood glucose levels.

      While the questions raised are interesting and the amount of data the authors collected is impressive, I have some major concerns about this study:

      (1) The authors combine many databases and samples from various sources. This is understandable when access to data is limited, but I expected more caution when combining these. For example, glucose is measured in all samples without any description of how handling stress was controlled for. Glucose levels can easily double in a few minutes in birds, potentially introducing variation in the data generated. The authors report no caution regarding this effect, nor any statistical approaches aiming to check whether handling stress had an effect here, either on glucose or on glycation levels.

      (2) The database with the predictors is similarly problematic. There is information pulled from captivity and the wild (e.g. on lifespan) without any confirmation that the different databases are comparable (and here I'm not just referring to the correlation between the databases, but also to a potential systematic bias, e.g. captivity-based sources likely consistently report longer lifespans). This is even more surprising, given that the authors raise the possibility of captivity effects in the discussion, and exploring this question would be extremely easy in their statistical models (a simple covariate in the MCMCglmms).

      (3) The authors state that one of the primary response variables (glycation) was measured without any replicability test or reference to the replicability of the measurement technique.

      (4) The methods and results are very poorly presented. For instance, new model types and variables are popping up throughout the manuscript, already reporting results, before explaining what these are e.g. results are presented on "species average models" and "model with individuals", but it's not described what these are and why we need to see both. Variables, like "centered log body mass", or "mass-adjusted lifespan" are not explained. The results section is extremely long, describing general patterns that have little relevance to the questions raised in the introduction and would be much more efficiently communicated visually or in a table.

    3. Reviewer #2 (Public review):

      Summary

      In this extensive comparative study, Moreno-Borrallo and colleagues examine the relationships between plasma glucose levels, albumin glycation levels, diet, and life-history traits across birds. Their results confirmed the expected positive relationship between plasma blood glucose level and albumin glycation rate but also provided findings that are somewhat surprising or contradicting findings of some previous studies (relationships with lifespan, clutch mass, or diet). This is the first extensive comparative analysis of glycation rates and their relationships to plasma glucose levels and life history traits in birds that are based on data collected in a single study and measured using unified analytical methods.

      Strengths

      This is an emerging topic gaining momentum in evolutionary physiology, which makes this study a timely, novel, and very important contribution. The study is based on a novel data set collected by the authors from 88 bird species (67 in captivity, 21 in the wild) of 22 orders, which itself greatly contributes to the pool of available data on avian glycemia, as previous comparative studies either extracted data from various studies or a database of veterinary records of zoo animals (therefore potentially containing much more noise due to different methodologies or other unstandardised factors), or only collected data from a single order, namely Passeriformes. The data further represents the first comparative avian data set on albumin glycation obtained using a unified methodology. The authors used LC-MS to determine glycation levels, which does not have problems with specificity and sensitivity that may occur with assays used in previous studies. The data analysis is thorough, and the conclusions are mostly well-supported (but see my comments below). Overall, this is a very important study representing a substantial contribution to the emerging field of evolutionary physiology focused on the ecology and evolution of blood/plasma glucose levels and resistance to glycation.

      Weaknesses

      My main concern is about the interpretation of the coefficient of the relationship between glycation rate and plasma glucose, which reads as follows: "Given that plasma glucose is logarithm transformed and the estimated slope of their relationship is lower than one, this implies that birds with higher glucose levels have relatively lower albumin glycation rates for their glucose, fact that we would be referring as higher glycation resistance" (lines 318-321) and "the logarithmic nature of the relationship, suggests that species with higher plasma glucose levels exhibit relatively greater resistance to glycation" (lines 386-388). First, only plasma glucose (predictor) but not glycation level (response) is logarithm transformed, and this semi-logarithmic relationship assumed by the model means that an increase in glycation always slows down when blood glucose goes up, irrespective of the coefficient. The coefficient thus does not carry information that could be interpreted as higher (when <1) or lower (when >1) resistance to glycation (this only can be done in a log-log model, see below) because the semi-log relationship means that glycation increases by a constant amount (expressed by the coefficient of plasma glucose) for every tenfold increase in plasma glucose (for example, with glucose values 10 and 100, the model would predict glycation values 2 and 4 if the coefficient is 2, or 0.5 and 1 if the coefficient is 0.5). Second, the semi-logarithmic relationship could indeed be interpreted such that glycation rates are relatively lower in species with high plasma glucose levels. However, the semi-log relationship is assumed here a priori and forced to the model by log-transforming only glucose level, while not being tested against alternative models, such as: (i) a model with a simple linear relationship (glycation ~ glucose); or (ii) a log-log model (log(glycation) ~ log(glucose)) assuming power function relationship (glycation = a * glucose^b). 
      The latter model would allow for the interpretation of the coefficient (b) as higher (when <1) or lower (when >1) resistance to glycation in species with high glucose levels, as suggested by the authors.
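
      The contrast between the two model forms can be illustrated with a minimal sketch. It reuses the numbers from the example above (glucose values of 10 and 100, coefficients of 2 and 0.5) and assumes a base-10 logarithm and a zero intercept; the function names are hypothetical and none of this is the authors' code or data.

```python
import math

def semilog_glycation(glucose, slope, intercept=0.0):
    """Semi-log model: glycation = intercept + slope * log10(glucose)."""
    return intercept + slope * math.log10(glucose)

def power_glycation(glucose, a, b):
    """Log-log (power-function) model: glycation = a * glucose**b."""
    return a * glucose ** b

# Semi-log: a tenfold increase in glucose adds `slope` to glycation, so in
# this example glycation per unit of glucose falls for both slopes.
for slope in (2.0, 0.5):
    low = semilog_glycation(10, slope)
    high = semilog_glycation(100, slope)
    print(slope, low, high, low / 10, high / 100)

# Log-log: glycation per unit of glucose falls only when the exponent b < 1.
for b in (0.5, 1.0, 1.5):
    ratio_low = power_glycation(10, 1.0, b) / 10
    ratio_high = power_glycation(100, 1.0, b) / 100
    print(b, ratio_high < ratio_low)
```

      Under these assumptions, only the exponent of the power-function model separates the cases of relatively lower and relatively higher glycation at high glucose, which is why the log-log form is the one whose coefficient can be read as glycation resistance.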

      Besides, a clear explanation of why glucose is log-transformed when included as a predictor, but not when included as a response variable, is missing.

      The models in the study do not control for the sampling time (i.e., time latency between capture and blood sampling), which may be an important source of noise because blood glucose increases because of stress following the capture. Although the authors claim that "this change in glucose levels with stress is mostly driven by an increase in variation instead of an increase in average values" (ESM6, line 46), their analysis of Tomasek et al.'s (2022) data set in ESM1 using Kruskal-Wallis rank sum test shows that, compared to baseline glucose levels, stress-induced glucose levels have higher median values, not only higher variation.

      Although the authors calculated the variance inflation factor (VIF) for each model, it is not clear how these were interpreted and considered. In some models, GVIF^(1/(2*Df)) is higher than 1.6, which indicates potentially important collinearity (see, for example, https://www.bookdown.org/rwnahhas/RMPH/mlr-collinearity.html). This is often the case for body mass or clutch mass (e.g. models of glucose or glycation based on individual measurements).
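
      For readers unfamiliar with the scaled statistic, the following sketch computes the generalized VIF of Fox and Monette from a correlation matrix of the predictor columns and then applies the 1/(2*Df) exponent. The function names and the correlation of 0.8 between two predictors are hypothetical illustrations, not values from the study.

```python
import numpy as np

def gvif(corr, term_cols):
    """Generalized VIF for the predictor columns listed in term_cols.

    corr is the correlation matrix of all predictor columns (no intercept).
    """
    corr = np.asarray(corr, dtype=float)
    term = list(term_cols)
    rest = [i for i in range(corr.shape[0]) if i not in term]
    det_term = np.linalg.det(corr[np.ix_(term, term)])
    det_rest = np.linalg.det(corr[np.ix_(rest, rest)]) if rest else 1.0
    return det_term * det_rest / np.linalg.det(corr)

def scaled_gvif(corr, term_cols):
    """GVIF^(1/(2*Df)), where Df is the number of columns coding the term."""
    term = list(term_cols)
    return gvif(corr, term) ** (1.0 / (2 * len(term)))

# Hypothetical example: two single-column predictors correlated at 0.8.
example_corr = [[1.0, 0.8], [0.8, 1.0]]
print(scaled_gvif(example_corr, [0]))
```

      For a term coded by a single column, GVIF reduces to the ordinary VIF, so the scaled value is simply its square root; the exponent makes terms with different degrees of freedom (such as a multi-level diet factor) comparable.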

      It seems that the differences between diet groups other than omnivores (the reference category in the models) were not tested and only inferred using the credible intervals from the models. However, these credible intervals relate to the comparison of each group with the reference group (Omnivore) and cannot be used for pairwise comparisons between other groups. Statistics for these contrasts should be provided instead. Based on the plot in Figure 4B, it seems possible that terrestrial carnivores differed in glycation level not only from omnivores but also from herbivores and frugivores/nectarivores.
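
      One way such contrasts could be obtained is sketched below: with each diet coefficient expressed relative to the reference category, the contrast between two non-reference groups is the difference of their coefficients, computed draw by draw from the posterior. The draws here are simulated and purely hypothetical (not the study's data), the function name is an assumption, and an equal-tailed interval is used for simplicity rather than whatever interval type the authors' software reports.

```python
import numpy as np

def contrast_interval(draws_a, draws_b, level=0.95):
    """Posterior mean and equal-tailed credible interval of (a - b)."""
    diff = np.asarray(draws_a, dtype=float) - np.asarray(draws_b, dtype=float)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(diff, [tail, 1.0 - tail])
    return diff.mean(), lower, upper

# Hypothetical posterior draws of two diet coefficients, each relative to
# the reference category (simulated for illustration only).
rng = np.random.default_rng(0)
carnivore = rng.normal(3.0, 0.5, size=4000)
herbivore = rng.normal(0.5, 0.5, size=4000)
mean_diff, lower, upper = contrast_interval(carnivore, herbivore)
print(mean_diff, lower, upper)
```

      In a real analysis the two sets of draws should come from the same MCMC iterations, so that the posterior correlation between the coefficients is carried into the contrast.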

      Given that blood glucose is related to maximum lifespan, it would be interesting to also see the results of the model from Table 2 while excluding blood glucose from the predictors. This would allow for assessing if the maximum lifespan is completely independent of glycation levels. Alternatively, there might be a positive correlation mediated by blood glucose levels (based on its positive correlations with both lifespan and glycation), which would be a very interesting finding suggesting that high glycation levels do not preclude the evolution of long lifespans.

    4. Author response:

      Reviewer #1:

      (1) This concern is addressed in ESM6, and partly in ESM1. Indeed, many of the concerns the reviewer raises later are already addressed in the multiple supplementary materials provided, so we kindly ask the reviewer to read them before moving on to the discussion.

      (2) This concern is reasonable, but its solution is not "extremely easy", as the reviewer states. The reviewer points to the use of captive-based versus non-captive-based sources, singling out maximum lifespan as the main variable that is clearly expected to be systematically biased by the source of the data. Nevertheless, except for the ZIMS database, which includes only captive individuals, and some sources such as the CNRS databases and EURING, which exclusively include wild populations, the remaining databases, from which the vast majority of the data were collected (i.e. Amniotes database, Birds of the World and AnAge), do not make any distinction. This means that they include just the maximum lifespan of the species as known by the authors of the database entries, regardless of provenance, which is also not usually made explicit by the database. Therefore, correcting for this would imply checking all the primary sources. Considering that these databases sometimes cite not the primary source but a secondary one, that on several occasions such a source is a specialized book that is not easily accessible, and that these referenced datasets may still not indicate the source of the data, tracing all of this information becomes an arduous task that would even render the usage of the databases themselves pointless. We will include some details about the concerns of database usage in the discussion to address this.

      Furthermore, it remains relevant to indicate that what we discuss later about the possible effects of captivity concerns our use of animals that come from both sources, not the provenance of the literature-extracted data used (e.g. captive or wild maximum lifespan), which is an independent matter. We can test for the first in the next submission, but we could test for the second (which the reviewer seems to be pointing to) only with great difficulty. In any case, as we never have the same species from both a captive and a wild source, it would be difficult to determine whether the effect tested comes from captivity or from species-specific differences.

      (3) We will add data on the replicability of the glycation measurement in the next manuscript version. The CV for several individuals of different species measured repeated times is quite low (always below 2%).

      (4) The reviewer's remarks reported here are already addressed in the supplementary material (ESM6), given the lack of space in the main manuscript. We therefore kindly ask the reviewer to read the supplementary material added to the submission. If the editors agree, all or a considerable part of this could be transferred to the main text for clarity, but this would considerably lengthen a text that the reviewer already considered very long.

      Reviewer #2:

      Thanks for spotting this issue with the coefficient, as it is actually a drafting mistake. It is a remnant of a previous version of the manuscript in which a log-log relation was fitted instead. Previous reviewers raised concerns about the use of a log transformation for glycation, as this variable is (theoretically) a proportion (although we argue that it does not behave as such), which they considered should not be log-transformed. In the end, we decided not to transform this variable. More generally, the transformations of variables were decided by preliminary data exploration. In this particular case, both approaches lead to the same conclusion of higher glycation resistance in the species with higher glucose. Nevertheless, we will consider exploring the comparison of different versions for the resubmission.

      Regarding handling time, this variable is not available, for the reasons already given in the answer to the other reviewer. Moreover, the Kruskal-Wallis test, by its nature, does not determine differences in medians between groups per se, as the reviewer claims, but only differences in rank sums. It can be used equivalently for that purpose when the groups' distributions are similar, but not when they differ, as we see here with a difference in variance. What a significant outcome in a Kruskal-Wallis test tells us, thus, is just that the groups differ (in their rank sums), which here is plausibly caused by the higher variance in the stressed individuals. Even if we conclude that the average is higher in those groups, comparisons of averages for groups with very different variances call for different interpretations than when homoscedasticity is met, particularly when the distributions of the groups overlap. For example, in a case like this, where the data are bounded below (glucose levels cannot be lower than 0), most of this higher variance is related to many values in the stressed groups lying above all the baseline values. This, of course, would increase the average, but such a parameter would not mean the same as if the distributions did not overlap.

      Regarding the GVIFs, it is not clear why the values are above 1.6, but we do not consider this a major concern, as the values are never above 2.2, a level usually considered more worrying. We will include a brief explanation of this in the results section. Also, we explicitly calculated life history variables adjusted for body mass, which should eliminate their otherwise strong correlation. Other biological and interpretational reasons for using the residuals in the models instead of the raw values, despite previously raised concerns, are given in ESM6.

      Given the reviewer's assertion that credible intervals are not to be used for post hoc comparisons, and as these are what the whiskers in Figure 4B represent, the claim that this graph suggests any difference between groups remains doubtful. New comparisons have now been made with the function HPDinterval() applied to the differences between each pair of diet categories, calculated from the posterior values of each group, confirming that no significant differences exist.
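      The pairwise contrast described in this response can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: `hpd_interval` is a hypothetical helper written in the spirit of coda's `HPDinterval()` for a single chain (the shortest interval containing the requested share of draws), and the posterior draws are simulated placeholders, not study data.

```python
import random


def hpd_interval(draws, prob=0.95):
    """Shortest interval containing roughly `prob` of the sorted draws."""
    s = sorted(draws)
    n = len(s)
    if n < 2:
        raise ValueError("need at least two draws")
    gap = max(1, min(n - 1, round(n * prob)))  # index span of the interval
    # Slide a window of fixed index span and keep the narrowest one.
    best = min(range(n - gap), key=lambda i: s[i + gap] - s[i])
    return s[best], s[best + gap]


def contrast_hpd(post_a, post_b, prob=0.95):
    """HPD interval of the difference, computed draw by draw.

    Draws must be paired by MCMC iteration so the contrast keeps the
    posterior correlation between the two group estimates.
    """
    diffs = [a - b for a, b in zip(post_a, post_b)]
    return hpd_interval(diffs, prob)


# Simulated placeholder posteriors for two diet groups (assumed values).
rng = random.Random(1)
group_a = [rng.gauss(0.30, 0.10) for _ in range(4000)]
group_b = [rng.gauss(0.10, 0.10) for _ in range(4000)]

low, high = contrast_hpd(group_a, group_b)
print("interval excludes zero:", low > 0 or high < 0)
```

      A contrast is usually read as supported when its interval excludes zero. Overlap between two groups' individual intervals against the reference category does not answer the same question.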

      We do not understand the suggestion made in relation to the model shown in Table 2. Removing glucose from the model could have two results, as the reviewer indicates: 1. Maximum lifespan (ML) relates to glycation, potentially spuriously through the effect of glucose (in this case not included) on both; 2. ML does not relate to glycation, and therefore "high glycation levels do not preclude the evolution of long lifespans", which is what we are already showing with the current model, which also controls for glucose, in an attempt to determine whether not just raw glycation values, but glycation resistance, relates to longevity. This is intended to assess whether long-lived species may show mechanisms that avoid glycation, by showing levels lower than expected for a non-enzymatic reaction.

    1. eLife Assessment

      This study aims to investigate the RNA binding activities of a conserved heterochromatin protein (Swi6) and proposes an entirely new model for how heterochromatin formation is initiated in fission yeast. While the concept is interesting, the data provided are inadequate, both for support of the claims regarding the new RNA binding activities and for support of the new model. The paper requires extensive editing as well as the inclusion of numerous experiments with appropriately controlled conditions.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript explores the RNA binding activities of the fission yeast Swi6 (HP1) protein and proposes a new role for Swi6 in RNAi-mediated heterochromatin establishment. The authors claim that Swi6 has a specific and high affinity for short interfering RNAs (siRNAs) and recruits the Clr4 (Suv39h) H3K9 methyltransferases to siRNA-DNA hybrids to initiate heterochromatin formation. These claims are not in any way supported by the incomplete and preliminary RNA binding or the in vivo experiments that the authors present. The proposed model also lacks any mechanistic basis as it remains unclear (and unexplored) how Swi6 might bind to specific small RNA sequences or RNA-DNA hybrids. Work by several other groups in the field has led to a model in which siRNAs produced by the RNAi pathway load onto the Ago1-containing RITS complex, which then binds to nascent transcripts at pericentromeric DNA repeats and recruits Clr4 to initiate heterochromatin formation. Swi6 facilitates this process by promoting the recruitment of the RNA-dependent RNA polymerase leading to siRNA amplification.

      Weaknesses:

      (1) The claims that Swi6 binds to specific small RNAs or to RNA-DNA hybrids are not supported by the evidence that the authors present. Their experiments do not rule out non-specific charged-based interactions. Claims about different affinities of Swi6 for RNAs of different sizes are based on a comparison of KD values derived by the authors for a handful of S. pombe siRNAs with previous studies from the Buhler lab on Swi6 RNA binding. The authors need to compare binding affinities under identical conditions in their assays. The regions of Swi6 that bind to siRNAs need to be identified and evidence must be provided that Swi6 binds to RNAs of a specific length, 20-22 mers, to support the claim that Swi6 binds to siRNAs. This is critical for all the subsequent experiments and claims in the study.

      (2) The in vivo results do not validate Swi6 binding to specific RNAs, as stated by the authors. Swi6 pulldowns have been shown to be enriched for all heterochromatic proteins including the RITS complex. The sRNA binding observed by the authors is therefore likely to be mediated by Ago1/RITS.

      Most of the binding in Figure S8C seems to be non-specific.

      In Figure S8D, the authors' data shows that Swi6 deletion does not derepress the rev dh transcript while dcr1 delete cells do, which is consistent with previous reports but does not relate to the authors' conclusions.

      Previous results have shown that swi6 delete cells have 20-fold fewer dg and dh siRNAs than swi6+ cells due to decreased RNA-dependent RNA polymerase complex recruitment and reduced siRNA amplification.

      (3) The RIP-seq data are difficult to interpret as presented. The size distribution of bound small RNAs, and where they map along the genome should be shown as for example presented in previous Ago1 sRNA-seq experiments.

      It is also unclear whether the defects in sRNA binding observed by the authors represent direct sRNA binding to Swi6 or co-precipitation of Ago1-bound sRNAs.

      The authors should also sequence total sRNAs to test whether Swi6-3A affects sRNA synthesis, as is the case in swi6 delete cells.

      (4) The authors examine the effects of Swi6-3A mutant by overexpression from the strong nmt1 promoter. Heterochromatin formation is sensitive to the dosage of Swi6. These experiments should be performed by introducing the 3A mutations at the endogenous Swi6 locus and effects on Swi6 protein levels should be tested.

      (5) The authors' data indicate an impairment of silencing in Swi6-3A mutant cells but whether this is due to a general lower affinity for nucleosomes, DNA, RNA, or as claimed by the authors, siRNAs is unclear. These experiments are consistent with previous findings suggesting an important role for basic residues in the HP1 hinge region in gene silencing but do not reveal how the hinge region enhances silencing.

      (6) RNase H1 overexpression may affect Swi6 localization and silencing indirectly as it would lead to a general reduction in R loops and RNA-DNA hybrids across the genome. RNaseH1 OE may also release chromatin-bound RNAs that act as scaffolds for siRNA-Ago1/RITS complexes that recruit Clr4 and ultimately Swi6.

      (7) Examples of inaccurate presentation of the literature.

      a. The authors state that "RNA binding by the murine HP1 through its hinge domains is required for heterochromatin assembly (Muchardt et al, 2002)". The cited reference provides no evidence that HP1 RNA binding is required for heterochromatin assembly. Only the hinge region of bacterially produced HP1 contributes to its localization to DAPI-stained heterochromatic regions in fixed NIH 3T3 cells.

      b. "... This scenario is consistent with the loss of heterochromatin recruitment of Swi6 as well as siRNA generation in rnai mutants (Volpe et al, 2002)." Volpe et al. did not examine changes in siRNA levels in swi6 mutant cells. In fact, no siRNA analysis of any kind was reported in Volpe et al., 2002.

    3. Reviewer #2 (Public review):

      The aim of this study is to investigate the role of Swi6 binding to RNA in heterochromatin assembly in fission yeast. Using in vitro protein-RNA binding assays (EMSA) they showed that Swi6/HP1 binds centromere-derived siRNA (identified by Reinhardt and Bartel in 2002) via the chromodomain and hinge domains. They demonstrate that this binding is regulated by a lysine triplet in the conserved region of the Swi6 hinge domain and that wild-type Swi6 favours binding to DNA-RNA hybrids and siRNA, which then facilitates, rather than competes with, binding to H3K9me2 and to a lesser extent H3K9me3.

      However, the majority of the experiments are carried out in swi6 null cells overexpressing wild-type Swi6 or Swi63K-3A mutant from a very strong promoter (nmt1). Both swi6 null cells and overexpression of Swi6 are well known to exhibit phenotypes, some of which interfere with heterochromatin assembly. This is not made clear in the text. Whilst the RNA binding experiments show that Swi6 can indeed bind RNA and that binding is decreased by the Swi63K-3A mutation in vitro (confusingly, they only explain much later in the text that these 3 bands represent differential binding and that II is likely an isotherm), the gels showing these data are of poor quality and it is unclear which bands are used to calculate the Kd. RNA-seq data show that overall fewer siRNAs are produced from regions of heterochromatin in the Swi63K-3A mutant, so it is unsurprising that analysis of siRNA-associated motifs also shows lower enrichment (or indeed that they share some similarities, given that they originate from repeat regions).

      The experiments are seemingly linked yet fail to substantiate their overall conclusions. For instance, the authors show that the Swi63K-3A mutant displays reduced siRNA binding in vitro (Figure 1D) and that H3K9me2 levels at heterochromatin loci are reduced in vivo (Figure 3C-D). They conclude that Swi6 siRNA binding is important for Swi6 heterochromatin localization, whilst it remains entirely possible that heterochromatin integrity is impaired by the Swi63K-3A mutation and hence fewer siRNAs are produced and available to bind. Their interpretation of the data is really confusing.

      The authors go on to show that Swi63K-3A cells have impaired silencing at all regions tested and the mutant protein itself has less association with regions of heterochromatin. They perform DNA-RNA hybrid IPs and show that Swi63K-3A cells which also overexpress RNAseH/rnh1 have reduced levels of dh DNA-RNA hybrids than wild-type Swi6 cells. They interpret this to mean that Swi6 binds and protects DNA-RNA hybrids, presumably to facilitate binding to H3K9me2. The final piece of data is an EMSA assay showing that "high-affinity binding of Swi6 to a dg-dh specific RNA/DNA hybrid facilitates the binding to Me2-K9-H3 rather than competing against it." This EMSA gel shown is of very poor quality, and this casts doubt on their overall conclusion.

      Unfortunately, the manuscript is generally poorly written and difficult to comprehend. The experimental setups and interpretations of the data are not fully explained, or, are explained in the wrong order leading to a lack of clarity. An example of this is the reasoning behind the use of the cid14 mutant which is not explained until the discussion of Figure 5C, but it is utilised at the outset in Figure 5A.

      Another example of this lack of clarity/confusion is that the abstract states "Here we provide evidence in support of RNAi-independent recruitment of Swi6". Yet it then states "We show that...Swi6/HP1 displays a hierarchy of increasing binding affinity through its chromodomain to the siRNAs corresponding to specific dg-dh repeats, and even stronger binding to the cognate siRNA-DNA hybrids than to the siRNA precursors or general RNAs." RNAi is required to produce siRNAs, so their message is very unclear. Moreover, an entire section is titled "Heterochromatin recruitment of Swi6-HP1 depends on siRNA generation", so what is the authors' message?

      The data presented, whilst sound in some parts, are generally overinterpreted and do not fully support the authors' confusing conclusions. The authors essentially characterise an overexpressed Swi6 mutant protein with a few other experiments on the side that do not entirely support their conclusions. They make the point several times that the KD for their binding experiments is far higher than that previously reported (Keller et al Mol Cell 2012), but unfortunately the data provided here are of an inferior quality and thus their conclusions are neither fully supported nor convincing.

    4. Author response:

      In this manuscript, we have addressed one of the possible modes of recruitment of Swi6 to the putative heterochromatin loci.

      Our investigation was guided by earlier work showing the ability of HP1a to bind to a class of RNAs and the role of this binding in recruitment of HP1a to heterochromatin loci in mouse cells (Muchardt et al). While there has been no clarity about the mechanism of Swi6 recruitment given the multiple pathways involved, the issue is compounded by the overall lack of understanding as to how Swi6 recruitment occurs only at the repeat regions. At the same time, various observations suggested a causal role of RNAi in Swi6 recruitment.

      Thus, guided by the work of Muchardt et al, we developed a heuristic approach to explore a possibly direct link between Swi6 and heterochromatin through the RNAi pathway. Interestingly, we found that the lysine triplet found in the hinge domain of HP1, which influences its recruitment to heterochromatin in mouse cells, is also present in the hinge domain of Swi6, although we were cautious, keeping in mind the findings of Keller et al showing another role of Swi6 in binding to RNAs and channeling them to the exosome pathway.

      Accordingly, we envisaged that a mode of recruitment of Swi6, through binding to siRNAs, to cognate sites in the dg-dh repeats shared among mating type, centromere and telomere loci could explain specific recruitment as well as inheritance following DNA replication. In line with this, we framed the main questions as follows: i) whether Swi6 binds specifically and with high affinity to the siRNAs and the cognate siRNA-DNA hybrids and whether the Swi63K-3A mutant is defective in this binding, ii) whether this lack of binding of Swi63K-3A affects its localization to heterochromatin, iii) whether this specificity is validated by binding of Swi6 but not Swi63K-3A to siRNAs and siRNA-DNA hybrids in vivo and iv) whether the binding mode was qualitatively and quantitatively different from that of Cen100 RNA or random RNAs, like GFP RNA.

      We think that our data provides answers to these lines of inquiry to support a model wherein the Swi6-siRNA mediated recruitment can explain a cis-controlled nucleation of heterochromatin at the cognate sites in the genome. We have also partially addressed the points raised by the study by Keller et al by invoking a dynamic balance between different modes of binding of Swi6 to different classes of RNA to exercise heterochromatin formation by Swi6 under normal conditions and RNA degradation under other conditions.

      While we stand by our hypothesis, we do acknowledge the need for more detailed investigation, both to buttress our hypothesis and to address the dynamics of siRNA binding and recruitment of Swi6 and how Swi6 functions fit in the context of other components of heterochromatin assembly, like the HDACs and Clr4 on one hand and the exosome pathway on the other. Our future studies will attempt to address these issues.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript explores the RNA binding activities of the fission yeast Swi6 (HP1) protein and proposes a new role for Swi6 in RNAi-mediated heterochromatin establishment. The authors claim that Swi6 has a specific and high affinity for short interfering RNAs (siRNAs) and recruits the Clr4 (Suv39h) H3K9 methyltransferases to siRNA-DNA hybrids to initiate heterochromatin formation. These claims are not in any way supported by the incomplete and preliminary RNA binding or the in vivo experiments that the authors present. The proposed model also lacks any mechanistic basis as it remains unclear (and unexplored) how Swi6 might bind to specific small RNA sequences or RNA-DNA hybrids. Work by several other groups in the field has led to a model in which siRNAs produced by the RNAi pathway load onto the Ago1-containing RITS complex, which then binds to nascent transcripts at pericentromeric DNA repeats and recruits Clr4 to initiate heterochromatin formation. Swi6 facilitates this process by promoting the recruitment of the RNA-dependent RNA polymerase leading to siRNA amplification.

      Weaknesses:

      (1) a) The claims that Swi6 binds to specific small RNAs or to RNA-DNA hybrids are not supported by the evidence that the authors present. Their experiments do not rule out non-specific charged-based interactions.

      We disagree. We have used synthetic siRNAs of 20-22 nt length for the EMSA assays, as mentioned in the manuscript. Further, we have sequenced the small RNAs obtained after RIP experiments to validate the enrichment of siRNA in the Swi6-bound fraction as compared to the mutant Swi6-bound fraction. These results are internally consistent regardless of the mode of binding. In any case, the binding occurs primarily through the chromodomain, although it is influenced by the hinge domain (see below).

      Furthermore, we have carried out EMSA experiments using Swi6 mutants carrying all three possible double mutations of the K residues in the KKK triplet and found that there was no difference in the binding pattern as compared to the wt Swi6: only the triple mutant “3K-3A” showed the effect. These results suggest that the binding is not completely dependent on the basic residues. These results will be included in the revised version.

      We also have some preliminary data from a SAXS study showing that the CD of wt Swi6 changes its structure upon binding to the siRNA, while the “3K-3A” mutant of Swi6 has a compact, folded structure that occludes the binding site of Swi6 in the chromodomain. We propose to mention this preliminary finding in the revised version as unpublished data.

      b) Claims about different affinities of Swi6 for RNAs of different sizes are based on a comparison of KD values derived by the authors for a handful of S. pombe siRNAs with previous studies from the Buhler lab on Swi6 RNA binding. The authors need to compare binding affinities under identical conditions in their assays.

      Thus, the EMSA data do suggest sequence specificity in binding of Swi6 to specific siRNA sequences (Figure S5) and imply that specific residues in Swi6 are responsible for it. Identification of the residues in the CD of Swi6 involved in siRNA binding would therefore definitely be interesting, as would experimental confirmation of the consensus siRNA sequence. It may, however, be noted that whereas the binding of Swi6 to siRNAs occurs through the CD, that of Cen100 or GFP RNA was shown by Keller et al to be through the hinge domain.

      The estimation of Kd by the Buhler group was based on an NMR study, which we are not in a position to perform in the near future. Nonetheless, we did carry out an EMSA study using the “Cen100” RNA, the same as the one used in the Keller et al study. Surprisingly, in contrast with the EMSA result in agarose gel showing binding of Swi6 to “Cen100” RNA, as reported by Keller et al, we failed to observe any binding in EMSA done in acrylamide gel (the same is true of RevCen 100). While this raises the question of why Keller et al chose to do EMSA in agarose gel instead of the conventional acrylamide gel, it does lend support to our claim of stronger binding of Swi6 to siRNAs. Another relevant observation is the lack of binding of Swi6 to the “RevCen” precursor RNA but a detectable binding to the siRNAs denoted as VI-IX (as measured by competition experiments; Figures S4 and S7), which are derived by Dcr1 cleavage of the “RevCen” RNA.

      We also disagree that we carried out EMSA with only a handful of siRNAs. As indicated in Figures 1 and S1, we synthesized nearly 12 siRNAs representing the dg-dh repeats at the Cen, mat and tel loci and measured the specificity of their binding to Swi6 by EMSA, directly labeling those designated “D”, “E” and “V” and assessing the remaining ones by their ability to compete against this binding (Figures 1 and S4). These results point to the presence of a consensus sequence in siRNAs that shows highly specific and strong binding to Swi6 in the low micromolar range.

      Further, our claim of binding of Swi6 and not Swi63K>3A to siRNA in vivo is validated by RIP experiments, as shown in Fig 2 and S9.

      c) The regions of Swi6 that bind to siRNAs need to be identified and evidence must be provided that Swi6 binds to RNAs of a specific length, 20-22 mers, to support the claim that Swi6 binds to siRNAs. This is critical for all the subsequent experiments and claims in the study.

      We have provided in vitro data, which are validated in vivo by RIP experiments, as mentioned above. We agree that it would be very interesting to identify the residues in the Swi6 chromodomain responsible for binding to siRNA. However, such an investigation is beyond the scope of the present study.

      (2) a) The in vivo results do not validate Swi6 binding to specific RNAs, as stated by the authors. Swi6 pulldowns have been shown to be enriched for all heterochromatic proteins including the RITS complex. The sRNA binding observed by the authors is therefore likely to be mediated by Ago1/RITS.

      We disagree with the first comment. Our RIP experiments do validate the in vitro results (Fig 1, 2, S4 and S9), as argued above. The observation alluded to by the reviewer “Swi6 pulldowns have been shown to be enriched for all heterochromatic proteins including the RITS complex” is not inconsistent with our observation; it is possible that the siRNA may be released from the RITS complex and transferred to Swi6, possibly due to its higher affinity.

      Thus, we would like to suggest that the role of Swi6 is likely to be coincidental with or subsequent to that of Ago1/RITS (see below). We think that the binding of Swi6 to the siRNA and the siRNA-DNA hybrid could also be carried out in cis at the level of siRNA-DNA hybrids.

      This point needs to be addressed in future studies.

      b) Most of the binding in Figure S8C seems to be non-specific.

      We would like to point out that the result in Figure S8C needs to be examined together with the Figure S8B, which shows RNA bound by Swi6 but not Swi63K-3A to hybridize with dg, dh and dh-k probes.

      c) In Figure S8D, the authors' data shows that Swi6 deletion does not derepress the rev dh transcript while dcr1 delete cells do, which is consistent with previous reports but does not relate to the authors' conclusions.

      The purpose of results shown in Figure S8D is just to compare the results of Swi6 with that of Swi63K-3A.

      d) Previous results have shown that swi6 delete cells have 20-fold fewer dg and dh siRNAs than swi6+ cells due to decreased RNA-dependent RNA polymerase complex recruitment and reduced siRNA amplification.

      This result is consistent with our results invoking a role of Swi6 in binding to, protecting and recruiting siRNAs to homologous sites.

      To find out whether the overall production of siRNA is compromised in the swi6 3K->3A mutant, we i) calculated the RIP-Seq read counts for swi6 3K->3A, swi6+ and vector control in 200 bp genomic bins, ii) divided the swi6 3K->3A and swi6+ signals by that of the control, iii) removed the background using the criterion of signal value < 25% of max signal, and iv) counted the total reads (in excess of control) in all peak regions in both samples. This revealed a total count of 10878 and 8994 for the swi6 3K->3A and swi6+ samples, respectively, possibly implying that the overall siRNA production is not compromised in the swi6 3K->3A mutant.
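      The four steps in this response can be sketched roughly as below. This is an illustrative Python outline, not the authors' pipeline: the function names, the pseudocount used to avoid division by zero, the single-chromosome coordinates and the toy read positions are all assumptions introduced here, and "reads in excess of control" is interpreted as sample count minus control count in each retained bin.

```python
from collections import Counter

BIN_SIZE = 200  # bp, as stated in the response


def bin_counts(read_starts, bin_size=BIN_SIZE):
    """Step i: count reads per genomic bin from 0-based start positions."""
    return Counter(pos // bin_size for pos in read_starts)


def excess_reads_in_peaks(sample, control, pseudocount=1.0, frac_of_max=0.25):
    """Steps ii-iv for one sample against the vector control."""
    # Step ii: sample/control ratio per bin (pseudocount is an assumption).
    ratio = {
        b: (sample[b] + pseudocount) / (control.get(b, 0) + pseudocount)
        for b in sample
    }
    if not ratio:
        return 0
    # Step iii: drop bins whose ratio is below 25% of the maximum ratio.
    cutoff = frac_of_max * max(ratio.values())
    peaks = [b for b, r in ratio.items() if r >= cutoff]
    # Step iv: sum reads above the control level over the retained bins.
    return sum(max(sample[b] - control.get(b, 0), 0) for b in peaks)


# Toy read start positions (hypothetical, single chromosome).
sample_bins = bin_counts([10, 50, 150, 199, 210, 900])
control_bins = bin_counts([20, 220, 230, 240, 250, 910])
print(excess_reads_in_peaks(sample_bins, control_bins))
```

      Totals obtained this way are comparable between samples only if sequencing depth is similar or the counts are normalized first, which this sketch does not do.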

      (3) a) The RIP-seq data are difficult to interpret as presented. The size distribution of bound small RNAs, and where they map along the genome should be shown as for example presented in previous Ago1 sRNA-seq experiments.

      Please see the response to 2(d).

      b) It is also unclear whether the defects in sRNA binding observed by the authors represent direct sRNA binding to Swi6 or co-precipitation of Ago1-bound sRNAs.

      The correspondence between our in vivo and in vitro results suggests that the binding to Swi6 would be direct. We do not observe a complete correspondence between the Swi6- and Ago-bound siRNAs. We think Swi6 binding may be coincident with or following RITS complex formation.

      This point will be discussed in the Revision.

      The authors should also sequence total sRNAs to test whether Swi6-3A affects sRNA synthesis, as is the case in swi6 delete cells.

      Please see response to 2(d) above.

      (4) The authors examine the effects of Swi6-3A mutant by overexpression from the strong nmt1 promoter. Heterochromatin formation is sensitive to the dosage of Swi6. These experiments should be performed by introducing the 3A mutations at the endogenous Swi6 locus and effects on Swi6 protein levels should be tested.

      Although we agree, we think that heterochromatin formation is occurring in the presence of nmt1-driven Swi6 but not Swi63K>3A, as indicated by the phenotype and Swi6 enrichment at otr1R::ade6, imr1::ura4 and his3-telo (Figure 3) and mating type (Fig. S10). Furthermore, both GFP-Swi6 and GFP-Swi63K>3A are expressed at similar levels (Fig. S8A).

      (5) The authors' data indicate an impairment of silencing in Swi6-3A mutant cells but whether this is due to a general lower affinity for nucleosomes, DNA, RNA, or as claimed by the authors, siRNAs is unclear. These experiments are consistent with previous findings suggesting an important role for basic residues in the HP1 hinge region in gene silencing but do not reveal how the hinge region enhances silencing.

      Our study aims to correlate the binding of Swi6 but not Swi63K-3A to siRNA with its localization to heterochromatin. A similar difference in binding of Swi6 but not Swi63K-3A to the siRNA-DNA hybrid, together with the sensitivity of silencing and Swi6 localization to heterochromatin to RNaseH, supports the above correlations as being causally connected.

      In terms of mechanism of binding, we need to clarify that the primary mode of binding is through the CD and not the hinge domain, although the hinge domain does influence this binding. This result is different from those of Keller et al.

      We have some structural data based on a preliminary SAXS experiment supporting binding of siRNA to the CD and the influence of the hinge domain on this binding. However, this line of investigation needs to be extended and will be the subject of future investigations.

      (6) RNase H1 overexpression may affect Swi6 localization and silencing indirectly as it would lead to a general reduction in R loops and RNA-DNA hybrids across the genome. RNaseH1 OE may also release chromatin-bound RNAs that act as scaffolds for siRNA-Ago1/RITS complexes that recruit Clr4 and ultimately Swi6.

      These are formal possibilities. However, the correlation between Swi6 binding to the siRNA-DNA hybrid and its delocalization upon RNase H1 treatment argues for a more direct link.

      (7) Examples of inaccurate presentation of the literature.

      a) The authors state that "RNA binding by the murine HP1 through its hinge domains is required for heterochromatin assembly (Muchardt et al, 2002)". The cited reference provides no evidence that HP1 RNA binding is required for heterochromatin assembly. Only the hinge region of bacterially produced HP1 contributes to its localization to DAPI-stained heterochromatic regions in fixed NIH 3T3 cells.

      Noted. Statement will be corrected.

      b) "... This scenario is consistent with the loss of heterochromatin recruitment of Swi6 as well as siRNA generation in rnai mutants (Volpe et al, 2002)." Volpe et al. did not examine changes in siRNA levels in swi6 mutant cells. In fact, no siRNA analysis of any kind was reported in Volpe et al., 2002.

      Correct. We only say that Swi6 recruitment is reduced in rnai mutants and correlate it with the ability of Swi6 to bind to siRNA generated by RNAi and subsequently to the siRNA-DNA hybrid.

      Reviewer #2 (Public review):

      The aim of this study is to investigate the role of Swi6 binding to RNA in heterochromatin assembly in fission yeast. Using in vitro protein-RNA binding assays (EMSA), the authors show that Swi6/HP1 binds centromere-derived siRNA (identified by Reinhart and Bartel in 2002) via the chromodomain and hinge domains. They demonstrate that this binding is regulated by a lysine triplet in the conserved region of the Swi6 hinge domain and that wild-type Swi6 favours binding to DNA-RNA hybrids and siRNA, which then facilitates, rather than competes with, binding to H3K9me2 and to a lesser extent H3K9me3.

      However, the majority of the experiments are carried out in swi6 null cells overexpressing wild-type Swi6 or Swi63K-3A mutant from a very strong promoter (nmt1). Both swi6 null cells and overexpression of Swi6 are well known to exhibit phenotypes, some of which interfere with heterochromatin assembly. This is not made clear in the text.

      We think that the argument is not valid, as we show that Swi6 but not Swi63K-3A could restore silencing at imr1::ura4, otr1::ade6 and his3-telo (Fig 3) and mating type (Fig. S10) when transformed into a swi6Δ strain.

      Whilst the RNA binding experiments show that Swi6 can indeed bind RNA and that binding is decreased by the Swi63K-3A mutation in vitro (confusingly, the authors explain only much later in the text that these 3 bands represent differential binding and that II is likely an isotherm), the gels showing these data are of poor quality and it is unclear which bands are used to calculate the Kd.

      We disagree with the comment about the quality of EMSA data. We think it is of similar quality or better than that of Keller et al, except in some cases, like Fig 1D, a shorter exposure shown to distinguish the slowest shifted band has caused the remaining bands to look fainter.

      RNA-seq data shows that overall fewer siRNAs are produced from regions of heterochromatin in the Swi63K-3A mutant so it is unsurprising that analysis of siRNA-associated motifs also shows lower enrichment (or indeed that they share some similarities, given that they originate from repeat regions).

      Please see response to comment 2(d) of the first reviewer above.

      It is not clear which bands are being alluded to. However, we'll rectify any gaps in information in the revision.

      The experiments are seemingly linked yet fail to substantiate their overall conclusions. For instance, the authors show that the Swi63K-3A mutant displays reduced siRNA binding in vitro (Figure 1D) and that H3K9me2 levels at heterochromatin loci are reduced in vivo (Figure 3C-D). They conclude that Swi6 siRNA binding is important for Swi6 heterochromatin localization, whilst it remains entirely possible that heterochromatin integrity is impaired by the Swi63K-3A mutation and hence fewer siRNAs are produced and available to bind. Their interpretation of the data is really confusing.

      Our argument is that the lack of binding by Swi63K>3A to siRNA can explain the loss of recruitment to heterochromatin loci and thus affect the integrity of heterochromatin; the recruitment of Swi6 can possibly occur by binding initially to siRNA and thereafter to the siRNA-DNA hybrid. However, the overall level of siRNAs is not affected, as in 2(D) above. This interpretation is supported by the results of the ChIP assay and confocal experiments, as well as by the effect of RNaseH1 on the recruitment of Swi6.

      The authors go on to show that Swi63K-3A cells have impaired silencing at all regions tested and the mutant protein itself has less association with regions of heterochromatin. They perform DNA-RNA hybrid IPs and show that Swi63K-3A cells which also overexpress RNAseH/rnh1 have lower levels of dh DNA-RNA hybrids than wild-type Swi6 cells. They interpret this to mean that Swi6 binds and protects DNA-RNA hybrids, presumably to facilitate binding to H3K9me2. The final piece of data is an EMSA assay showing that "high-affinity binding of Swi6 to a dg-dh specific RNA/DNA hybrid facilitates the binding to Me2-K9-H3 rather than competing against it." The EMSA gel shown is of very poor quality, and this casts doubt on their overall conclusion.

      We do agree with the reviewer about the quality of the EMSA (Fig. 5B). However, as may be noticed in the EMSA for siRNA-DNA hybrid binding (Fig 4A), the bands of Swi6-bound siRNA-DNA hybrid are extremely retarded. Hence the EMSA for subsequent binding by H3-K9-Me peptides required a longer electrophoretic run, which reduced the sharpness of the bands. Nevertheless, the data do indicate binding efficiency in the order H3-K9-Me2 > H3-K9-Me3 > H3-K9-Me0. Having said that, we plan to repeat the EMSA or address the question by other methods, like SPR.

      Unfortunately, the manuscript is generally poorly written and difficult to comprehend. The experimental setups and interpretations of the data are not fully explained, or, are explained in the wrong order leading to a lack of clarity. An example of this is the reasoning behind the use of the cid14 mutant which is not explained until the discussion of Figure 5C, but it is utilised at the outset in Figure 5A.

      We tend to agree somewhat and will attempt to submit a revised version with greater clarity, including an explanation of the experiment with the cid14Δ strain.

      Another example of this lack of clarity/confusion is that the abstract states "Here we provide evidence in support of RNAi-independent recruitment of Swi6". Yet it then states "We show that...Swi6/HP1 displays a hierarchy of increasing binding affinity through its chromodomain to the siRNAs corresponding to specific dg-dh repeats, and even stronger binding to the cognate siRNA-DNA hybrids than to the siRNA precursors or general RNAs." RNAi is required to produce siRNAs, so their message is very unclear. Moreover, an entire section is titled "Heterochromatin recruitment of Swi6-HP1 depends on siRNA generation", so what is the authors' message?

      The reviewer has correctly pointed out the error. Indeed, our results actually indicate an RNAi-dependent rather than RNAi-independent mode of recruitment. Rather, we would like to suggest an H3-K9-Me2-independent recruitment of Swi6. We will rectify this error in our revised manuscript.

      The data presented, whilst sound in some parts, are generally overinterpreted and do not fully support the authors' confusing conclusions. The authors essentially characterise an overexpressed Swi6 mutant protein with a few other experiments on the side that do not entirely support their conclusions. They make the point several times that the KD for their binding experiments is far higher than that previously reported (Keller et al Mol Cell 2012), but unfortunately the data provided here are of an inferior quality and thus their conclusions are neither fully supported nor convincing.

      We have used the method of Heffler et al (2012) to compute the Kd from EMSA data.

    1. eLife Assessment

      This manuscript provides a valuable in-depth biochemical analysis of p53 isoforms, highlighting their aggregation propensity, interaction with chaperones, and potential dominant-negative effects on p53 family members. The study presents solid evidence of isoform-specific properties, which may contribute to protein misfolding and impaired cellular function in cancer. While highly informative, the findings would benefit from further discussion of physiological relevance, given the high isoform expression levels used, and addressing prior evidence of isoform-specific transcriptional activity. Overall, this work significantly advances our understanding of p53 isoform biochemistry and its implications for cancer research.

    2. Reviewer #1 (Public review):

      Summary:

      Brdar, Osterburg, Munick, et al. present an interesting cellular and biochemical investigation of different p53 isoforms. The authors investigate the impact of different isoforms on the in-vivo transcriptional activity, protein stability, induction of the stress response, and hetero-oligomerization with WT p53. The results are logically presented and clearly explained. Indeed, the large volume of data on different p53 isoforms will provide a rich resource for researchers in the field to begin to understand the biochemical effects of different truncations or sequence alterations.

      Strengths:

      The authors achieved their aims to better understand the impact/activity of different p53 isoforms, and their data support their statements. Indeed, the major strengths of the paper lie in its comprehensive characterization of different p53 isoforms and the different assays that are measured. Notably, this includes p53 transcriptional activity, protein degradation, induction of the chaperone machinery, and hetero-oligomerization with wtp53. This will provide a valuable dataset where p53 researchers can evaluate the biological impact of different isoforms in different cell lines. The authors went to great lengths to control and test for the effect of (1) p53 expression level, (2) promoter type, and (3) cell type. I applaud their careful experiments in this regard.

      Weaknesses:

      One thing that I would have liked to see more of is the quantification of the various pull-down/gel assays - to better quantify the effect of, e.g., hetero-oligomerization among the various isoforms. In addition, a discussion about the role of isoforms that contain truncations in the IDRs is not available. It is well known that these regions function in an auto-inhibitory manner (e.g. work by Wright/Dyson) and also mediate many PPIs, which likely have functional roles in vivo (e.g. recruiting p53 to various complexes). The discussion could be strengthened by focusing on some of these aspects of p53 as well.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript entitled "p53 isoforms have a high aggregation propensity, interact with chaperones and lack binding to p53 interaction partners", the authors suggest that the p53 isoforms have high aggregation propensity and that they can co-aggregate with canonical p53 (FLp53), p63 and p73, thus exerting a dominant-negative effect.

      Strengths:

      Overall, the paper is interesting as it provides some characterization of the DNA binding (when expressed alone), folding, and chaperone interactions of most p53 isoforms. The data presented support their conclusion and bring interesting mechanistic insight into how p53 isoforms may exert some of their activity or how they may be regulated when they are expressed in excess.

      Weaknesses:

      The main limitation of this manuscript is that the isoforms are highly over-expressed throughout the manuscript, although the authors acknowledge that the level of expression is a major factor in the aggregation phenomenon and "that aggregation will only become a problem if the expression level surpasses a certain threshold level" (lines 273-274 and results shown in Figures S3D, 6E). The p53 isoforms are physiologically expressed in most normal human cell types at relatively low levels which makes me wonder about the physiological relevance of this phenomenon.

      Furthermore, it was previously reported that some isoforms clearly induce transcription of target genes, which is not observed here. For example, p53β induces p21 expression (Fujita K. et al. p53 isoforms Delta133p53 and p53beta are endogenous regulators of replicative cellular senescence. Nat Cell Biol. 2009 Sep;11(9):1135-42), and Δ133p53α induces RAD51, RAD52, LIG4, SESN1 and SOD1 expression (Gong, L. et al. p53 isoform D113p53/D133p53 promotes DNA double-strand break repair to protect cell from death and senescence in response to DNA damage. Cell Res. 2015, 25, 351-369. / Gong, L. et al. p53 isoform D133p53 promotes the efficiency of induced pluripotent stem cells and ensures genomic integrity during reprogramming. Sci. Rep. 2016, 6, 37281. / Horikawa, I. et al. D133p53 represses p53-inducible senescence genes and enhances the generation of human induced pluripotent stem cells. Cell Death Differ. 2017, 24, 1017-1028. / Gong, L. p53 coordinates with D133p53 isoform to promote cell survival under low-level oxidative stress. J. Mol. Cell Biol. 2016, 8, 88-90. / Joruiz et al. Distinct functions of wild-type and R273H mutant Δ133p53α differentially regulate glioblastoma aggressiveness and therapy-induced senescence. Cell Death Dis. 2024 Jun 27;15(6):454.), which demonstrates that some isoforms can induce target gene transcription and have defined normal functions (e.g. cellular senescence or DNA repair).

      However, in this manuscript, the authors conclude that isoforms are "largely unfolded and not capable of fulfilling a normal cellular function" (line 438), that they do not have "well defined physiological roles" (line 456), and that they only "have the potential to inactivate members of the p53 protein family by forming inactive hetero complexes with wtp53" (line 457-458).

      Therefore, I think it is essential that the authors better discuss this major discrepancy between their study and previously published research.

    1. eLife Assessment

      This study examines age-related, sex-specific gene expression and alternative splicing in humans using the GTEx dataset. Solid evidence is provided to demonstrate that alternative splicing was affected by both sex and age across many tissues in this dataset. Although the authors performed comprehensive computational analyses and report useful transcriptomic changes with sex and age, they did not validate their findings with independent longitudinal datasets. This limits the wider impact of the study, although it can serve as a starting point to examine sex- and age-related differences in the transcriptome due to alternative splicing.

    2. Reviewer #1 (Public review):

      Summary:

      Wang et al. investigate sexual dimorphic changes in the transcriptome of aged humans. This study relies upon analysis of the Genotype-Tissue Expression dataset that includes 54 tissues from human donors. The authors investigate 17,000 transcriptomes from 35 tissues to investigate the effect of age and sex on transcriptomic variation, including the analysis of alternative splicing. Alternative splicing is becoming more appreciated as an influence in the aging process, but how it is affected by sexual dimorphism is still largely unclear. The authors investigated multiple tissues but ended up distilling brain tissue down to four separate regions: decision, hormone, memory, and movement. Building upon prior work, the authors used an analysis method called principal component-based signal-to-variation ratio (pcSVR) to quantify differences between sex or age by considering data dispersion. This method also considers differentially expressed genes and alternative splicing events.

      Strengths:

      (1) The authors investigate sexual dimorphism on gene expression and alternative splicing events with age in multiple tissues from a large publicly available data set that allows for reanalysis.

      (2) Furthermore, the authors take into account the ethnic background of donors. Identification of aging-modulating genes could be useful for the reanalysis of prior data sets.

      Weaknesses:

      The models built from the GTEx dataset should be tested in another dataset (e.g. Alzheimer's disease) where there are functional changes that can be correlated. Gene-length-dependent transcription decline, which occurs with age and disease, should also be investigated in this dataset for potential sexual dimorphism.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Wang et al analyze ~17,000 transcriptomes from 35 human tissues from the GTEx database and address transcriptomic variations due to age and sex. They identified both gene expression changes as well as alternative splicing events that differ among sexes. Using breakpoint analysis, the authors find sex dimorphic shifts begin with declining sex hormone levels with males being affected more than females. This is an important pan-tissue transcriptomic study exploring age and sex-dependent changes although not the first one.

      Strengths:

      (1) The authors use sophisticated modeling and statistics for differential, correlational, and predictive analysis.

      (2) The authors consider important variables such as genetic background, ethnicity, sampling bias, sample sizes, detected genes, etc.

      (3) This is likely the first study to evaluate alternative splicing changes with age and sex at a pan-tissue scale.

      (4) Sex dimorphism with age is an important topic and is thoroughly analyzed in this study.

      Weaknesses:

      (1) The findings have not been independently validated in a separate cohort or through experiments. Only selective splicing factor regulation has been verified in other studies.

      (2) It seems the authors have not considered PMI or manner of death as a variable in their analysis.

      (3) The manuscript is very dense and sometimes difficult to follow due to many different types of analyses and correlations.

      (4) Short-read data can detect and quantify alternative splicing events with only moderate confidence and therefore the generalizability of these findings remains to be experimentally validated.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, Wang et al utilized the available GTEx data to compile a comprehensive analysis that attempts to reveal aging-related sex-dimorphic gene expression as well as alternative splicing changes in humans.

      The key conclusions based on their analysis are that

      (1) there are extensive sex dimorphisms during aging, with distinct patterns of change in gene expression and alternative splicing (AS), and

      (2) the male-biased age-associated AS events have a stronger association with Alzheimer's disease, and

      (3) the female-biased events are often regulated by several sex-biased splicing factors that may be controlled by estrogen receptors. They further performed break-point analysis and revealed that in males there are two main breakpoints around ages 35 and 50, while in females, there is only one breakpoint at 45.

      Strengths:

      This study sets an ambitious goal, leveraging the extensive GTEx dataset to investigate aging-related, sex-dimorphic gene expression and alternative splicing changes in humans. The research addresses a significant question, as our understanding of sex-dimorphic gene expression in the context of human aging is still in its early stages. Advancing our knowledge of these molecular changes is vital for identifying therapeutic targets for age-related diseases and extending the human health span. The study is highly comprehensive, and the authors are to be commended for attempting a thorough analysis of both gene expression and alternative splicing - an area often overlooked in similar studies.

      Weaknesses:

      Due to the inherent noise within the GTEx dataset - which includes numerous variables beyond aging and sex - there are significant technical concerns surrounding this study. Additionally, the lack of cross-validation with independent, existing data raises questions about whether the observed gene expression changes genuinely reflect those associated with human aging. For instance, the break-point analysis in this study identifies two major breakpoints in males around ages 35 and 50, and one breakpoint in females at age 45; however, these findings contradict a recent multi-omics longitudinal study involving 108 participants aged 25 to 75 years, where breakpoints at 44 and 60 years were observed in both males and females (Shen et al, 2024). These issues cast doubt on the robustness of the study's conclusions. Specific concerns are outlined below:

      (1) The primary method used in this study is linear regression, incorporating age, sex, and age-by-sex interactions as covariates, alongside other confounding factors (such as ethnicity) as unknown variables. However, the analysis overlooks two critical known variables in the GTEx dataset: time of death (TOD) and postmortem interval (PMI). Both TOD and PMI are recorded for each sample and account for substantial variance in gene expression profiles. A recent study by Wucher et al. (Wucher et al, 2023) demonstrated the powerful impact of TOD on gene expression by using it to reconstruct human circadian and even circannual datasets. Similarly, Ferreira et al. (Ferreira et al, 2018) highlighted PMI's influence on gene expression patterns. Without properly adjusting for these two variables, confidence in the study's conclusions remains limited at best.

      (2) To demonstrate that their analysis is robust and that the covariates TOD and PMI are otherwise negligible, the authors should cross-validate their findings with independent datasets to confirm that the identified gene expression changes are reproducible for some tissues. For instance, the recent study by Shen et al. (Shen et al., 2024) in Nature Aging offers an excellent dataset for cross-validation, particularly for blood samples. Comparing the GTEx-derived results with this longitudinal transcriptome dataset would enable verification of gene expression changes at both the individual gene and pathway levels. Without such validation, confidence in the study's conclusions remains limited.

      (3) As a demonstration of the lack of such validation, in the Shen et al. study (Shen et al., 2024), breakpoints at 44 and 60 years were observed in both males and females, while this study identifies two major breakpoints in males around ages 35 and 50, and one breakpoint in females at age 45. What caused this discrepancy?

      (4) Although the alternative splicing analysis is intriguing, the authors did not differentiate between splicing events that alter the protein-coding sequence and those that do not. Many splicing changes occurring in the 5' UTR and 3' UTR regions do not impact protein coding, so it is essential to filter these out and focus specifically on alternative splicing events that can modify protein-coding sequences.

      (5) One of the study's main conclusions - that "male-biased age-associated AS events have a stronger association with Alzheimer's disease" - is not supported by the data presented in Figure 4A, which shows an association with "regulation of amyloid precursor formation" only in female, not male, alternative splicing genes. Additionally, the gene ontology term "Alzheimer's disease" is absent from the unbiased GO analysis in Figure S6. These discrepancies suggest that the focus on Alzheimer's disease may reflect selective data interpretation rather than results driven by an unbiased analysis.

      (6) The experimental data presented in Figures 5E - I merely demonstrate that estrogen receptor regulates the expression of two splicing factors, SRSF1 and SRSF7, in an estradiol-dependent manner. However, this finding does not support the notion that this regulation actually contributes to sex-dimorphic alternative splicing changes during human aging. Notably, the authors do not provide evidence that SRSF1 and SRSF7 expression changes actually occur in a sex-dependent manner with human aging (in a manner similar to TIA1). As such, this experimental dataset is disconnected from the main focus of the study and does not substantiate the conclusions on sex-dimorphic splicing during human aging. The authors performed RNA-seq in wild-type and ER mutant cells, and they should perform a comprehensive analysis of ER-dependent alternative splicing and compare the results with the GTEx data. It should be straightforward.
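
      To make concern (1) above concrete, the following is a minimal sketch of a per-gene linear model that includes age, sex, an age-by-sex interaction, and PMI as an explicit covariate. It is not the authors' pipeline: the function names (ols, design_row), the coefficient values, and the synthetic donors are all hypothetical and chosen only for illustration, and a TOD term could be added to the design row in the same way.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    p = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for k in range(p):
            if k != c:
                f = A[k][c]
                A[k] = [vk - f * vc for vk, vc in zip(A[k], A[c])]
    return [A[i][p] for i in range(p)]

def design_row(age, male, pmi_hours):
    # Columns: intercept, age, sex, age-by-sex interaction, PMI covariate.
    return [1.0, age, male, age * male, pmi_hours]

# Synthetic donors (hypothetical values, NOT GTEx data).
rng = random.Random(0)
true_beta = [5.0, 0.02, 0.5, -0.03, 0.1]
X, y = [], []
for _ in range(500):
    row = design_row(rng.uniform(20, 70),
                     float(rng.random() < 0.5),
                     rng.uniform(1, 24))
    X.append(row)
    # Simulated expression = linear predictor + Gaussian noise.
    y.append(sum(b * v for b, v in zip(true_beta, row)) + rng.gauss(0, 0.1))

beta_hat = ols(X, y)
names = ["intercept", "age", "sex", "age:sex", "pmi"]
for name, est in zip(names, beta_hat):
    print(f"{name:>9s}: {est:.4f}")
```

      In this toy setting the age:sex coefficient is the sex-dimorphic aging effect of interest; fitting the model with and without the PMI column is one simple way to check how sensitive that estimate is to the covariate.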

      References:

      Ferreira PG, Muñoz-Aguirre M, Reverter F, Sá Godinho CP, Sousa A, Amadoz A, Sodaei R, Hidalgo MR, Pervouchine D, Carbonell-Caballero J et al (2018) The effects of death and post-mortem cold ischemia on human tissue transcriptomes. Nature Communications 9: 490.

      Shen X, Wang C, Zhou X, Zhou W, Hornburg D, Wu S, Snyder MP (2024) Nonlinear dynamics of multi-omics profiles during human aging. Nature Aging.

      Wucher V, Sodaei R, Amador R, Irimia M, Guigó R (2023) Day-night and seasonal variation of human gene expression across tissues. PLOS Biology 21: e3001986.

    1. eLife Assessment

      This useful study informs the transcriptional mechanisms that promote stem cell differentiation and prevent degeneration in the adult eye. Through inducible mouse mutagenesis, the authors uncover a dual role for a transcription factor (Sox9) in stem cell differentiation and prevention of retinal degeneration. The data at hand provide solid support to the main conclusions with several minor weaknesses identified as well. The study will be of general interest to the fields of neuronal development and neurodegeneration.

    2. Reviewer #1 (Public review):

      Summary:

      Hurtado et al. show that Sox9 is essential for retinal integrity, and its null mutation causes the loss of the outer nuclear layer (ONL). The authors then show that this absence of the ONL is due to apoptosis of photoreceptors and a reduction in the numbers of other retinal cell types such as ganglion cells, amacrine cells, and horizontal cells. They also describe that Müller glia undergo reactive gliosis by upregulating the Glial Fibrillary Acidic Protein. The authors then show that Sox9+ progenitors proliferate and differentiate to generate the corneal cells through Sox9 lineage-tracing experiments. They validate Sox9 expression and characterize its dynamics in limbal stem cells using an existing single-cell RNA sequencing dataset. Finally, the authors argue that Sox9 deletion causes progenitor cells to lose their clonogenic capacity by comparing the sizes of control and Sox9-null clones. Overall, Hurtado et al. underline the importance of Sox9 function in retinal and corneal cells.

      Strengths:

      The authors have characterized a myriad of striking phenotypes due to Sox9 deletion in the retina and limbal stem cells which will serve as a basis for future studies.

      Weaknesses:

      Hurtado et al. investigate the importance of Sox9 in the retina and limbal stem cells. However, the overall experimental narrative appears dispersed.

      The authors begin by characterizing the phenotype of Sox9 deletion in the retina and show that the absence of the ONL is due to photoreceptor apoptosis and a reduction in other retinal cell types. The authors also note that Müller glia undergo gliosis in the Sox9 deletion condition. These striking observations are never investigated further, and instead, the authors switch to lineage-tracing experiments in the limbus that seem disconnected from the first three figures of the paper. Another example of this disconnect is the comparison of Sox9 high and Sox9 low populations using an existing scRNA-seq dataset and the subsequent GO term analysis, which does not directly tie in with the lineage-tracing data of the succeeding Sox9∆/∆ experiments.

      A major concern is that a single Sox9∆/∆ limbal clone has a sufficiently large size, comparable to wild-type clones, as seen in Figure 6D. This singular result is contrary to their conclusion, which states that Sox9-deficient stem cells minimally contribute to the maintenance of the cornea.

    3. Reviewer #2 (Public review):

      Summary:

      Sox9 is a transcription factor crucial for development and tissue homeostasis, and its expression continues in various adult eye cell types, including retinal pigmented epithelium cells, Müller glial cells, and limbal and corneal basal epithelia. To investigate its functional roles in the adult eye, this study employed inducible mouse mutagenesis. Adult-specific Sox9 depletion led to severe retinal degeneration, including the loss of Müller glial cells and photoreceptors. Further, lineage tracing revealed that Sox9 is expressed in a basal limbal stem cell population that supports stem cell maintenance and homeostasis. Mosaic analysis confirmed that Sox9 is essential for the differentiation of limbal stem cells. Overall, the study highlights that Sox9 is critical for both retinal integrity and the differentiation of limbal stem cells in the adult mouse eye.

      Strengths:

      In general, inducible genetic approaches in the adult mouse nervous system are rare and difficult to carry out. Here, the authors employ tamoxifen-inducible mouse mutagenesis to uncover the functional roles of Sox9 in the adult mouse eye.

      Careful analysis suggests that two degeneration phenotypes (mild and severe) are detected in the adult mouse eye upon tamoxifen-dependent Sox9 depletion. Phenotype severity nicely correlates with the efficiency of Cre-mediated Sox9 depletion.

      Molecular marker analysis provides strong evidence of Mueller cell loss and photoreceptor degeneration.

      A clever genetic tracing strategy uncovers a critical role for Sox9 in limbal stem cell differentiation.

      Weaknesses:

      The Introduction can be improved by explaining clearly what was previously known about Sox9 in the eye. A lot of this info is mentioned in a single, 3-page long paragraph in the Discussion. However, the current study's significance and novelty would become clearer if the authors articulated in more detail in the Introduction what was already known about Sox9 in retina cell types (in vitro and in vivo).

      Because a ubiquitous tamoxifen-inducible CreER line is employed, non-cell autonomous mechanisms possibly contribute to the observed retina degeneration. There is precedent for this in the literature. For example, RPE-specific ablation of Otx2 results in photoreceptor degeneration (PMID: 23761884). Have the authors considered the possibility of non-cell autonomous effects upon ubiquitous Sox9 deletion?

      Given the similar phenotypes between animals lacking Otx2 and Sox9 in specific cell types of the eye, the authors are encouraged to evaluate Otx2 expression in the tamoxifen-induced Sox9 adult retina.

      The most parsimonious explanation for the dual role of Sox9 in retinal cell types and limbal stem cells is that the cell context is different. For example, Sox9 may cooperate with TF1 in photoreceptors, TF2 in Mueller cells, and TF3 in limbal stem cells, and such cell type-specific cooperation may result in different outcomes (retinal integrity, stem cell differentiation). The authors are encouraged to add a paragraph to the discussion and share their thoughts on the dual role of Sox9.

      One more molecular marker for Mueller glial cells would strengthen the conclusion that these cells are lost upon Sox9 deletion.

      Using opsins as markers, the authors conclude that the photoreceptors are lost upon Sox9 deletion. However, an alternate possibility is that the photoreceptors are still present and that Sox9 is required for the transcription of opsin genes. In that case, Sox9 (like Otx2) may act as a terminal selector in photoreceptor cells. This point is particularly important because vertebrate terminal selectors (e.g., Nurr1, Otx2, Brn3a) initially affect neuron type identity and eventually lead to cell loss.

      Quantification is needed for the TUNEL and GFAP analysis in Figure 3.

      Lines 269-320: The authors examined available scRNA-Seq data on adult retina. These data provide evidence for Sox9 expression in distinct cell types. However, the dataset does not inform about the functional role of Sox9 because Sox9 mutant cells were not analyzed with RNA-Seq. Hence, all claims that this experiment provides insights into possible Sox9 functional roles must be removed. This includes panels F, G, and H in Figure 5. In general, this section of the paper (Lines 269-320) needs a major revision. Similarly, lines 442-446 in the Discussion should be removed.

    1. eLife Assessment

      This study presents numerical results on a framework for understanding the dynamics of subthreshold waves in a network of electrical synapses modeled on the connectome data of the C. elegans nematode. The strength of the evidence presented in favor of interference effects being a major component in subthreshold wave dynamics is inadequate and the approach is flawed. Substantial methodological issues are present, including altering the original network structure of the connectome without a clear justification and providing little motivation for the choice of numerical parameter values that were used.

    2. Reviewer #1 (Public review):

      Summary:

      This work investigates numerically the propagation of subthreshold waves in a model neural network that is derived from the C. elegans connectome. Using a scattering formalism and tight-binding description of the network -- approximations which are commonplace in condensed matter physics -- this work attempts to show the relevance of interference phenomena, such as wavenumber-dependent propagation, for the dynamics of subthreshold waves propagating in a network of electrical synapses.

      Strengths:

      The primary strength of the work is in trying to use theoretical tools from a far-away corner of fundamental physics to shed light on the properties of a real neural system.

      Weaknesses:

      The authors provide a good introduction and motivation for studying the propagation of subthreshold oscillations in the inferior olive nuclei. However, they chose to use the C. elegans connectome for their study, and the implications of this work for C. elegans neuroscience remain unclear by the end of the preprint. The authors should also give more evidence for the claim that their study may give a mechanism for synchronized rhythmic activity in the mammalian inferior olive nucleus, or refrain from making this conclusion. In the same vein, since the work emphasizes the dependence on the wavenumber for the propagation of subthreshold oscillations, they should make an attempt at estimating the wavenumber of subthreshold oscillations in C. elegans if they were to exist and be observed. Next, the presence of two "mobility edges" in the transmission coefficient calculated in this work is unmistakably due to the discrete nature of the system, coming from the tight-binding approximation, and it is unclear to me if this approximation is justified in the current system. Similarly, it is possible that the wavenumber-dependent transmission observed depends strongly on the addition of a large number of virtual nodes (VNs) in the network, which the authors give little to no motivation for. As these nodes are not present in the C. elegans connectome, the authors should explain the motivation for their inclusion in the model and should discuss their consequences on the transmission properties of the network. As it stands, I think the work would only have a very limited impact on the understanding of subthreshold oscillations in the rat or in C. elegans. Indeed, the preprint falls short of relating its numerical results to any phenomena which could be observed in the lab.

    3. Reviewer #2 (Public review):

      This manuscript addresses an interesting and important question: the basic mechanisms underlying subthreshold intrinsic oscillations in the inferior olive. Instead of investigating this question directly, the authors decide to study subthreshold oscillations in C. elegans, whose connectivity pattern is known but which does not exhibit subthreshold oscillations. Furthermore, instead of the common description of gap-junction coupling by resistors, the authors decide to represent the system as a tight-binding Anderson Hamiltonian.

      Weaknesses:

      The authors study the architecture of C. elegans instead of that of the mammalian inferior olive because the architecture of C. elegans is known.

      No subthreshold oscillations have been identified in C. elegans.

      Instead of representing electrical coupling via resistors that connect neurons, the authors use a quantum formalism and introduce the tight-binding Anderson Hamiltonian. Why?

      Two equally spaced virtual nodes were added between cells connected by a gap junction. Why?

      Comments on revised version:

      Last time, I recommended that the authors should represent electrical coupling via resistors that connect neurons instead of via the quantum formalism. The authors have not tested this direction.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      (1) This work investigates numerically the propagation of subthreshold waves in a model neural network that is derived from the C. elegans connectome. Using a scattering formalism and tight-binding description of the network -- approximations which are commonplace in condensed matter physics -- this work attempts to show the relevance of interference phenomena, such as wavenumber-dependent propagation, for the dynamics of subthreshold waves propagating in a network of electrical synapses.

      (2) The primary strength of the work is in trying to use theoretical tools from a far-away corner of fundamental physics to shed light on the properties of a real neural system. While a system composed of neurons and synapses is classical in nature, there are occasions in which interference or localization effects are useful for understanding wave propagation in complex media [review, van Rossum & Nieuwenhuizen, 1999]. However, it is expected that localization effects only have an impact in some parameter regimes and with low phase dissipation. The authors should have addressed the existence of this validity regime in detail prior to assuming that interference effects are important.

      The theoretical concept and tool used in this study are not situated in a far-away corner of fundamental physics but hold one of the central positions in condensed matter physics and statistical physics. In fact, the non-scientific statement about where the theoretical concept and tool employed by the researchers are positioned within the realm of fundamental physics is irrelevant. Fundamental physics governs the foundations of all natural phenomena, and thus it provides indispensable principles for interpreting not only neural systems but also all life phenomena. One such principle explored in our study is the interference and localization of waves.

      Specifically, in the third paragraph of the Introduction, we noted that the interference effect of subthreshold oscillating waves, beyond being a theoretical possibility, is a phenomenon actually observed in neural tissue (Chiang and Durand, 2023; Gupta et al., 2016). Moreover, according to Devor and Yarom (2002), the propagation of subthreshold oscillations observed in the inferior olivary nucleus extended beyond a distance of 0.2 mm. Therefore, considering the propagation of subthreshold waves and the resulting interference in the connectome of C. elegans, which has a total body length of less than 1 mm, a diameter of about 0.08 mm, and most neurons distributed in the ring structure near its neck, provides sufficient validity for the initiation of theoretical and computational studies.

      The primary objective of our study is to investigate which regimes of signal transmission/localization and interference phenomena are valid within the network of electrical synapses in C. elegans, the only system for which the neural connectome structure is perfectly known. As the Reviewer rightly pointed out in the question, this is exactly the issue that the Reviewer is curious about. Therefore, the existence of this validity regime cannot be addressed prior to conducting the study but can only be identified as a result of performing the research. And we have conducted such a study.

      (3) An additional approximation that was made without adequate justification is the use of a tight-binding Hamiltonian. This can be a reasonable approximation, even for classical waves, in particular in the presence of high-quality-factor resonators, where most of the wave amplitude is concentrated on the nodes of the network, and nodes are coupled evanescently with each other. Neither of these conditions was verified for this study.

      The tight-binding Anderson Hamiltonian we used in this study originally consisted of the on-site energy at each node and the hopping matrix between nodes. When the on-site energy is relatively much more stable (i.e., has a large negative value) compared to the hopping matrix, most of the wave amplitude becomes concentrated on the nodes as the Reviewer mentioned. However, as is well-known from reference papers (Anderson, 1958; Chang et al., 1995; Meir et al., 1989; Shapir et al., 1982; Thomas and Nakanishi, 2016), in this study, we also removed the on-site energy to prevent the waves from being concentrated on the nodes. Therefore, the tight-binding Hamiltonian we used in this study ensures that waves propagate through edges in the network where the values of the hopping matrix exist.

      To assist the Reviewer in better understanding the model used in this study, we provide additional explanations as follows. In the manuscript, we have already provided detailed descriptions of the setup using the tight-binding Anderson Hamiltonian in the Method section under “Construction of our circuit model” and the explanation of Figure 1. In the model we used, the edges represented by solid lines are perfect conductors, while the dotted lines representing gap junctions act as potential barriers (Fig. 1B). Therefore, when electric signals propagate, we are dealing with the phenomenon where signals transmitted through the edges encounter potential barriers, causing scattering or attenuation. The model described by the Reviewer is indeed a commonly used model in condensed matter physics, but we did not use the exact model mentioned by the Reviewer. Instead, as is common in well-known reference papers, we modified it to suit our purposes. We hope this explanation helps the Reviewer gain a better understanding.
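      For reference, the generic form of the tight-binding Anderson Hamiltonian discussed in this exchange can be written as below. This is the standard textbook expression, not an equation taken from the manuscript; the symbols $\varepsilon_i$ (on-site energy at node $i$) and $t_{ij}$ (hopping amplitude between connected nodes $i$ and $j$) are notation assumed here for illustration. The second line shows the case described in the response, in which the on-site energy is removed.

```latex
% Generic tight-binding (Anderson) Hamiltonian on a network; notation assumed.
\begin{align}
  H &= \sum_{i} \varepsilon_i \,|i\rangle\langle i|
     + \sum_{\langle i,j \rangle} t_{ij}\,\bigl(|i\rangle\langle j| + |j\rangle\langle i|\bigr), \\
  % With the on-site term removed (\varepsilon_i = 0 for every node):
  H &= \sum_{\langle i,j \rangle} t_{ij}\,\bigl(|i\rangle\langle j| + |j\rangle\langle i|\bigr).
\end{align}
```

      In the second form, only the edges with nonzero $t_{ij}$ contribute, which is consistent with the statement that waves propagate through the edges where hopping-matrix values exist.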

      (4) The motivation for this work is to understand the basic mechanisms underlying subthreshold intrinsic oscillations in the inferior olive, but detailed connectivity patterns in this brain area are not available. The connectome is known for C. elegans, but subthreshold oscillations have not been observed there, and the implications of this work for C. elegans neuroscience remain unclear. The authors should also give more evidence for the claim that their study may give a mechanism for synchronized rhythmic activity in the mammalian inferior olive nucleus, or refrain from making this conclusion.

      We agree with the Reviewer's point. In this study, we do not provide additional analysis on the mammalian inferior olive nucleus beyond what is already known from previous research. What we intended to discuss in the Discussion section was to suggest that within our model, there is a “possibility” that a group of cells exchanging wave signals of a specific wavenumber with high transmittance may show synchronized rhythmic activity. Therefore, to avoid any misunderstanding for the reader, we have revised the corresponding sentence in the Discussion as follows.

      In the Discussion, “The plausible possibility according to our model study is that the constructive interference of subthreshold membrane potential waves with a specific wavenumber may generate the synchronized rhythmic activation.”

      (5) In the same vein, since the work emphasizes the dependence on the wavenumber for the propagation of subthreshold oscillations, they should make an attempt at estimating the wavenumber of subthreshold oscillations in C. elegans if they were to exist and be observed. Next, the presence of two "mobility edges" in the transmission coefficient calculated in this work is unmistakably due to the discrete nature of the system, coming from the tight-binding approximation, and it is unclear if this approximation is justified in the current system.

      In this study, we modeled the propagation of subthreshold waves on the electrical synapse network of C. elegans, but we did not explain the generation of subthreshold oscillations themselves. Here, we simply injected wave signals with various wavenumber values into the network using a hypothetical device called an "Injector." As the Reviewer pointed out, estimating the wavenumbers of subthreshold oscillations that may exist or be observed in C. elegans would require a comprehensive investigation of the membrane potential dynamics occurring in the membranes of individual neurons. However, this is beyond the scope of this study and would require considerable effort to accomplish.

      As for the use of the tight-binding Hamiltonian, we have addressed that in our response to the third paragraph in the Joint Public Review above.

      (6) Similarly, it is possible that the wavenumber-dependent transmission observed depends strongly on the addition of a large number of virtual nodes (VNs) in the network, which the authors give little to no motivation for. As these nodes are not present in the C. elegans connectome, the authors should explain the motivation for their inclusion in the model and should discuss their consequences on the transmission properties of the network.

      As mentioned in our response to the third paragraph in the Joint Public Review above, in our model, a node is simply a pathway for waves to pass through. Therefore, inserting virtual nodes between two neurons that are connected in the C. elegans connectome does not alter the actual connection structure. In other words, virtual nodes do not create new connections between cells that did not exist in the connectome. The virtual nodes we introduced are merely a way to divide the sections (axon, gap junction, dendrite) through which the wave passes when it is transmitted between two neurons. As we have already explained in Fig. 1B, the edge between the two virtual nodes, represented by a dotted line, is intended to depict the gap junction acting as a potential barrier. We hope this explanation helps the Reviewer better understand the model used in this study.
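      The virtual-node construction described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the edge labels "conductor" and "barrier", and the neuron names N1 to N3 are all hypothetical, chosen only to show that splitting each neuron-to-neuron link into three segments leaves the neuron-level connectivity unchanged.

```python
def insert_virtual_nodes(gap_junctions):
    """Split each neuron-neuron link into three segments using two virtual nodes.

    Hypothetical labels: "conductor" for the outer segments and
    "barrier" for the middle segment that stands in for the gap junction.
    """
    edges = []
    for k, (pre, post) in enumerate(gap_junctions):
        vn_a, vn_b = f"VN{k}a", f"VN{k}b"  # two virtual nodes per link
        edges += [
            (pre, vn_a, "conductor"),
            (vn_a, vn_b, "barrier"),
            (vn_b, post, "conductor"),
        ]
    return edges


def neuron_pairs(edges):
    """Collapse the virtual nodes again to recover the neuron-level links."""
    pairs = set()
    for i in range(0, len(edges), 3):  # each link contributes exactly 3 edges
        pairs.add((edges[i][0], edges[i + 2][1]))
    return pairs


# Illustrative input only; N1..N3 are placeholder neuron names.
junctions = [("N1", "N2"), ("N2", "N3")]
expanded = insert_virtual_nodes(junctions)
print(neuron_pairs(expanded) == set(junctions))
```

      The check at the end compares the neuron pairs recovered from the expanded graph with the original list, which is the sense in which virtual nodes add no new cell-to-cell connections. Whether the added segments change the transmission properties is a separate question that this sketch does not address.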

      (7) As it stands, the work would only have a very limited impact on the understanding of subthreshold oscillations in the rat or in C. elegans. Indeed, the preprint falls short of relating its numerical results to any phenomena which could be observed in the lab.

      In this study, we proposed a minimalistic model built using the currently available but limited C. elegans connectome information. Specifically, our model is not a phenomenological one that adjusts parameters to accurately predict experimental measurements, but rather an attempt at a novel conceptual approach to theoretically possible scenarios. While the model may not be satisfactory enough to explain experimental phenomena at present, it is a theoretical/computational study that someone needs to undertake. We believe this is the path of scientific progress. Therefore, as the Reviewer has expressed concern, it is entirely understandable that reproducing the numerical results measured in actual experiments is difficult in this study. Nevertheless, we believe that this study makes a basic contribution to the conceptual understanding of subthreshold signal propagation in C. elegans’ electric synapses.

      Rather than offering a stretched opinion, we maintain a positive hope that future researchers in this field will improve the model by incorporating more detailed and extensive biological data through follow-up studies, allowing us to get closer to describing real phenomena.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The word "Sensory" was misspelled in Figures 2, 4 and 5.

      We appreciate the feedback from Reviewer #1. We have corrected the mentioned typos in Figures 2, 4, and 5 of the revised manuscript.

    1. eLife Assessment

      This important study uses calcium imaging to show an increase in the selectivity of the sensory-evoked response in the apical dendritic tuft of layer 5 barrel cortex neurons as mice learn a whisker-dependent discrimination task. The evidence supporting the conclusions is compelling, and this work will be of great interest to neuroscientists working on reward-based learning and sensory processing.

    2. Reviewer #1 (Public review):

      What neurophysiological changes support the learning of new sensorimotor transformations is a key question in neuroscience. Many studies have attempted to answer this question at the neuronal population level - with varying degrees of success - but few, if any, have studied the change in activity of the apical dendrites of layer 5 cortical neurons. Neurons in layer 5 of the sensory cortex appear to play a key role in sensorimotor transformations, showing important decision and reward-related signals, and being the main source of cortical and subcortical projections from the cortex. In particular, pyramidal tract (PT) neurons project directly to subcortical regions related to motor activity, such as the striatum and brainstem, and could initiate rapid motor action in response to given sensory inputs. Additionally, layer 5 cortical neurons have large apical dendrites that extend to layer 1 where different neuromodulatory and long-range inputs converge, providing motor and contextual information that could be used to modulate layer 5 neuron output and/or to establish the synaptic plasticity required for learning a new association.

      In this study, the authors aimed to test whether the learning of a new sensorimotor transformation could be supported by a change in the evoked response of the apical dendrites of layer 5 neurons in the mouse whisker primary somatosensory cortex. To do this, they performed longitudinal functional calcium imaging of the apical dendrites of layer 5 neurons while mice learned to discriminate between two multi-whisker stimuli. The authors used a simple conditioning task in which one whisker stimulus (upward or backward air puff, CS+) is associated with reward after a short delay, while the other whisker stimulus (CS-) is not. They found that task learning (measured by the probability of anticipatory licking just after the CS+) was not associated with a significant change in the average population response evoked by the CS+ or the CS-, nor with a change in the average population selectivity. However, when considering individual dendritic tufts, they found interesting changes in selectivity, with approximately equal numbers of dendrites becoming more selective for CS+ and dendrites becoming more selective for CS-.

      One of the major challenges when assessing changes in neural representation during the learning of such Go/NoGo tasks is that the movements and rewards themselves may elicit strong neural responses that may be a confounding factor, that is, inexperienced mice do not lick in response to the CS+, while trained mice do. In this study, the authors addressed this issue in three ways: first, they carefully monitor the orofacial movements of mice and show that task learning is not associated with changes in evoked whisker movements. Second, they show that whisking or licking evokes very little activity in the dendritic tufts compared to whisker stimuli (CS+ and CS-). Finally, the authors introduced into the design of their task a post-conditioning session after the last conditioning session during which the CS+ and the CS- are presented but no reward is delivered. During this post-session, the mice gradually stopped licking in response to the CS+. A better design might have been to perform the pre-conditioning and post-conditioning sessions in non-water-restricted, unmotivated mice to completely exclude any lick response, but the fact that the change in selectivity persists after the mice stopped licking in the last blocks of the post-conditioning session (in mice relying only on their whiskers to perform the task) is convincing.

      The clever task design and careful data analysis provide compelling evidence that learning this whisker discrimination task does not result in a massive change in sensory representation in the apical dendritic tufts of layer 5 neurons in the primary somatosensory cortex on average. Nevertheless, individual dendritic tufts do increase their selectivity for one or the other sensory stimulus, likely enhancing the ability of S1 neurons to accurately discriminate the two stimuli and trigger the appropriate motor response (to lick or not to lick).

      One limitation of the present study is the lack of evidence for the necessity of the primary somatosensory cortex in the learning and execution of the task. As the authors have strongly emphasized in their previous publications, the primary somatosensory cortex may not be necessary for the learning and execution of simple whisker detection tasks, especially when the stimulus is very salient. Although this new task requires the discrimination between two whisker stimuli, the simplicity and salience of the whisker stimuli used could make this task cortex-independent, especially considering that some mice seem not to rely entirely on their whiskers to execute the task.

      Nevertheless, this is an important result that shows for the first time changes in the selectivity to sensory stimuli at the level of individual apical dendritic tufts in correlation with the learning of a discrimination task. This study sheds new light on the cortical cellular substrates of reward-based learning, and opens interesting perspectives for future research in this area. In future studies, it will be important to determine whether the change in selectivity of dendritic calcium spikes is causally involved in the learning of the task or whether it simply correlates with learning, as a consequence of changes in synaptic inputs caused by reward. The dendritic calcium spikes may be involved in the establishment of synaptic plasticity required for learning and impact the output of layer 5 pyramidal neurons to trigger the appropriate motor response. It would be important also to study the changes in selectivity in the apical dendrite of the identified projection neurons.

      Comments on revisions:

      The authors have addressed all my questions. I have no further recommendations.

    3. Reviewer #2 (Public review):

      Summary:

      The authors did not find an increased representation of CS+ throughout reinforcement learning in the tuft dendrites of Rbp4-positive neurons from layer 5B of the barrel cortex, as previously reported for soma from layer 2/3 of the visual cortex.

      Alternatively, the authors observed an increased selectivity to both stimuli (CS+ and CS-) during reinforcement learning. This feature 1) was not present in repeated exposures (without reinforcement), 2) was not explained by the animal's behaviour (choice, licking and whisking) and 3) was long-lasting, being present even when the mice disengaged from the task.

      Importantly, increased selectivity was correlated with learning (% correct choices), and neural discriminability between stimuli increased with learning.

      In conclusion, the authors show that tuft dendrites from layer 5B of the barrel cortex increase the representation of conditioned (CS+) and unconditioned stimuli (CS-) applied to the whiskers, during reinforcement learning.

      Strengths:

      The results presented are very consistent throughout the entire study, and therefore very convincing:

      (1) The results observed are very similar using two different imaging techniques: two-photon (planar) imaging and SCAPE (volumetric) imaging (Fig. 3 and Fig. 4, respectively).

      (2) The results are similar using "different groups" of tuft dendrites for the analysis (e.g. initially unresponsive and responsive pre- and post-learning) (Fig. 5).

      (3) The results are similar from a specific set of trials (with the same sensory input, but different choices) (Fig. 7).

      (4) Additionally, the selectivity of tuft dendrites from layer 5B of the barrel cortex was higher in the mice that exclusively used the whisker to respond to the stimuli (CS+ and CS-).

      The results presented are controlled against a group of mice that received the same stimuli presentation, except the reinforcement (reward).

      Additionally, behavioural outputs such as choice, whisking and licking could not account for the results observed.

      Although there are no causal experiments, the correlation between selectivity and learning (% of correct choices), as well as the increased neural discriminability with learning, but not in repeated exposure, are very convincing.

      Weaknesses:

      The biggest weakness is the absence of causality experiments. Although inhibiting specifically tuft dendritic activity in layer 1 from layer 5 pyramidal neurons is very challenging, tuft dendritic activity in layer 1 could be silenced through optogenetic experiments as in Abs et al. 2018. By manipulating NDNF-positive neurons the authors could specifically modify tuft dendritic activity in the barrel cortex during CS presentations, and test if silencing tuft dendritic activity in layer 1 would lead to the lack of selectivity and an impairment of reinforcement learning. Additionally, this experiment will test if the selectivity observed during reinforcement learning is due to changes in the local network, namely changes in local synaptic connectivity, or solely due to changes in the long-range inputs.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      What neurophysiological changes support the learning of new sensorimotor transformations is a key question in neuroscience. Many studies have attempted to answer this question at the neuronal population level - with varying degrees of success - but few, if any, have studied the change in activity of the apical dendrites of layer 5 cortical neurons. Neurons in layer 5 of the sensory cortex appear to play a key role in sensorimotor transformations, showing important decision and reward-related signals, and being the main source of cortical and subcortical projections from the cortex. In particular, pyramidal tract (PT) neurons project directly to subcortical regions related to motor activity, such as the striatum and brainstem, and could initiate rapid motor action in response to given sensory inputs. Additionally, layer 5 cortical neurons have large apical dendrites that extend to layer 1 where different neuromodulatory and long-range inputs converge, providing motor and contextual information that could be used to modulate layer 5 neuron output and/or to establish the synaptic plasticity required for learning a new association.

      In this study, the authors aimed to test whether the learning of a new sensorimotor transformation could be supported by a change in the evoked response of the apical dendrites of layer 5 neurons in the mouse whisker primary somatosensory cortex. To do this, they performed longitudinal functional calcium imaging of the apical dendrites of layer 5 neurons while mice learned to discriminate between two multi-whisker stimuli. The authors used a simple conditioning task in which one whisker stimulus (upward or backward air puff, CS+) is associated with a reward after a short delay, while the other whisker stimulus (CS-) is not. They found that task learning (measured by the probability of anticipatory licking just after the CS+) was not associated with a significant change in the average population response evoked by the CS+ or the CS-, nor a change in the average population selectivity. However, when considering individual dendritic tufts, they found interesting changes in selectivity, with approximately equal numbers of dendrites becoming more selective for CS+ and dendrites becoming more selective for CS-.

      One of the major challenges when assessing changes in neural representation during the learning of such Go/NoGo tasks is that the movements and rewards themselves may elicit strong neural responses that may be a confounding factor, that is, inexperienced mice do not lick in response to the CS+, while trained mice do. In this study, the authors addressed this issue in three ways: first, they carefully monitored the orofacial movements of mice and showed that task learning is not associated with changes in evoked whisker movements. Second, they show that whisking or licking evokes very little activity in the dendritic tufts compared to whisker stimuli (CS+ and CS-). Finally, the authors introduced into the design of their task a post-conditioning session after the last conditioning session during which the CS+ and the CS- are presented but no reward is delivered. During this post-session, the mice gradually stopped licking in response to the CS+. A better design might have been to perform the pre-conditioning and post-conditioning sessions in non-water-restricted, unmotivated mice to completely exclude any lick response, but the fact that the change in selectivity persists after the mice stopped licking in the last blocks of the post-conditioning session (in mice relying only on their whiskers to perform the task) is convincing.

      The clever task design and careful data analysis provide compelling evidence that learning this whisker discrimination task does not result in a massive change in sensory representation in the apical dendritic tufts of layer 5 neurons in the primary somatosensory cortex on average. Nevertheless, individual dendritic tufts do increase their selectivity for one or the other sensory stimulus, likely enhancing the ability of S1 neurons to accurately discriminate the two stimuli and trigger the appropriate motor response (to lick or not to lick). 

      One limitation of the present study is the lack of evidence for the necessity of the primary somatosensory cortex in the learning and execution of the task. As the authors have strongly emphasized in their previous publications, the primary somatosensory cortex may not be necessary for the learning and execution of simple whisker detection tasks, especially when the stimulus is very salient. Although this new task requires the discrimination between two whisker stimuli, the simplicity and salience of the whisker stimuli used could make this task cortex-independent, especially considering that some mice seem not to rely entirely on their whiskers to execute the task.

      Nevertheless, this is an important result that shows for the first time changes in the selectivity to sensory stimuli at the level of individual apical dendritic tufts in correlation with the learning of a discrimination task. This study sheds new light on the cortical cellular substrates of reward-based learning and opens interesting perspectives for future research in this area. In future studies, it will be important to determine whether the change in selectivity of dendritic calcium spikes is causally involved in the learning of the task or whether it simply correlates with learning, as a consequence of changes in synaptic inputs caused by reward. The dendritic calcium spikes may be involved in the establishment of synaptic plasticity required for learning and impact the output of layer 5 pyramidal neurons to trigger the appropriate motor response. It would be important also to study the changes in selectivity in the apical dendrite of the identified projection neurons.  

      Reviewer #2 (Public Review):

      Summary: 

      The authors did not find an increased representation of CS+ throughout reinforcement learning in the tuft dendrites of Rbp4-positive neurons from layer 5B of the barrel cortex, as previously reported for soma from layer 2/3 of the visual cortex. 

      Instead, the authors observed an increased selectivity to both stimuli (CS+ and CS-) during reinforcement learning. This feature:

      (1) was not present in repeated exposures (without reinforcement), 

      (2) was not explained by the animal's behaviour (choice, licking, and whisking), and 

      (3) was long-lasting, being present even when the mice disengaged from the task. 

      Importantly, increased selectivity was correlated with learning (% correct choices), and neural discriminability between stimuli increased with learning. 

      In conclusion, the authors show that tuft dendrites from layer 5B of the barrel cortex increase the representation of conditioned (CS+) and unconditioned stimuli (CS-) applied to the whiskers, during reinforcement learning. 

      Strengths: 

      The results presented are very consistent throughout the entire study, and therefore very convincing: 

      (1) The results observed are very similar using two different imaging techniques (two-photon planar imaging and SCAPE volumetric imaging; Figures 3 and 4, respectively).

      (2) The results are similar using "different groups" of tuft dendrites for the analysis (e.g. initially unresponsive and responsive pre- and post-learning). Figure 5.

      (3) The results are similar from a specific set of trials (with the same sensory input, but different choices). Figure 7.

      (4) Additionally, the selectivity of tuft dendrites from layer 5B of the barrel cortex was higher in the mice that exclusively used their whiskers to respond to the stimuli (CS+ and CS-).

      The results presented are controlled against a group of mice that received the same stimuli presentation, except for the reinforcement (reward).

      Additionally, the behavioural outputs, such as choice, whisking, and licking, could not account for the results observed.

      Although there are no causal experiments, the correlation between selectivity and learning (percentage of correct choices), as well as the increased neural discriminability with learning, but not in repeated exposure, are very convincing. 

      Weaknesses: 

      The biggest weakness is the absence of causality experiments. Although inhibiting specifically tuft dendritic activity in layer 1 from layer 5 pyramidal neurons is very challenging, tuft dendritic activity in layer 1 could be silenced through optogenetic experiments as in Abs et al. 2018. By manipulating NDNF-positive neurons the authors could specifically modify tuft dendritic activity in the barrel cortex during CS presentations, and test if silencing tuft dendritic activity in layer 1 would lead to the lack of selectivity and an impairment of reinforcement learning. Additionally, this experiment will test if the selectivity observed during reinforcement learning is due to changes in the local network, namely changes in local synaptic connectivity, or solely due to changes in the long-range inputs.    

      We agree that such causal manipulations are a logical next step. Such manipulations are unfortunately not specific to layer 5 apicals, so the results would be difficult to interpret. We now discuss the challenge of such manipulations in the Discussion section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Overall, the study is solid and the article is well and clearly written. I have no suggestion for other experiments that would fall within the scope of this article. I would like only to suggest some additional analyses and clarifications in the writing. 

      Additional analyses: 

      Obviously, the main confounding factor in this type of data comes from the acquired motor response which follows - with a short latency - the sensory stimulus. This is particularly problematic for functional calcium imaging which has very low temporal resolution. The authors have addressed this question to some extent by showing that motor-evoked activity does not account for the change in selectivity acquired with learning and through the use of a post-conditioning session during which no reward was delivered. Figures 8C-D show that mice gradually stop licking in response to CS+ in this session and that the distribution of the selectivity index remains similar in these last blocks. Perhaps a more convincing analysis would be to simply select Miss and Correct rejection trials in which mice did not lick in response to the CS+ and CS-, respectively. Ideally, if the number of trials is sufficient, one could even select trials devoid of any evoked movement (no licking and no whisking).  

      We agree it would be interesting to compare Miss and Correct rejection trials to further rule out effects of a motor response, but there were never enough Miss trials to conduct such an analysis. Even in very early learning, there are few Miss trials (see Figure 1, session 2). We found that in early learning, animals would lick in most trials. Then, over the course of conditioning, they would learn to withhold licks during CS- presentation. Thus, we were able to examine Hits, Correct rejections, and False alarms (Figure 7), but not Miss trials. We have added text suggesting a future experiment in which the stimulus strengths are substantially reduced to drastically increase the error rates.

      The fact that changes in selectivity occur in both directions overall is really interesting. However, in the way the data are presented currently, one may wonder whether this is a mouse/field-of-view effect or a single-cell effect, i.e., do different dendritic tufts in the same field of view show opposite changes in selectivity? If we were to replot Figure 3A for a single mouse, would we obtain the same picture?

      We appreciate this very good suggestion and have added scatter plots and selectivity index histograms for individual conditioned animals in Supplementary figure 2. These data demonstrate that different dendritic tufts in the same field of view exhibit opposite changes in selectivity.

      The authors point out that they observed no change in the mean response or selectivity during learning, but did find changes in selectivity at the level of individual dendritic tufts. This suggests that, at the population level, the ability to discriminate between the two stimuli should improve. A possible complementary analysis would be to show that the ability to decode stimulus identity from dendritic tuft population activity increases with learning.  

      Given the substantial change in individual tuft selectivity and that the tuft events are not rare, the population result is guaranteed. If individual tufts increase selectivity, the population will also increase its selectivity on a trial-by-trial basis. We have nevertheless included a new supplementary figure with a population analysis using SVMs to demonstrate this.
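
      The argument in this response can be illustrated with a small simulation. The sketch below is not the authors' SVM analysis: it swaps in a simpler nearest-centroid decoder so that it runs with the Python standard library alone, and every response in it is synthetic. The names `simulate_trials` and `holdout_accuracy`, the number of tufts, and the noise model are assumptions made purely for illustration. Half of the simulated tufts prefer the CS+ and half prefer the CS-, so the mean response across tufts does not change with selectivity, yet decoding of stimulus identity improves.

```python
import random

def simulate_trials(n_trials, selectivity, n_tufts=20, seed=0):
    """Synthetic responses (not real data): even-numbered tufts prefer CS+,
    odd-numbered tufts prefer CS-. `selectivity` is the difference between a
    tuft's mean response to the two stimuli, in units of the noise SD."""
    rng = random.Random(seed)
    trials = []
    for t in range(n_trials):
        label = t % 2  # 1 = CS+, 0 = CS-; labels alternate across trials
        stim_sign = 1.0 if label == 1 else -1.0
        response = [
            rng.gauss(stim_sign * (1.0 if j % 2 == 0 else -1.0) * selectivity / 2, 1.0)
            for j in range(n_tufts)
        ]
        trials.append((response, label))
    return trials

def centroid(rows):
    """Mean population response across a list of trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def holdout_accuracy(trials, train_fraction=0.5):
    """Fit class centroids on the first part of the trials and score the
    nearest-centroid prediction on the held-out remainder."""
    n_train = int(len(trials) * train_fraction)
    train, test = trials[:n_train], trials[n_train:]
    c_plus = centroid([r for r, lab in train if lab == 1])
    c_minus = centroid([r for r, lab in train if lab == 0])
    correct = sum(
        (1 if sq_dist(r, c_plus) < sq_dist(r, c_minus) else 0) == lab
        for r, lab in test
    )
    return correct / len(test)

low = holdout_accuracy(simulate_trials(400, selectivity=0.0))
high = holdout_accuracy(simulate_trials(400, selectivity=3.0))
print(f"decoding accuracy, unselective tufts: {low:.2f}")
print(f"decoding accuracy, selective tufts:   {high:.2f}")
```

      With unselective tufts the decoder should perform near chance, and with strongly selective tufts close to perfectly. How large the effect is in real data depends on noise correlations and event rates, which this sketch ignores.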

      Clarification: 

      The authors should make it clear from the beginning that mice are still water-restricted during the post-conditioning session and actually do keep licking for many CS+ trials. Therefore, this session is not devoid of motor response. 

      We have clarified this in the text.

      Did mice in the repeated exposure condition receive any reward during the recording sessions? If so when were rewards delivered? 

      We previously described in the Methods that these mice received water in their home cage, but we now additionally clarify this in the Results section.

      Minor: 

      Figure 2Aii, the labels of the Alpha and Beta barrels should be swapped.

      Fixed

      Line 218: I believe this sentence should read "Using SCAPE microscopy, ...". 

      Corrected.

      Line 665: 'Reconstruction from 50' does that refer to the single cell reconstruction on the left panel? 

      Yes – Clarified in legend

      Reviewer #2 (Recommendations For The Authors): 

      Minor suggestions: 

      The 'summary' should mention from which brain area the results were acquired. Otherwise, it is misleading, giving the idea that the results described a generic feature, which is still unknown.  

      Added to the text.

      Please correct sentence 219: "SCAPE microscopy, we image tuft activity of additional mice..." 

      Added to the text.

      In the same sentence (219) it would be good to provide the number of additional mice imaged (2). 

      Added to the text.

      Regarding Supplementary Figure 1, it would be interesting to correlate the second peak after reward and learning rate, to provide further support to the sentences 109 to 113. 

      We agree this would be interesting to examine, but only four animals exhibited this second peak, which is too small a sample to observe a meaningful correlation. We now clarify this in the text.

      In Figure 3, why not present the correlation between 'neural discriminability' and % of correct choices? 

      We appreciate the suggestion and have added this plot to Figure 3.

      The 'results' section will benefit tremendously if the authors consistently indicate the figures from which the reported results are drawn, or 'data not shown' if that is the case. To give a few examples:

      Sentence 108 - "averaged 28% ΔF/F" - Which figure is this result coming from?

      Sentence 123 - "(p = 0.62, 0.64, respectively)" - comparison not shown, but see Figures 2E and D respectively?

      Sentence 125 - "(CS+ responsive (...) across all sessions)" - Which figure is this result coming from?

      Sentence 130 - "during pre-conditioning (p=0.66) or post-conditioning sessions (p=0.44)" - From which figure?

      Sentence 154 - "(Pre: p=0.20; last rewarded: p=0.43; Post: p=0.64, sign-rank test)" - From which figure? 

      Sentence 175 - "(-0.049, -0.001, and 0.003" - From which figure? Please show the graph that shows that the mean SI is not different. It can be supplementary. The distribution of SI will be strengthened by it.  

      We added this plot to supplementary figure 2.

      Sentence 244 - "(conditioned: 458/603; repeated exposure: 334/457)" - From Figure 5E.

      Sentence 256 - "(p=0.04, 2-sample t-test comparison mice)" - From Figure 5B.

      Sentence 258 - "(p=0.03, paired t-test)" - From Figure 5B.

      Sentences 370 to 378 - No reference to the figure.

      The 'discussion' section (sentences 459 to 494) refers to the differences between the current and previous studies (references 1,3,5), namely soma vs. dendrites and layer 2/3 vs. layer 5. However, it should also mention the difference between the nature of the stimuli and the brain area recorded (visual cortex vs. barrel cortex).

      We have addressed these issues in the text.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Authors reject the substance of Reviewer 1’s feedback, primarily due to a clear lack of understanding of typical parameterization practices used to avoid overfitting. To ensure the accuracy of the Spearman rank correlation, 70% of all data was withheld from the optimization process and used solely for testing to yield figure 6. Because the data was withheld prior to model parameterization, the analysis avoids Reviewer 1’s charge of “artificially forcing the correlation”. Authors did appreciate the request for clarification of additional definitions and minor reorganization suggestions. Below we provide specific responses to each numbered point (note: multiple responses are provided for some of the reviewer points).

      Point 1: Clarify Metrics Definition and Evaluation

      Authors clarified the description of biodiversity metrics. The metrics associated with manual methods are detailed in the third paragraph of the Materials and Methods: Data Analysis section, while the sensor-based metric is described in the second paragraph, and summarized in its last sentence.

      Text Additions:

      Authors added clarification to the introduction’s first paragraph defining biodiversity metrics, including species richness.

      Authors added detailed definitions of community metrics and their significance in community ecology in the Materials and Methods section (3rd paragraph of “Data Analysis” section). The discussion was updated to include a reference to community ecology and the benefits of big data, specifically highlighting the potential of autonomous optical sensors in entomology.

      Methods Reorganization

      We have reorganized the Methods section for clarity. The updated section clarifies the metrics studied, the location and dates, and the descriptions and methods for optical sensors, Malaise traps, and sweep netting.

      Text Additions:

      An overview paragraph was added to “Data analysis” (3rd paragraph) detailing key metrics used, specifying metrics such as abundance, richness, Shannon index, and Simpson index.

      Visualization methods for sensor data to deliver analogous metrics of abundance, richness, and diversity indices were added to the “Data analysis” section.

      Supplementary Table 1 and the first paragraph of the Materials and Methods section cover location, dates, and other general information.

      Detailed descriptions and methods for optical sensors, Malaise traps, and sweeping are provided.
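
      For readers unfamiliar with the community metrics named above (abundance, richness, Shannon index, Simpson index), the following is a minimal sketch of how they are commonly computed from per-group counts. The function name `diversity_metrics` is hypothetical. The Simpson index is reported in the literature in several forms; the Gini-Simpson form (one minus the sum of squared proportions) is assumed here, and the response does not state which form the authors used.

```python
import math

def diversity_metrics(counts):
    """Compute common community metrics from a list of per-group counts.

    `counts` holds the number of individuals (or sensor events) assigned to
    each species or cluster. Groups with a count of zero are ignored.
    """
    present = [c for c in counts if c > 0]
    abundance = sum(present)
    if abundance == 0:
        raise ValueError("at least one group must have a positive count")
    richness = len(present)
    proportions = [c / abundance for c in present]
    # Shannon index: H' = -sum(p_i * ln p_i)
    shannon = -sum(p * math.log(p) for p in proportions)
    # Gini-Simpson form (an assumption; other forms are D and 1/D)
    simpson = 1.0 - sum(p * p for p in proportions)
    return {
        "abundance": abundance,
        "richness": richness,
        "shannon": shannon,
        "simpson": simpson,
    }

# Four equally common groups: maximal evenness for a richness of 4.
print(diversity_metrics([10, 10, 10, 10]))
```

      For a perfectly even community of S groups, the Shannon index equals ln(S) and the Gini-Simpson index equals 1 - 1/S, which gives a quick sanity check on any implementation.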

      Integration of Metrics

      In the 3rd paragraph of the discussion, authors integrated two paragraphs explaining the fundamental differences between conventional methods and the presented method of biodiversity measurement.

      Point 2: Body-to-Wing Ratio Calculation

      The backscattered optical cross-section is now clearly defined as the value measured at the maximum point of the event. Specifically, we have added the word ‘maximum’ to our methods section for clarity.

      Point 3: Ecosystem Services Paragraph

      We have shortened and edited this paragraph for clarity. The revised text is now more straightforward and comprehensible.

      Point 4: Results Section Structure

      We believe restructuring the results section around each metric would result in redundancy. The value of our analysis is in the comparison of different methods; therefore, instead of talking about methods in isolation, we provide an integrated discussion and comparison of all three methods across all metrics. We have therefore maintained our current structure but ensured that the metrics are consistently described and analyzed.

      Point 5: Abundance Correlation

      We agree that the lack of a correlation between methods for abundance remains an open question. However, we maintain that fitting a linear model would be inappropriate and potentially misleading in the absence of significant correlation. We have clarified this in our manuscript.

      Point 6: Richness and Diversity Evaluations

      The authors disagree with Reviewer 1's feedback, citing a clear misunderstanding of standard parameterization practices used to prevent overfitting. Specifically, authors implemented a 30/70 Training/Testing split. Therefore only 30% of the data was used to fit the model and 70% of the dataset was reserved for testing to ensure the validity and reliability of our clustering results. By validating with a 70% testing dataset, we ensure that the clustering model can accurately group new data points and is robust against overfitting. This process helps verify that the identified clusters are meaningful and consistent across different subsets of the data.  Spearman's rho converts the data values into ranks and does not assume a linear relationship between the variables or require the data to follow a normal distribution. Spearman's rank correlation offers robustness against non-linearity and outliers by focusing on ranks. This approach is explained in the 4th paragraph of the “Data Analysis” section.
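
      The two ingredients described in this response, a 30/70 training/testing split and a rank-based correlation, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: the helper names `split_indices`, `average_ranks`, and `spearman_rho` are hypothetical, and the example values are made up. Spearman's rho is computed here as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
import random

def split_indices(n, train_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and split them into training and testing sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Only the training indices would be used to fit the model; the
# correlation is then evaluated on the held-out testing indices.
train_idx, test_idx = split_indices(10)
print(len(train_idx), len(test_idx))

# A monotonic but nonlinear relationship still gives a perfect rank correlation.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

      Because only ranks enter the calculation, any monotonic transformation of either variable leaves rho unchanged, which is the robustness to non-linearity that the response refers to.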

      Point 7: Clustering Method Credibility

      Authors acknowledge the variability in optical sensor features. However, by the Law of Large Numbers, the much larger number of observations made by optical sensors compared to conventional methods supports more accurate and stable insect measurements. The manuscript now includes a detailed discussion of these aspects in the 3rd paragraph of the discussion, emphasizing the correlation observed despite variability.

      Reviewer 2:

      Authors appreciate Reviewer 2’s feedback especially regarding contextualization. While authors disagree with the need for more specific experimental questions in a methods paper and the suggested need for more complex analysis, we agree with the essence of the review and added additional text regarding potential questions, method applications, and ecosystem processes for contextualization.

      Point 1: Larger Question Framing

      We present this article as a methodological paper rather than asking a specific experimental question. This approach is justified by the generalizable nature of methods papers, akin to those describing ImageJ or mass spectrometers. The method is widely applicable to a range of scientific questions. 

      We provided a discussion on how this technology could be applied in community ecology, conservation, and managed ecological systems like agriculture.

      In the Conclusion section we provided elaboration on the potential research questions and applications.

      Point 2: Complex Analyses

      While complex analyses like NMDS are useful for specific questions, this paper aims to establish the method. Once established, this method can be applied to various research questions in future studies. Therefore, as we are not directly asking an experimental question, more complex analysis is unnecessary.

      Point 3: Ecosystem Process (Granivory) Assay

      We have improved the contextualization and explanation of the ecosystem process assay throughout the manuscript, ensuring it is well-integrated and clear to readers.

    2. eLife Assessment

      The authors propose a new methodology to survey insects, using new sensors and analytical capabilities that could be valuable for addressing urgent conservation challenges. While the results of the optical sensors appear to be comparable to those obtained with classical survey methodologies, current analyses are considered incomplete.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript proposes a new technology to survey insects. The authors deployed optical sensors in agricultural landscapes and contrasted their results with those from classical Malaise trap and sweep net survey methodologies. They found the results of optical sensors to be comparable with classical survey methodologies. The authors discuss the pros and cons of their near-infrared sensor.

      Strengths:

      Contrasting the optical-sensor results with those from classical Malaise traps and sweep nets was a clever idea.

      Weaknesses:

      The submitted materials on Revision 1 (in particular the response to reviewers) are difficult to follow. I encourage the authors to provide a point-by-point response to the first set of comments, as well as to this second review.

      A new version of the manuscript needs to make sure that variability in the system (different crops) is taken into consideration. Also, stronger analysis reflecting our current understanding of biodiversity metrics (including measures of sample coverage, sample completeness, and Hill numbers, among others) will be important to make sure the new methodology can properly serve as a new standard methodology.

      While this new version is stronger and much clearer, I also agree with Reviewer 1 that the usage of terminology is weak. The paper and the new methodology are sound. It is the application to real ecosystems/questions and datasets that is not properly addressed in the manuscript.

    1. eLife Assessment

      This paper explores the important question of how two major inhibitory interneuron classes in the neocortex differentially affect cortical dynamics. Using a linearized fixed point approach, they provide convincing evidence that the existence of multiple interneuron classes can explain the counterintuitive finding that inhibitory modulation can increase the gain of the excitatory cell population while also increasing the stability of the circuit's state to minor perturbations. Support for the main conclusions is solid, but could be strengthened by additional analyses.

    2. Reviewer #1 (Public review):

      Summary:

      This paper explores how diverse forms of inhibition impact firing rates in models for cortical circuits. In particular, the paper studies how the network operating point affects the balance of direct inhibition from SOM inhibitory neurons to pyramidal cells, and disinhibition from SOM inhibitory input to PV inhibitory neurons. This is an important issue as these two inhibitory pathways have largely been studied in isolation. Support for the main conclusions is generally solid, but could be strengthened by additional analyses.

      Strengths

      The paper has improved in revision, and the new intuitive summary statements added to the end of each results section are quite helpful.

      Weaknesses

      The concern about whether the results hold outside of the range in which neural responses are linear remains. This is particularly true given the discontinuity observed in the stability measure. I appreciate the concern (provided in the response to the first round of reviews) that studying nonlinear networks requires a lot of work. A more limited undertaking would be to test the behavior of a spiking network at a few key points identified by your linearization approach. Such tests could use relatively simple (and perhaps imperfect) measures of gain and stability. This could substantially enhance the paper, regardless of the outcome.

    3. Reviewer #2 (Public review):

      Summary:

      Bos and colleagues address the important question of how two major inhibitory interneuron classes in the neocortex differentially affect cortical dynamics. They address this question by studying Wilson-Cowan-type mathematical models. Using a linearized fixed point approach, they provide convincing evidence that the existence of multiple interneuron classes can explain the counterintuitive finding that inhibitory modulation can increase the gain of the excitatory cell population while also increasing the stability of the circuit's state to minor perturbations. This effect depends on the connection strengths within their circuit model, providing valuable guidance as to when and why it arises.

      Overall, I find this study to have substantial merit. I have some suggestions on how to improve the clarity and completeness of the paper.

      Strengths:

      (1) The thorough investigation of how changes in the connectivity structure affect the gain-stability relationship is a major strength of this work. It provides an opportunity to understand when and why gain and stability will or will not both increase together. It also provides a nice bridge to the experimental literature, where different gain-stability relationships are reported from different studies.

      (2) The simplified and abstracted mathematical model has the benefit of facilitating our understanding of this puzzling phenomenon. (I have some suggestions for how the authors could push this understanding further.) It is not easy to find the right balance between biologically-detailed models vs simple but mathematically tractable ones, and I think the authors struck an excellent balance in this study.

      Weaknesses:

      (1) The fixed-point analysis has potentially substantial limitations for understanding cortical computations away from the steady-state. I think the authors should have emphasized this limitation more strongly and possibly included some additional analyses to show that their conclusions extend to the chaotic dynamical regimes in which cortical circuits often live.

      (2) The authors could have discussed -- even somewhat speculatively -- how VIP interneurons fit into this picture. Their absence from this modelling framework stands out as a missed opportunity.

      (3) The analysis is limited to paths within this simple E, PV, SOM circuit. This misses more extended paths (like thalamocortical loops) that involve interactions between multiple brain areas. Including those paths in the expansion in Eqs. 11-14 (Fig. 1C) may be an important consideration.

      Comments on revisions:

      I think the authors have done a reasonable job of responding to my critiques, and the paper is in pretty good shape. (Also, thanks for correctly inferring that I meant VIP interneurons when I had written SST in my review! I have updated the public review accordingly.)

      I still think this line of research would benefit substantially from considering dynamic regimes including chaotic ones. I strongly encourage the authors to consider such an extension in future work.

    4. Reviewer #3 (Public review):

      Summary:

      Bos et al study a computational model of cortical circuits with excitatory (E) and two subtypes of inhibition - parvalbumin (PV) and somatostatin (SOM) expressing interneurons. They perform stability and gain analysis of simplified models with nonlinear transfer functions when SOM neurons are perturbed. Their analysis suggests that in a specific setup of connectivity, instability and gain can be untangled, such that SOM modulation leads to both increases in stability and gain, in contrast to the typical direction in neuronal networks where increased gain results in decreased stability.

      Strengths:

      - Analysis of the canonical circuit in response to SOM perturbations. Through numerical simulations and mathematical analysis, the authors have provided a rather comprehensive picture of how SOM modulation may affect response changes.

      - Shedding light on two opposing circuit motifs involved in the canonical E-PV-SOM circuitry - namely, direct inhibition (SOM -> E) vs disinhibition (SOM -> PV -> E). These two pathways can lead to opposing effects, and it is often difficult to predict which one results from modulating SOM neurons. In simplified circuits, the authors show how these two motifs can emerge and depend on parameters like connection weights.

      - Suggesting potentially interesting consequences for cortical computation. The authors suggest that certain regimes of connectivity may lead to untangling of stability and gain, such that increases in network gain are not compromised by decreasing stability. They also link SOM modulation in different connectivity regimes to versatile computations in visual processing in simple models.

      Weaknesses

      Computationally, the analysis is solid, but it's very similar to previous studies (del Molino et al, 2017). Many studies in the past few years have done the perturbation analysis of a similar circuitry with or without nonlinear transfer functions (some of them listed in the references). This study applies the same framework to SOM perturbations, which is a useful computational analysis, in view of the complexity of the high-dimensional parameter space.

      Link to biology: the most interesting result of the paper with regard to biology is the suggestion of a regime in which gain and stability can be modulated in an unconventional way - however, it is difficult to link the results to biological networks:

      - A general weakness of the paper is a lack of direct comparison to biological parameters or experiments. How can different experiments be reconciled by the results obtained here, and what new circuit mechanisms can be revealed? In its current form, the paper reads as a general suggestion that different combinations of gain modulation and stability can be achieved in a circuit model equipped with many parameters (12 parameters). This is potentially interesting but not surprising, given the high dimensional space of possible dynamical properties. A more interesting result would have been to relate this to biology, by providing reasoning why it might be relevant to certain circuits (and not others), or to provide some predictions or postdictions, which are currently missing in the manuscript.

      - For instance, a nice motivation for the paper at the beginning of the Results section is the different results of SOM modulation in different experiments - especially between L23 (inhibition) and L4 (disinhibition). But no further explanation is provided for why such a difference should exist, in view of their results and the insights obtained from their suggested circuit mechanisms. How do the parameters identified for the two regimes correspond to different properties of different layers?

      - One of the key assumptions of the model is nonlinear transfer functions for all neuron types. In terms of modelling and computational analysis, a thorough analysis of how and when this is necessary is missing (an analysis similar to what has been attempted in Figure 6 for synaptic weights, but for cellular gains). A discussion of this, along with the former analysis to know which nonlinearities would be necessary for the results, is needed, but currently missing from the study. The nonlinearity is assumed for all subtypes because it seems to be needed to obtain the results, but it's not clear how the model would behave in the presence or absence of them, and whether they are relevant to biological networks with inhibitory transfer functions.

      - Tuning curves are simulated for an individual orientation (same for all), not considering the heterogeneity of neuronal networks with multiple orientation selectivity (and other visual features) - making the model too simplistic.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This paper explores how diverse forms of inhibition impact firing rates in models for cortical circuits. In particular, the paper studies how the network operating point affects the balance of direct inhibition from SOM inhibitory neurons to pyramidal cells, and disinhibition from SOM inhibitory input to PV inhibitory neurons. This is an important issue as these two inhibitory pathways have largely been studied in isolation. Support for the main conclusions is generally solid, but could be strengthened by additional analyses.

      Strengths:

      A major strength of the paper is the systematic exploration of how circuit architecture affects the impact of inhibition. This includes scans across parameter space to determine how firing rates and stability depend on effective connectivity. This is done through linearization of the circuit about an effective operating point, and then the study of how perturbations in input affect this linear approximation.

      Weaknesses:

      The linearization approach means that the conclusions of the paper are valid only in the linear regime of network behavior. The paper would be substantially strengthened with a test of whether the conclusions from the linearized circuit hold over a large range of network activity. Is it possible to simulate the full network and do some targeted tests of the conclusions from linearization? Those tests could be guided by the linearization to focus on specific parameter ranges of interest.

      We agree with the reviewer that it would be interesting to test if our results hold in a nonlinear regime of network behaviour (i.e. the chaotic regime, see also comment 1 by reviewer 2). As mentioned above, this requires a different type of model (either a rate-based or a spiking model with multiple neurons, instead of a model of the mean population rate dynamics) which, in our opinion, exceeds the scope of this manuscript. Furthermore, the core measures of our study, network gain and stability, require linearization. In a chaotic regime where the linearization approach is impossible, we would need to consider/define new measures to characterize network response/activity. Therefore, while this is certainly an interesting question, the broad topic of networks in a nonlinear regime is better tackled in a separate study. We now acknowledge in the discussion of our manuscript that the linearization approach is a limitation of our study and that it would be an interesting future direction to investigate chaotic dynamics.
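
      To illustrate why gain and stability both rest on linearization, the following is a minimal sketch of how the two measures could be computed for a three-population rate model at a hand-picked operating point. It assumes dynamics of the form tau_X dr_X/dt = -r_X + f(sum_Y W_XY r_Y + I_X) with a power-law transfer function f(q) = alpha * q**beta. The weight matrix, rates, time constants, and function names are hypothetical and are not taken from the manuscript.

```python
import numpy as np


def slopes_from_rates(rates, alpha=1.0, beta=2.0):
    """Slope b = df/dq of the assumed power law f(q) = alpha * q**beta,
    expressed through the steady-state rate r = f(q)."""
    rates = np.asarray(rates, dtype=float)
    return beta * alpha ** (1.0 / beta) * rates ** (1.0 - 1.0 / beta)


def linearize(W, rates, tau, alpha=1.0, beta=2.0):
    """Return the Jacobian J and the diagonal slope matrix B at the operating point.

    Assumed dynamics: tau_X * dr_X/dt = -r_X + f(sum_Y W[X, Y] * r_Y + I_X).
    """
    W = np.asarray(W, dtype=float)
    B = np.diag(slopes_from_rates(rates, alpha, beta))
    inv_tau = np.diag(1.0 / np.asarray(tau, dtype=float))
    J = inv_tau @ (-np.eye(W.shape[0]) + B @ W)
    return J, B


def max_real_eigenvalue(J):
    """Largest real part among the eigenvalues; negative means locally stable."""
    return float(np.max(np.linalg.eigvals(J).real))


def excitatory_gain(W, B, d_input):
    """Steady-state change of the E rate (index 0) for a small input change.

    Solves (I - B W) dr = B dI, the fixed-point condition of the linearized system.
    """
    W = np.asarray(W, dtype=float)
    d_rates = np.linalg.solve(np.eye(W.shape[0]) - B @ W,
                              B @ np.asarray(d_input, dtype=float))
    return float(d_rates[0])


# Hypothetical operating point and weights, ordered (E, PV, SOM).
rates = [1.0, 4.0, 1.0]               # steady-state rates chosen by hand
tau = [0.01, 0.01, 0.01]              # time constants in seconds
W = np.array([[0.10, -0.10, -0.10],   # inputs onto E
              [0.05, -0.05, -0.05],   # inputs onto PV
              [0.10,  0.00,  0.00]])  # inputs onto SOM (no self-inhibition)

J, B = linearize(W, rates, tau)
print("max real eigenvalue:", max_real_eigenvalue(J))
print("E gain for a stimulus onto E and PV:", excitatory_gain(W, B, [1.0, 1.0, 0.0]))
```

      Both quantities are functions of the slope matrix B and the weights only, which is why they are defined relative to a fixed point and have no direct counterpart in a chaotic regime.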

      The results illustrated in the figures are generally well described but there is very little intuition provided for them. Are there simplified examples or explanations that could be given to help the results make sense? Here are some places such intuition would be particularly helpful:

      Page 6, paragraph starting “In sum ...”

      Page 8, last paragraph

      Page 10, paragraph starting “In summary ...”

      Page 11, sentence starting “In sum ...”

      We agree with the reviewer that we didn’t provide enough intuition for our results. We have now extended the paragraphs listed by the reviewer with additional information, providing a more intuitive understanding of the results presented in the respective sections.

      Reviewer #2 (Public Review):

      Summary:

      Bos and colleagues address the important question of how two major inhibitory interneuron classes in the neocortex differentially affect cortical dynamics. They address this question by studying Wilson-Cowan-type mathematical models. Using a linearized fixed point approach, they provide convincing evidence that the existence of multiple interneuron classes can explain the counterintuitive finding that inhibitory modulation can increase the gain of the excitatory cell population while also increasing the stability of the circuit’s state to minor perturbations. This effect depends on the connection strengths within their circuit model, providing valuable guidance as to when and why it arises.

      Overall, I find this study to have substantial merit. I have some suggestions on how to improve the clarity and completeness of the paper.

      Strengths:

      (1) The thorough investigation of how changes in the connectivity structure affect the gain-stability relationship is a major strength of this work. It provides an opportunity to understand when and why gain and stability will or will not both increase together. It also provides a nice bridge to the experimental literature, where different gain-stability relationships are reported from different studies.

      (2) The simplified and abstracted mathematical model has the benefit of facilitating our understanding of this puzzling phenomenon. (I have some suggestions for how the authors could push this understanding further.) It is not easy to find the right balance between biologically detailed models vs simple but mathematically tractable ones, and I think the authors struck an excellent balance in this study.

      Weaknesses:

      (1) The fixed-point analysis has potentially substantial limitations for understanding cortical computations away from the steady-state. I think the authors should have emphasized this limitation more strongly and possibly included some additional analyses to show that their conclusions extend to the chaotic dynamical regimes in which cortical circuits often live.

      We agree with the reviewer that it would be interesting to test if our results hold in a chaotic regime of network behaviour (see also comment by reviewer 1). As mentioned above, this requires a different type of model (either a rate-based or a spiking model with multiple neurons, instead of a model of the mean population rate dynamics) which, in our opinion, exceeds the scope of this manuscript. Furthermore, the core measures of our study, network gain and stability, require linearization. In a chaotic regime where the linearization approach is impossible, we would need to consider/define new measures to characterize network response/activity. Therefore, while this is certainly an interesting question, the broad topic of networks in a nonlinear regime is better tackled in a separate study. We now acknowledge in the discussion of our manuscript that the linearization approach is a limitation of our study and that it would be an interesting future direction to investigate chaotic dynamics.

      (2) The authors could have discussed – even somewhat speculatively – how SST interneurons fit into this picture. Their absence from this modelling framework stands out as a missed opportunity.

      We believe that the reviewer wanted us to speculate about VIP interneurons (and not SST interneurons, which we already discuss extensively in the manuscript). Previous models have included VIP neurons in the circuit (e.g. del Molino et al., 2017; Palmigiano et al., 2023; Waitzmann et al., 2024). While we do not model VIP cells explicitly, we implicitly assume that a possible source of modulation of SOM neurons comes from VIP cells. We have now added a short discussion of VIP cells in the last paragraph of our discussion section.

      (3) The analysis is limited to paths within this simple E,PV,SOM circuit. This misses more extended paths (like thalamocortical loops) that involve interactions between multiple brain areas. Including those paths in the expansion in Eqs. 11-14 (Fig. 1C) may be an important consideration.

      We agree with the reviewer that our framework can be extended to study many other paths, like thalamocortical loops, cortical layer-specific connectivity motifs, or circuits with VIP or L1 inhibitory neurons. Studying these questions, however, is beyond the scope of our work. In our discussion, we now mention the possibility of using our framework to study those questions.

      Reviewer #3 (Public Review):

      Summary:

      Bos et al study a computational model of cortical circuits with excitatory (E) neurons and two subtypes of inhibitory interneurons: parvalbumin (PV) and somatostatin (SOM) expressing interneurons. They perform stability and gain analysis of simplified models with nonlinear transfer functions when SOM neurons are perturbed. Their analysis suggests that in a specific setup of connectivity, instability and gain can be untangled, such that SOM modulation leads to both increases in stability and gain. This is in contrast with the typical direction in neuronal networks where increased gain results in decreased stability.

      Strengths:

      - Analysis of the canonical circuit in response to SOM perturbations. Through numerical simulations and mathematical analysis, the authors have provided a rather comprehensive picture of how SOM modulation may affect response changes.

      - Shedding light on two opposing circuit motifs involved in the canonical E-PV-SOM circuitry - namely, direct inhibition (SOM → E) vs disinhibition (SOM → PV → E). These two pathways can lead to opposing effects, and it is often difficult to predict which one results from modulating SOM neurons. In simplified circuits, the authors show how these two motifs can emerge and depend on parameters like connection weights.

      - Suggesting potentially interesting consequences for cortical computation. The authors suggest that certain regimes of connectivity may lead to untangling of stability and gain, such that increases in network gain are not compromised by decreasing stability. They also link SOM modulation in different connectivity regimes to versatile computations in visual processing in simple models.

      Weaknesses:

      The computational analysis is not novel per se, and the link to biology is not direct/clear.

      Computationally, the analysis is solid, but it’s very similar to previous studies (del Molino et al, 2017). Many studies in the past few years have done the perturbation analysis of a similar circuitry with or without nonlinear transfer functions (some of them listed in the references). This study applies the same framework to SOM perturbations, which is a useful and interesting computational exercise, in view of the complexity of the high-dimensional parameter space. But the mathematical framework is not novel per se, undermining the claim of providing a new framework (or ”circuit theory”).

      In the introduction we acknowledge that our analysis method is not novel but is rather based on previous studies (del Molino et al., 2017; Kuchibhotla et al., 2017; Kumar et al., 2023; Litwin-Kumar et al., 2016; Mahrach et al., 2020; Palmigiano et al., 2023; Veit et al., 2023; Waitzmann et al., 2024). We have now rewritten parts of the introduction to make sure that it does not sound as if the computational analysis was developed by us, but rather that we use those previously developed frameworks to dissect stability and gain via SOM modulation.

      Link to biology: the most interesting result of the paper with regard to biology is the suggestion of a regime in which gain and stability can be modulated in an unconventional way - however, it is difficult to link the results to biological networks:

      - A general weakness of the paper is a lack of direct comparison to biological parameters or experiments. How different experiments can be reconciled by the results obtained here, and what new circuit mechanisms can be revealed? In its current form, the paper reads as a general suggestion that different combinations of gain modulation and stability can be achieved in a circuit model equipped with many parameters (12 parameters). This is potentially interesting but not surprising, given the high dimensional space of possible dynamical properties. A more interesting result would have been to relate this to biology, by providing reasoning why it might be relevant to certain circuits (and not others), or to provide some predictions or postdictions, which are currently missing in the manuscript.

      - For instance, a nice motivation for the paper at the beginning of the Results section is the different results of SOM modulation in different experiments - especially between L23 (inhibition) and L4 (disinhibition). But no further explanation is provided for why such a difference should exist, in view of their results and the insights obtained from their suggested circuit mechanisms. How the parameters identified for the two regimes correspond to different properties of different layers?

      As pointed out by the reviewer, the main goal of our manuscript is to provide a general understanding of how gain and stability depend on different circuit motifs (i.e. different connectivity parameters), and how circuit modulations via SOM neurons affect those measures. However, we agree with the reviewer that it would be useful to provide some concrete predictions or postdictions following from our study.

      An interesting example of a postdiction of our model is that the firing rate change of excitatory neurons in response to a change in the stimulus (which we define as network gain, Eq. 2) depends on firing rates of the excitatory, PV, and SOM neurons at the moment of stimulus presentation (Fig. 3ii; Fig. 4Aii,Bii,Cii; Fig. 5Aii, Bii, Cii). Hence any change in input to the circuit can affect the response gain to a stimulus presentation, in line with experimental evidence which suggests that changes in inhibitory firing rates and changes in the behavioral state of the animal lead to gain modifications (Ferguson and Cardin 2020).

      Another recent concrete example is the study of Tobin et al., 2023, in which the authors show that optogenetically activating SOM cells in the mouse primary auditory cortex (A1) decreases the excitatory responses to auditory stimuli. In our framework, this corresponds to the case of decreases in network gain (gE) for positive SOM modulation, as seen in the circuit with PV to SOM feedback connectivity (Suppl. Fig. S1).

      Another example is the study by Phillips and Hasenstaub 2016, in which the authors study the effect of optogenetic perturbations of SOM (and PV) cells on tuning curves of pyramidal cells in mouse A1. While they find large heterogeneity in additive/subtractive or multiplicative/divisive tuning curve changes following SOM inactivation, most cells have a purely multiplicative or purely additive component (and none of the cells have a divisive component). In our study, we see that large multiplicative responses of the excitatory population follow from circuits with strong E to SOM feedback connectivity.

      We note that in future computational studies, it would be useful to apply our framework with a focus on a specific brain region and add all relevant cell types (at a minimum E, PV, SOM, and VIP) plus a dendritic compartment, in order to formulate much more precise experimental predictions.

      We have now added additional information to the discussion section.

      - Another caveat is the range of parameters needed to obtain the unintuitive untangling as a result of SOM modulation. From Figure 4, it appears that the “interesting” regime (with increases in both gain and stability) is only feasible for a very narrow range of SOM firing rates (before 3 Hz). This can be a problem for the computational models if the sweet spot is a very narrow region (this analysis is, incidentally, missing, which makes it difficult to know how robust the result is in terms of parameter regions). In terms of biology, it is difficult to reconcile this with the realistic firing rates in the cortex: in the mouse cortex, for instance, we know that SOM neurons can be quite active (comparable to E neurons), especially in response to stimuli. It is therefore not clear if we should expect this mechanism to be a relevant one for cortical activity regimes.

      We agree with the reviewer that it’s important to test the robustness of our results. As suggested by the reviewer, we now include a new supplementary figure (Suppl. Fig. S2) which measures the percentage of data points in the respective quadrants Q1-Q4 when changing the SOM firing rates (as done in Fig. 5). We see that the percentage of data points in the quadrants in which the network gain and stability change in the same direction (Q2 and Q3) remains high in the case of E to SOM feedback (Suppl. Fig. S2A) for SOM rates ranging over 0-10 Hz (and likely beyond).

      - One of the key assumptions of the model is nonlinear transfer functions for all neuron types. In terms of modelling and computational analysis, a thorough analysis of how and when this is necessary is missing (an analysis similar to what has been attempted in Figure 6 for synaptic weights, but for cellular gains). In terms of biology, the nonlinear transfer function has experimentally been reported for excitatory neurons, so it’s not clear to what extent this may hold for different inhibitory subtypes. A discussion of this, along with the former analysis to know which nonlinearities would be necessary for the results, is needed, but currently missing from the study. The nonlinearity is assumed for all subtypes because it seems to be needed to obtain the results, but it’s not clear how the model would behave in the presence or absence of them, and whether they are relevant to biological networks with inhibitory transfer functions.

      It is true that the nonlinear transfer function is a key component in our model. We chose identical transfer functions for E, PV, and SOM (Eq. 4) to simplify our analysis. If the transfer function of one of the neuron types were linear (β = 1), then the corresponding b term (the slope of the nonlinearity at the steady state; b = dfX/dqX; Fig. 1B; Eq. 4) would be equal to α. Therefore, if neurons had a linear transfer function in our model, there would not be a dependence of network gain on E and PV firing rates as studied in Fig. 3-5. This is because the relationship between PV rates and their gain would be constant (bP = α) in Fig. 1B (bottom).

      If all the transfer functions were linear, changes in firing rates would not have an impact on network gain or stability. Changing the nonlinear transfer function by changing the α or β terms in Eq. 4 would only scale the way a change in the rates affects the b terms and hence the results presented in Fig. 3-5. More interesting would be to study how different types of nonlinearities, like sigmoidal functions or sublinear nonlinearities (i.e. saturating nonlinearities), would change our results. However, we think that such an investigation is out of scope for this study. We have now added a comment to the Methods section.

      Experimentally, F-I curves have also been measured for PV and SOM neurons. For example, Romero-Sosa et al., 2021 measured the F-I curves of pyramidal, PV, and SOM neurons in mouse cortical slices. They found that, similar to pyramidal neurons, PV and SOM neurons show a nonlinear F-I curve. We have now added the citation of Romero-Sosa et al., 2021 to our manuscript.
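
      The rate dependence of the b terms discussed in this response can be written out explicitly. The following is a sketch under the assumption that Eq. 4 is a power law of the form f_X(q_X) = α q_X^β for positive input. The exact expression is not reproduced in this response, so this form is an assumption, chosen because it matches the statement that β = 1 gives a slope equal to α.

```latex
\begin{align}
r_X &= f_X(q_X) = \alpha\, q_X^{\beta}, \qquad q_X > 0, \\
b_X &= \frac{\mathrm{d} f_X}{\mathrm{d} q_X}
     = \alpha \beta\, q_X^{\beta - 1}
     = \beta\, \alpha^{1/\beta}\, r_X^{\,1 - 1/\beta}.
\end{align}
```

      For β = 1 the slope reduces to b_X = α and no longer depends on the firing rate, whereas for β > 1 it grows with r_X. Under this assumed form, changing α or β rescales how a change in rate translates into a change in b_X, without removing the dependence.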

      - Tuning curves are simulated for an individual orientation (same for all), not considering the heterogeneity of neuronal networks with multiple orientation selectivity (and other visual features) - making the model too simplistic.

      The reviewer is correct that we only study changes in tuning curves in a simplistic model. In our model, the excitatory and PV populations are tuned to a single orientation (in the case of Fig. 7 to θ = 90). While this is certainly an oversimplification, it allows us to understand how additive/subtractive and multiplicative/divisive changes in the tuning curves come about in networks with different connectivity motifs. Modelling heterogeneity of tuning responses within a network requires more complex models. A natural choice would be to extend a classical ring attractor model (Rubin et al., 2015) by splitting the inhibitory population into PV and SOM neurons, or to study the tuning curve heterogeneity that occurs in balanced networks (Hansel and van Vreeswijk 2012). However, such models have many more parameters, like the spatial connectivity profiles from and onto PV and SOM neurons. While highly valuable, we believe that studying such models exceeds the scope of our current manuscript. We have now added a paragraph to the discussion section, mentioning this as an interesting future direction.
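
      One simple way to separate additive/subtractive from multiplicative/divisive tuning curve changes is a linear fit of the modulated curve against the baseline curve. The sketch below is an illustration only and is not claimed to be the method used in the manuscript. The Gaussian tuning curve, its parameters, and the function names are hypothetical.

```python
import numpy as np


def gaussian_tuning(theta_deg, preferred=90.0, width=20.0, amplitude=10.0, floor=1.0):
    """Hypothetical Gaussian tuning curve (rates in Hz) over orientation in degrees."""
    theta_deg = np.asarray(theta_deg, dtype=float)
    return floor + amplitude * np.exp(-0.5 * ((theta_deg - preferred) / width) ** 2)


def fit_tuning_change(baseline, modulated):
    """Least-squares fit modulated ~ slope * baseline + offset.

    slope > 1 indicates a multiplicative change, slope < 1 a divisive one;
    offset > 0 indicates an additive change, offset < 0 a subtractive one.
    """
    slope, offset = np.polyfit(np.asarray(baseline, dtype=float),
                               np.asarray(modulated, dtype=float), deg=1)
    return float(slope), float(offset)


# Synthetic example: a modulation that is both multiplicative and additive.
theta = np.linspace(0.0, 180.0, 19)
baseline = gaussian_tuning(theta)
modulated = 1.5 * baseline + 2.0

slope, offset = fit_tuning_change(baseline, modulated)
print(f"slope = {slope:.2f}, offset = {offset:.2f}")
```

      In a heterogeneous network, the same fit could be applied cell by cell, which is where the distribution of slopes and offsets across cells would become informative.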

      Reviewer #1 (Recommendations For The Authors):

      The last sentence of the abstract is hard to interpret before reading the rest of the paper - suggest replacing or rephrasing.

      We rephrased the sentence to make our meaning clearer.

      Page 3, last full paragraph: I think this assumes that phi is positive. What is the justification for that assumption? More generally, I think you could say a bit more about phi in the main text since it is a fairly complicated term.

      The reviewer is correct: for a stable system, phi is always positive. We now clarify this and explain phi in more detail in the main text.

      Fig 1D: It would be helpful to identify when the stimulus comes on and be clearer about what the stimulus is. I assume it’s a step increase in S input at 0.05 s or so - but that should be immediately apparent looking at the figure.

      We agree with the reviewer and we added a dashed line at the time of stimulus onset in Fig. 1D.

      Page 5: ”To motivate our analysis we compare ... (Fig. 2A)” - Figure 2A does not show responses without modulation, so this sentence is confusing.

      The dashed lines in Fig. 2A (and Fig. 2C) actually represent the rate change without modulation.

      Page 6: sentence “The central goal of our study ...” seems out of place since this is pretty far into the results, and that goal should already be clear.

      We agree with the reviewer, hence we updated the sentence.

      Page 10, top: the green curve in panel Aii always has a negative slope - so I am confused by the statement that increasing wSE decreases both gain and stability.

      We thank the reviewer for pointing out this mistake. We have now fixed it in the text.

      Figure 6: in general it is hard to see what is going on in this figure (the green and blue in particular are hard to distinguish). Some additional labels would be helpful, but I would also see if the color scheme can be improved.

      We added a zoom-in to the panels which were hard to distinguish.

      Reviewer #2 (Recommendations For The Authors):

      Major recommendations:

      (1) The authors should explain early on in the results section what the key factor(s) is that differentiates SOM from PV cells in their model. E.g., in Fig. 1A, the only obvious difference is that SOM cells don’t inhibit themselves. However, later on in the paper, the difference in external stimulus drive to these interneuron classes is more heavily emphasized. Given the importance of that difference (in external stim drive), I think this should be highlighted early on.

      We now mention the key factors that differentiate PV and SOM neurons already when describing Fig. 1A.

      (2) The result in Figs. 5,6 demonstrate that recurrent SOM connectivity is important for achieving increases in both gain and stability. This observation could benefit from some intuitive explanation. Perhaps the authors could find this explanation by looking at their series expansion (Eqs. 11-14, Fig. 1C) and determining which term(s) are most important for this effect. The corresponding paths through the circuit – the most important ones – could then be highlighted for the reader.

      We agree with the reviewer that our results benefit from more intuitive explanations. This has also been pointed out by reviewer 1 in their public review. We have now extended the concluding paragraphs in the context of Fig. 4-6 with additional information, providing a more intuitive understanding of the results presented in the respective sections. While it is possible to gain an intuitive understanding of how the network gain depends on rate and weight parameters (Eq. 2), this understanding is unfortunately missing in the case of stability. The maximum eigenvalue of the system has a complex relationship with all the parameters, and often depends nonlinearly on changes of a parameter (e.g. as we show in Fig. 3iv or as one can see in Fig. 6). We now discuss this difficulty at the end of the section “Influence of weight strength on network gain vs stability”.

      (3) I think the authors should consider including some analyses that do not rely on the system being at or near a fixed point. I admit that such analysis could be difficult, and this could of course be done in a future study. Nevertheless, I want to reiterate that this addition could add a lot of value to this body of work.

      As outlined above, we decided to not include additional analysis on network behaviour in nonlinear regimes but we now acknowledge in the discussion of our manuscript that the linearization approach is a limitation in our study and that it would be an interesting future direction to investigate chaotic dynamics.

      Minor recommendations:

      (1) At the top of P. 6, when the authors first discuss the stability criterion involving eigenvalues, they should address the question ”eigenvalues of what?”. I suggest introducing the idea of the Jacobian matrix, and explaining that the largest eigenvalue of that matrix determines how rapidly the system will return to the fixed point after a small perturbation.

      We included an additional sentence in the respective paragraph explaining the link between stability and negative eigenvalues, and we also added a sentence in the Methods section stating that the largest real eigenvalue dominates the behavior of the dynamical system.

      (2) The panel labelling in Fig. 3 is unnecessarily confusing. It would be simpler (and thus better) to simply label the panels A,B,C,D, or i,ii,iii,iv, instead of the current labelling: Ai, Aii, Aiii, Aiv. (There are currently no panels ”B” in Fig. 3).

      We updated the figure accordingly.

      Reviewer #3 (Recommendations For The Authors):

      • Suggestions for improved or additional experiments, data or analyses.

      Analysis of the effect of different nonlinear transfer functions is necessary.

      Please see our detailed answer to the reviewer’s comment in the public review above.

      Analysis of gain modulation in models with more realistic tuning properties.

      Please see our detailed answer to the reviewer’s comment in the public review above.

      Mathematical analysis of the conditions to obtain ”untangled” gain and stability:

      One of the promises of the paper is that it is offering a computational framework or circuit theory for understanding the effect of SOM perturbation. However, the main result, namely the untangling of gain and stability, has only been reported in numerical simulations (e.g. Fig. 6). Different parameters have been changed and the results of simulations have been reported for different conditions. Given the simplified model, which allows for rigorous mathematical analysis, isn’t it possible to treat this phenomenon more analytically? What would be the conditions for the emergence of the untangled regime? This is currently missing from the analyses and results.

      We agree with the reviewer that our results benefit from more intuitive explanations. This has also been pointed out by reviewer 1 in their public review. We have now extended the concluding paragraphs in the context of Fig. 4-6 with additional information, providing a more intuitive understanding of the results presented in the respective sections. While it is possible to understand analytically how the network gain depends on rate and weight parameters (Eq. 2), this understanding is unfortunately missing in the case of stability. The maximum eigenvalue of the system has a complex relationship with all the parameters, and often depends nonlinearly on changes of a parameter (e.g. as we show in Fig. 3iv or as one can see in Fig. 6). This doesn’t allow for a deep analytical understanding of the entangling of gain and stability. We now discuss this difficulty at the end of the section “Influence of weight strength on network gain vs stability”.

      • Recommendations for improving the writing and presentation. The Results section is well written overall, but other parts, especially the Introduction and Discussion, would benefit from proof reading - there are many typos and problems with sentence structures and wording (some mentioned below).

      We have gone through the manuscript again and improved the writing.

      The presentation of the dependence on weight in Figure 6 can be improved. For instance, the authors talk about the optimal range of PV connectivity, but this is difficult to appreciate in the current illustration and with the current colour scheme.

      We added a zoom-in to the panels which were hard to distinguish.

      • Minor corrections to the text and figures. Text:

      We thank the reviewer for their thorough reading of our manuscript. We fixed all the issues listed below in the manuscript.

      Some examples of bad structure or wording:

      From the Abstract:

      “We show when E - PV networks recurrently connect with SOM neurons then an SOM mediated modulation that leads to increased neuronal gain can also yield increased network stability.”

      From Introduction:

      Sentence starting with “This new circuit reality ...”

      “Inhibition is been long identified as a physiological or circuit basis for how cortical activity changes depending upon processing or cognitive needs ...”

      Sentence starting with “Cortical models with both ...”

      “... allowing SOM neurons the freedom to ..”

      From Results:

      “... affects of SOM neurons on E ..”

      “seem in opposition to one another, with SOM neuron activity providing either a source or a relief of E neuron suppression”. The sentence after is also difficult to read and needs to be simplified.

      P. 7: “We first remark that ...”

      Difficult to read/understand - long and badly structured sentence.

      P. 8: “adding a recurrent connection onto SOM neurons from the E-PV subcircuit” It’s from E (and not PV) to be more precise (Fig. 5).

      Discussion:

      “Firstly, E neurons and PV neurons experience very similar synaptic environments.” What does it mean?

      “Fortunately, PV neurons target both the cell bodies and proximal dendrites” Fortunately for whom or what?

      “in line with arge heterogeneity”

      Methods:

      Matrix B is never defined - the diagonal matrix of b (power law exponents) I assume.

      Some of the other notations too, e.g. bs, etc (it’s implicit, but should be explained).

      Structure of sentence:

      ”Network gain is defined as ...” (p. 17)

      Figure:

      The schematics in Figure 4 can be tweaked to highlight the effect of input (rather than other components of the network, which are the same and repetitive), to highlight the main difference for the reader.

    1. eLife Assessment

      This useful manuscript presents findings on Tom1p's interaction with Spt6p and its role in chromatin dynamics, supported by structural analysis through CryoEM. The evidence for the conclusions is currently incomplete, lacking key experiments including continuation in vivo interaction and orthogonal binding assays (e.g., SPR, MST, ITC) to fully support the proposed mechanism. While the results are promising, further validation is needed to strengthen the evidence and improve the manuscript's overall cohesion.

    2. Reviewer #1 (Public review):

      Summary:

      In this preprint, Madrigal et al present "Tom1p ubiquitin ligase structure, interaction with Spt6p, and function in maintaining normal transcript levels and the stability of chromatin in promoters" which describes the identification of Tom1p, a conserved ubiquitin ligase, as a potential binding partner for the transcription elongation/histone chaperone Spt6p, and reveal the Tom1p structure as determined by CryoEM. Tom1p is a homolog of human HUWE1, which has been implicated in decay for a variety of basic protein substrates such as ribosomal proteins and histones. Structure-function analyses identify regions required for Spt6p interaction, suggesting that the interaction with Spt6p is phosphorylation dependent, and for interactions with histones, the latter of which confers phenotypes in vivo when mutated, suggesting that the Tom1p acidic region is important for its function. What is less clear is the function or interaction with Spt6p. The manuscript speculates that Spt6p-Tom1p interactions may tune Tom1p localization, and it is shown that Tom1p is recruited to transcribed genes by chromatin IP. In addition, the Tom1p structure will be valuable to those trying to understand the mechanisms of this very large ubiquitin ligase. Here, structures of homologs from other organisms have already been described elsewhere, however, the authors here indicate some details potentially not previously visualized in other structures.

      Strengths:

      It has not previously been known that the Spt6p tSH2 had any additional targets. Interaction with a ubiquitin ligase already implicated in histone turnover is interesting, given Spt6p's role as a histone chaperone. A structure of Tom1p also provides insight into this very large, conserved protein, and structure-function analysis in a model system is a good start towards mechanistic dissection.

      Weaknesses:

      Some aspects of the manuscript seem less cohesive in that the manuscript has two halves, and neither quite solidifies insights into the Spt6p relationship to Tom1p or deepens our understanding of the Tom1p mechanism extensively, though the results are a great start on both sides of the paper. Several points are less clear, in that it is not known whether Spt6p interacts with Tom1p, or in what context. The interaction surface of Spt6p able to interact with Tom1p is the identical tSH2 that would be predicted to be occupied by phosphorylated RNAPII when Spt6p is incorporated into the RNAPII elongation complex. This means that how and when Spt6p might be available to interact with Tom1p is not clear. Previous work from the Hill and Formosa groups on the tSH2 domain and its RNAPII linker target has suggested that phenotypes of mutants in the two are similar, suggesting that their main function is to interact with each other. A simple test of examining Tom1p interaction with genes in the tSH2 mutant was not done. Additionally, the Spt6p interacting surface on Tom1p is not narrowed to a specific putatively phosphorylated residue that it might target. It remains possible that mutations in other regions of Tom1p affect potential phosphorylation of this target, and therefore it is possible that some mutations that alter Spt6p interaction could do so indirectly. Finally, the authors might consider additional models for their discussion in which Spt6p could potentially function to deliver histones to Tom1p.

    3. Reviewer #2 (Public review):

      Summary:

      Madrigal et al identified Tom1, an E3 ubiquitin ligase previously known to be involved in ribosome biogenesis, as a protein that binds to the terminal tandem Src-homology 2 (tSH2) domain of Spt6. They mapped this interaction to the acidic region of Tom1, which is also known to interact with histones. Cells with tom1 mutants that cannot bind Spt6 did not show the temperature-sensitive phenotypes displayed by the tom1 null mutant. Using ChIP assays, they showed that Tom1 is enriched at gene bodies of highly transcribed genes, and a loss of tom1 leads to reduced nucleosomal changes at gene promoters. Finally, they also solved the structures of Tom1 lacking the acidic region and found that Tom1p can adopt a compact α-solenoidal "basket" similar to the previously described structure of HUWE1. Overall, this is an interesting study and I have the following suggestions to improve the manuscript.

      Major concerns:

      (1) Promoter regions are in general nucleosome free. How does Tom1 mutant affect nucleosome-sized fragments at the promoter regions?

      (2) While Tom1 antibodies may not be specific, could the authors perform Tom1 ChIP-seq in wild type and tom1 null cells? This dataset may be more informative than tagged Tom1 that may not be functional.

    4. Reviewer #3 (Public review):

      Summary:

      The authors report a novel, direct interaction of the Spt6p tSH2 domain with Tom1p. This extends the function of Spt6p from communication with factors associated with RNAPII transcription to processes of ubiquitination. Tom1p is known to ubiquitinate a large variety of substrates, but it is unknown how substrate recognition is done in a specific manner. The team identified a conserved central acidic region of Tom1p which is essential for in vivo functions and binds to histones and nucleosomes, as well as Spt6p. They further describe the Tom1p occupancy pattern on chromatin, assigning it a stabilizing effect on nucleosomes near promoters and a destabilizing effect on nucleosomes within the gene bodies. The authors were able to resolve two different conformational states of Tom1p which are likely connected to its activity, and possibly substrate selectivity.

      Overall, the authors show that an intrinsically disordered region in Tom1p is important for substrate interaction and function of Tom1p. The protein is further involved in chromatin architecture and structural transitions control its activity.

      Strengths:

      By revealing the interaction of Spt6p and Tom1p, the authors discover a novel connection between transcriptional elongation and processes of ubiquitination.

      In recent years, disordered regions of MDa protein complexes have become a focus of research projects. The effects of disordered regions on protein localization and specificity of binding interactions have been discussed to a great extent, including for proteins that are involved in chromatin remodeling and transcription. Adding to these current efforts, the authors assign a function to a highly conserved disordered region of Tom1p in technically clean experiments. Furthermore, with their data, they pin down a specific functional region in Tom1p which is relevant for the previously observed temperature sensitivity caused by Tom1p deletion in yeast.

      The team performs a thorough and complete analysis of the cryo-EM structure and they nicely model the hinge motion and details of an open and closed conformation.

      Weaknesses:

      Despite the high number of interesting findings, there is little connection between the individual sections of the manuscript. For example, many experiments are not related to Spt6p binding although this protein is presented as a major actor in this manuscript during the introduction. Furthermore, the structural analysis is well done, but it is also not quite clear how structural rearrangements are connected to Spt6 binding or chromatin remodeling. Some experimental results lack novelty, as similar data has previously been presented for the human homolog.

      No orthogonal binding assays (SPR, MST, ITC) have been performed to confirm the novel, direct binding interaction of Spt6p and Tom1p. To me, this is insufficient, especially since the team has purified both proteins to high quality levels, or could use peptides to test the function of the relevant regions.

      Additionally, interaction of Tom1p with Spt6p in the context of transcription elongation is proposed. Yet it is not clear on the mechanistic level how this is regulated if Tom1p and Rpb1p bind in a competitive manner. How is Tom1p tethered to the elongation complex if not through Spt6p? In addition to WT vs. knockout, the authors should further perform the genetic analyses with the intΔ11 mutant. This way they might be able to pin down which interactions on chromatin are mediated by Spt6p vs. by other factors and could strengthen the overall model involving Spt6p.

      Although the authors try to describe a final model in the discussion, this section is not easy to follow and needs more explanation, ideally drawn as a Figure of the proposed mechanism.

    5. Author response:

      We thank the reviewers for their helpful comments. We will make several minor edits to the text to improve clarity. Further experiments are beyond the scope of the current study.

    1. eLife Assessment

      By combining the 'pinging' technique with fMRI-based multivariate decoding, this important study examined the nature of the representation of the attentional template during preparation. While the findings are very interesting and the experimental evidence is solid, the methodological (e.g., the manipulation of attention, the potential cross-contamination between attention and working memory, and the representational distance analysis) and interpretation confounds (e.g., more thorough clarification of "pinging" and dual-format attentional templates) need to be addressed. The work will be of interest to researchers in psychology, cognitive science, and neuroscience.

    2. Reviewer #1 (Public review):

      Summary:

      The aim of the experiment reported in this paper is to examine the nature of the representation of a template of an upcoming target. To this end, participants were presented with compound gratings (consisting of lines tilted to the right and lines tilted to the left) and were cued to a particular orientation - red left tilt or blue right tilt (counterbalanced across participants). There are two directly compared conditions: (i) a no-ping condition, in which a cue was followed by a 5.5-7.5s delay and then by a target grating in which the cued orientation deviated from the standard 45 degrees; and (ii) a ping condition, in which all aspects were the same except that a ping (a visual impulse presented for 100ms) was presented 2.5 seconds after the cue. There was also a perception task in which only the lines tilted 45 degrees to the right or to the left were presented. It was observed that only in the ping condition were the authors able to decode the orientation of the to-be-reported target during the delay using cross-task generalization. Attention, on the other hand, could be decoded in both ping and no-ping conditions. It is concluded that the visual system has two different functional states associated with a template during preparation: a predominantly non-sensory representation for guidance and a latent sensory-like representation for prospective stimulus processing.

      Strengths:

      There is so much to be impressed with in this report. The writing of the manuscript is incredibly clear. The experimental design is clever and innovative. The analysis is sophisticated and also innovative - the cross-task decoding, the use of Mahalanobis distance as a function of representational similarity, the fact that the question is theoretically interesting, and the excellent figures.

      Weaknesses:

      While I think that this is an interesting study that addresses an important theoretical question, I have several concerns about the experimental paradigm and the subsequent conclusions that can be drawn.

      (1) Why was V1 separated from the rest of the visual cortex, and why were the rest of the areas simply lumped into an EVC ROI? It would be helpful to understand the separation into ROIs.

      (2) It would have been helpful to have a behavioral measure of the "attended" orientation to show that participants in fact attended to a particular orientation and were faster in the cued condition. The cue here was 100% valid, so no such behavioral measure of attention is available here.

      (3) As I was reading the manuscript I kept thinking that the word attention in this manuscript can be easily replaced with visual working memory. Have the authors considered what it is about their task or cognitive demand that makes this investigation about attention or working memory?

      (4) If I understand correctly, the only ROI that showed a significant difference for the cross-task generalization is V1. Was it predicted that only V1 would have two functional states? It should also be made clear that the only region where the two states differ is V1.

      (5) My primary concern about the interpretation of the finding is that the result, differences in cross-task decoding within V1 between the ping and no-ping condition might simply be explained by the fact that the ping condition refocuses attention during the long delay thus "resharpening" the template. In the no-ping condition during the 5.5 to 7.5 seconds long delay, attention for orientation might start getting less "crisp." In the ping condition, however, the ping itself might simply serve to refocus attention. So, the result is not showing the difference between the latent and non-latent stages, rather it is the difference between a decaying template representation and a representation during the refocused attentional state. It is important to address this point. Would a simple tone during the delay do the same? If so, the interpretation of the results will be different.

      (6) The neural pattern distances measured using Mahalanobis values are really great! Have the authors tried to use all of the data, rather than the high AMI and low AMI to possibly show a linear relationship between response times and AMI?

      (7) After reading the whole manuscript I still don't understand what the authors think the ping is actually doing, mechanistically. I would have liked a more thorough discussion, rather than referencing previous papers (all by the co-author).

    3. Reviewer #2 (Public review):

      Summary:

      In the present study, the authors investigated the nature of attentional templates during the preparatory period of goal-directed attention. By combining the use of 'pinging' the neural activity with a visual impulse and fMRI-based multivariate decoding, the authors found that the nature of the neural representations of the prospective feature target during the preparatory period was contingent on the presence of the 'pinging' impulse. While the preparatory representations contained highly similar information content as the perceptual representations when the pinging impulse was introduced, they fundamentally differed from perceptual representations in the absence of the pinging impulse. Based on these findings, the authors proposed a dual-format mechanism in which both a "non-sensory" template and a latent "sensory" template coexisted during attentional preparation. The former actively guides activity in the preparatory state, and the latter is utilized for future stimulus processing.

      Strengths:

      Overall, I think this is an interesting study that introduced a novel perspective concerning the nature of neural representations during attentional processing. Methodologically, the present study combines an innovative utilization of the pinging technique in working memory studies and fMRI-based multivariate pattern analysis. The method is sound and the results are convincing. While I appreciate the conceptual elegance of the dual-format idea proposed by the authors, there are several questions that need to be addressed more thoroughly to clarify some of the potential ambiguities of the results and to increase the plausibility of the authors' theory.

      Weaknesses:

      (1) The origin of the latent sensory-like representation. By 'pinging' the neural activity with a high-contrast, task-irrelevant visual stimulus during the preparation period, the authors identified the representation of the attentional feature target that contains the same information as perceptual representations. The authors interpreted this finding as evidence that a 'sensory-like' template is inherently hosted in a latent form in the visual system and is revealed by the pinging impulse. However, I am not sure whether such a sensory-like template is essentially created, rather than revealed, by the pinging impulses. First, in the classical employment of the pinging technique in working memory studies, the (latent) representation of the memoranda during the maintenance period is undisputed because participants could not have performed well in the subsequent memory test otherwise. However, this appears not to be the case in the present study. As shown in Figure 1C, there was no significant difference in behavioral performance between the ping and the no-ping sessions (see also lines 110-125, pg. 5-6). In other words, it seems to me that the subsequent attentional task performance does not necessarily rely on the generation of such sensory-like representations in the preparatory period and that the emergence of such sensory-like representations does not facilitate subsequent attentional performance either. In such a case, one might wonder whether such sensory-like templates are really created, hosted, and eventually utilized during the attentional process. Second, because the reference orientations (i.e. 45 degrees and 135 degrees) have remained unchanged throughout the experiment, it is highly possible that participants implicitly memorized these two orientations as they completed more and more trials. In such a case, one might wonder whether the 'sensory-like' templates are essentially latent working memory representations activated by the pinging, as was reported in Wolff et al. (2017), rather than a functional signature of the attentional process.

      (2) The coexistence of the two types of attentional templates. The authors interpreted their findings as the outcome of a dual-format mechanism in which 'a non-sensory template' and a latent 'sensory-like' template coexist (e.g. lines 103-106, pg. 5). While I find this interpretation interesting and conceptually elegant, I am not sure whether it is appropriate to term it 'coexistence'. First, it is theoretically possible that there is only one representation in either session (i.e. a non-sensory template in the no-ping session and a sensory-like template in the ping session) in any of the brain regions considered. Second, it seems that there is no direct evidence concerning the temporal relationship between these two types of templates, provided that they commonly emerge in both sessions. Besides, due to the sluggish nature of fMRI data, it is difficult to tell whether the two types of templates temporally overlap.

      (3) The representational distance. The authors used Mahalanobis distance to quantify the similarity of neural representation between different conditions. According to the authors' hypothesis, one would expect greater pattern similarity between 'attend leftward' and 'perceived leftward' in the ping session in comparison to the no-ping session. However, this appears not to be the case. As shown in Figures 3B and C, there was no major difference in Mahalanobis distance between the two sessions in either ROI and the authors did not report a significant main effect of the session in any of the ANOVAs. Besides, in all the ANOVAs, the authors reported only the statistic term corresponding to the interaction effect without showing the descriptive statistics related to the interaction effect. It is strongly advised that these descriptive statistics related to the interaction effect should be included to facilitate a more effective and intuitive understanding of their data.
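      For readers less familiar with the metric discussed in point (3), the following is a minimal sketch of a Mahalanobis distance between two activity patterns. It is not the authors' pipeline: the two-voxel patterns, the noise covariance, and the function name `mahalanobis_2d` are all hypothetical, chosen only to show how the distance scales pattern differences by the noise covariance.

```python
from math import sqrt

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between two 2-D patterns given a 2x2 covariance.

    d(x, y) = sqrt((x - y)^T * inv(cov) * (x - y))
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx0, dx1 = x[0] - y[0], x[1] - y[1]
    quad = (dx0 * (inv[0][0] * dx0 + inv[0][1] * dx1)
            + dx1 * (inv[1][0] * dx0 + inv[1][1] * dx1))
    return sqrt(quad)

# Hypothetical two-voxel mean patterns (illustrative values only).
attend_left = (1.0, 0.2)
perceive_left = (1.2, 0.1)
perceive_right = (0.1, 1.1)
noise_cov = [[0.5, 0.1], [0.1, 0.5]]   # assumed noise covariance

d_same = mahalanobis_2d(attend_left, perceive_left, noise_cov)
d_diff = mahalanobis_2d(attend_left, perceive_right, noise_cov)
print(f"attend-left vs perceive-left:  {d_same:.3f}")
print(f"attend-left vs perceive-right: {d_diff:.3f}")
```

      Under the authors' hypothesis, one would expect the first distance to shrink in the ping session relative to the no-ping session, which is the comparison this point asks to see reported descriptively.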

    4. Reviewer #3 (Public review):

      This paper discusses how non-sensory and latent, sensory-like attentional templates are represented during attentional preparation. Using multivariate pattern analysis, they found that visual impulses can enhance the decoding generalization from perception to attention tasks in the preparatory stage in the visual cortex. Furthermore, the emergence of the sensory-like template coincided with enhanced information connectivity between V1 and frontoparietal areas and was associated with improved behavioral performance. It is an interesting paper with supporting evidence for the latent, sensory-like attentional template, but several problems still need to be solved.

      (1) The title is "Dual-format Attentional Template," yet the supporting evidence for the non-sensory format and its guiding function is quite weak. The author could consider conducting further generalization analysis from stimulus selection to preparation stages to explore whether additional information emerges.

      (2) In Figure 2, the author did not find any decodable sensory-like coding in IPS and PFC, even during the impulse-driven session, indicating that these regions do not represent sensory-like information. However, in the final section, the author claimed that the impulse-driven sensory-like template strengthens informational connectivity between sensory and frontoparietal areas. This raises a question: how can we reconcile the lack of decodable coding in these frontoparietal regions with the reported enhancement in network communication? It would be helpful if the author provided a clearer explanation or additional evidence to bridge this gap.

      (3) Given that the impulse-driven sensory-like template facilitated behavior, the author proposed that it might also enhance network communication. Indeed, they observed changes in informational connectivity. However, it remains unclear whether these changes in network communication have a direct and robust relationship with behavioral improvements.

      (4) I'm uncertain about the definition of the sensory-like template in this paper. Is it referring to the Ping impulse-driven condition or the decodable performance in the early visual cortex? If it is the former, even in working memory, whether pinging identifies an activity-silent mechanism is currently debated. If it's the latter, the authors should consider whether a causal relationship - such as "activating the sensory-like template strengthens the informational connectivity between sensory and frontoparietal areas" - is reasonable.

    1. eLife Assessment

      What makes one member of the species behave differently from another? This is a core problem in behavioral neuroscience. This valuable study seeks an answer for the specific case of the fruit fly expressing preferences for one odor over another. By a combination of behavioral measurements, neurophysiology, and network modeling, the authors find solid evidence for at least one locus of individuality in the peripheral olfactory system.

    2. Joint Public Review:

      Summary:

      The authors aimed to identify the neural sources of behavioral variation in fruit flies deciding between odor and air, or between two odors.

      Strengths:

      - The question is of fundamental importance.

      - The behavioral studies are automated, and high-throughput.

      - The data analyses are sophisticated and appropriate.

      - The paper is clear and well-written aside from some initially strong wording.

      - The figures beautifully illustrate their results.

      - The modeling efforts mechanistically ground observed data correlations.

      Weaknesses:

      -The correlations between behavioral variations and neural activity/synapse morphology are relatively weak, and sometimes overstated in the wording that describes them.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors seek to establish what aspects of nervous system structure and function may explain behavioral differences across individual fruit flies. The behavior in question is a preference for one odor or another in a choice assay. The variables related to neural function are odor responses in olfactory receptor neurons or in the second-order projection neurons, measured via calcium imaging. A different variable related to neural structure is the density of a presynaptic protein BRP. The authors measure these variables in the same fly along with the behavioral bias in the odor assays. Then they look for correlations across flies between the structure-function data and the behavior.

      Strengths:

      Where behavioral biases originate is a question of fundamental interest in the field. In an earlier paper (Honegger 2019) this group showed that flies do vary with regard to odor preference, and that there exists neural variation in olfactory circuits, but did not connect the two in the same animal. Here they do, which is a categorical advance, and opens the door to establishing a correlation. The authors inspect many such possible correlations. The underlying experiments reflect a great deal of work, and appear to be done carefully. The reporting is clear and transparent: All the data underlying the conclusions are shown, and associated code is available online.

      We are glad to hear the reviewer is supportive of the general question and approach.

      Weaknesses:

      The results are overstated. The correlations reported here are uniformly small, and don't inspire confidence that there is any causal connection. The main problems are

      Our revision overhauls the interpretation of the results to prioritize the results we have high confidence in (specifically, PC 2 of our Ca++ data as a predictor of OCT-MCH preference) versus results that are suggestive but not definitive (such as PC 1 of Ca++ data as a predictor of Air-OCT preference).

      It’s true that the correlations are small, with R² values typically in the 0.1-0.2 range. That said, we would call it a victory if we could explain 10 to 20% of the variance of a behavior measure, captured in a 3-minute experiment, with a circuit correlate. This is particularly true because, as the reviewer notes, the behavioral measurement is noisy.

      (1) The target effect to be explained is itself very weak. Odor preference of a given fly varies considerably across time. The systematic bias distinguishing one fly from another is small compared to the variability. Because the neural measurements are by necessity separated in time from the behavior, this noise places serious limits on any correlation between the two.

      This is broadly correct, though to quibble, it’s our measurement of odor preference which varies considerably over time. We are reasonably confident that more variance in our measurements can be attributed to sampling error than to changes in true preference over time. As evidence, the correlations in sequential measures of individual odor preference, with delays of 3 hours or 24 hours, are not obviously different. We are separately working on methodological improvements to get more precise estimates of persistent individual odor preference, using averages of multiple, spaced measurements. This is promising, but beyond the scope of this study.

      (2) The correlations reported here are uniformly weak and not robust. In several of the key figures, the elimination of one or two outlier flies completely abolishes the relationship. The confidence bounds on the claimed correlations are very broad. These uncertainties propagate to undermine the eventual claims for a correspondence between neural and behavioral measures.

      We are broadly receptive to this criticism. The lack of robustness of some results comes from the fundamental challenge of this work: measuring behavior is noisy at the individual level. Measuring Ca++ is also somewhat noisy. Correlating the two will be underpowered unless the sample size is huge (which is impractical, as each data point requires a dissection and live imaging session) or the effect size is large (which is generally not the case in biology). In the current version we tried in some sense to avoid discussing these challenges head-on, instead trying to focus on what we thought were the conclusions justified by our experiments with sample sizes ranging from 20 to 60. Our revision is more candid about these challenges.

      That said, we believe the result we view as the most exciting, that PC2 of Ca++ responses predicts OCT-MCH preference, is robust. 1) It is based on a training set with 47 individuals and a test set composed of 22 individuals. The p-value is sufficiently low in each of these sets (0.0063 and 0.0069, respectively) to pass an overly stringent Bonferroni correction for the 5 tests (one per PC) in this analysis. 2) The BRP immunohistochemistry provides independent evidence that is consistent with this result: a PC2 that predicts behavior (p = 0.03 from only one test) and has loadings that contrast DC2 and DM2. Taken together, these results are well above the field-standard bar of statistical robustness.
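      The arithmetic behind the Bonferroni claim can be checked with a short sketch. The p-values and the number of tests are taken from the paragraph above; the family-wise alpha of 0.05 is the conventional choice and is an assumption of this sketch, as are the function names.

```python
# Nominal p-values quoted in the response for the PC2 -> OCT-MCH model.
p_values = {"training set (n=47)": 0.0063, "test set (n=22)": 0.0069}

n_tests = 5    # one test per principal component
alpha = 0.05   # family-wise error rate (assumed conventional value)

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def passes_bonferroni(p: float, alpha: float, n_tests: int) -> bool:
    """True if a nominal p-value survives the corrected threshold."""
    return p < bonferroni_threshold(alpha, n_tests)

threshold = bonferroni_threshold(alpha, n_tests)
results = {name: passes_bonferroni(p, alpha, n_tests)
           for name, p in p_values.items()}

for name, ok in results.items():
    verdict = "passes" if ok else "fails"
    print(f"{name}: p = {p_values[name]} vs threshold {threshold:.3f} -> {verdict}")
```

      Both nominal p-values fall below the corrected per-test threshold of 0.05 / 5 = 0.01, which is the sense in which the result survives the correction.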

      In our revision, we are explicit that this is the (one) result we have high confidence in. We believe this result convincingly links Ca++ and behavior, and warrants spotlighting. We have less confidence in other results, and say so, and we hope this addresses concerns about overstating our results.

      (3) Some aspects of the statistical treatment are unusual. Typically a model is proposed for the relationship between neuronal signals and behavior, and the model predictions are correlated with the actual behavioral data. The normal practice is to train the model on part of the data and test it on another part. But here the training set at times includes the testing set, which tends to give high correlations from overfitting. Other times the testing set gives much higher correlations than the training set, and then the results from the testing set are reported. Where the authors explored many possible relationships, it is unclear whether the significance tests account for the many tested hypotheses. The main text quotes the key results without confidence limits.

      Our primary analyses are exactly what the reviewer describes, scatter plots and correlations of actual behavioral measures against predicted measures. We produced test data in separate experiments, conducted weeks to months after models were fit on training data. This is more rigorous than splitting into training and test sets data collected in a single session, as batch/environmental effects reduce the independence of data collected within a single session.

      We only collected a test set when our training set produced a promising correlation between predicted and actual behavioral measures. We never used data from test sets to train models. In our main figures, we showed scatter plots that combined test and training data, as the training and test partitions had similar correlations.
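      The train-then-test procedure described above can be sketched as follows. This is an illustration under stated assumptions, not our analysis code: the numbers are synthetic, the single-predictor least-squares fit stands in for whatever model is used, and `fit_line` and `pearson_r` are hypothetical helper names.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

# Synthetic values for illustration only (not measured data).
train_pc2 = [-1.5, -0.8, -0.3, 0.1, 0.6, 1.2, 1.9]
train_pref = [-0.20, -0.05, -0.10, 0.05, 0.02, 0.15, 0.22]
test_pc2 = [-1.0, -0.2, 0.4, 1.1]          # collected in a later experiment
test_pref = [-0.12, 0.03, 0.01, 0.18]

# 1) Fit the model on training data only.
slope, intercept = fit_line(train_pc2, train_pref)

# 2) Predict held-out flies without refitting.
test_pred = [slope * x + intercept for x in test_pc2]

# 3) Score predictions against measured behavior in the test set.
r_test = pearson_r(test_pred, test_pref)
print(f"slope = {slope:.3f}, test-set r = {r_test:.3f}")
```

      The key design point is that the test data never enter step 1, so the test-set correlation is not inflated by overfitting.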

      We are unsure what the reviewer means by instances where we explored many possible relationships. The greatest number of comparisons that could lead to the rejection of a null hypothesis was 5 (corresponding to the top 5 PCs of Ca++ response variation or Brp signal). We were explicit that the p-values reported were nominal. As mentioned above, applying a Bonferroni correction for n=5 comparisons to either the training or test correlations from the Ca++ to OCT-MCH preference model remains significant at alpha=0.05.

      Our revision includes confidence intervals around ρ_signal for the PN PC2 OCT-MCH model, and for the ORN Brp-Short PC2 OCT-MCH model (lines 170-172, 238).

      Reviewer #2 (Public Review):

      Summary:

      The authors aimed to identify the neural sources of behavioral variation in a decision between odor and air, or between two odors.

      Strengths:

      -The question is of fundamental importance.

      -The behavioral studies are automated, and high-throughput.

      -The data analyses are sophisticated and appropriate.

      -The paper is clear and well-written aside from some strong wording.

      -The figures beautifully illustrate their results.

      -The modeling efforts mechanistically ground observed data correlations.

      We are glad to read that the reviewer sees these strengths in the study. We hope the current revision addresses the strong wording.

      Weaknesses:

      -The correlations between behavioral variations and neural activity/synapse morphology are (i) relatively weak, (ii) framed using the inappropriate words "predict", "link", and "explain", and (iii) sometimes non-intuitive (e.g., PC 1 of neural activity).

      Taking each of these points in turn:

      i) It would indeed be nicer if our empirical correlations were higher. One quibble: we primarily report relatively weak correlations between measurements of behavior and Ca++/Brp. This could be the case even when the correlation between true behavior and Ca++/Brp is higher. Our analysis of the potential correlation between latent behavioral and Ca++ signals was an attempt to tease these relationships apart. The analysis suggests that there could, in fact, be a high underlying correlation between behavior and these circuit features (though the error bars on these inferences are wide).

      ii) We worked to ensure such words are used appropriately. “Predict” can often be appropriate in this context, as a model predicts true data values. “Explain” can also be appropriate, as X “explaining” a portion of the variance of Y is synonymous with X and Y being correlated. We cannot think of formal uses of “link,” and have revised the manuscript to resolve any inappropriate word choice.

      iii) If the underlying biology is rooted in non-intuitive relationships, there’s unfortunately not much we can do about it. We chose to use PCs of our Ca++/Brp data as predictors to deal with the challenge of having many potential predictors (odor-glomerular responses) and relatively few output variables (behavioral bias). Thus, using PCs is a conservative approach to deal with multiple comparisons. Because PCs are just linear transformations of the original data, interpreting them is relatively easy, and in interpreting PC1 and PC2, we were able to identify simple interpretations (total activity and the difference between DC2 and DM2 activation, respectively). All in all, we remain satisfied with this approach as a means to both 1) limit multiple comparisons and 2) interpret simple meanings from predictive PCs.

      No attempts were made to perturb the relevant circuits to establish a causal relationship between behavioral variations and functional/morphological variations.

      We did conduct such experiments, but we did not report them because they had negative results that we could not definitively interpret. We used constitutive and inducible effectors to alter the physiology of ORNs projecting to DC2 and DM2. We also used UAS-LRP4 and UAS-LRP4-RNAi to attempt to increase and decrease the extent of Brp puncta in ORNs projecting to DC2 and DM2. None of these manipulations had a significant effect on mean odor preference in the OCT-MCH choice, which was the behavioral focus of these experiments. We were unable to determine if the effectors had the intended effects in the targeted Gal4 lines, particularly in the LRP experiments, so we could not rule out that our negative finding reflected a technical failure.

      Author response image 1.

      We believe that even if these negative results are not technical failures, they are not necessarily inconsistent with the analyses correlating features of DC2 and DM2 to behavior. Specifically, we suspect that there are correlated fluctuations in glomerular Ca++ responses and Brp across individuals, due to fluctuations in the developmental spatial patterning of the antennal lobe. Thus, the DC2-DM2 predictor may represent a slice/subset of predictors distributed across the antennal lobe. This would also explain how we “got lucky” to find two glomeruli as predictors of behavior, when we were only able to image a small portion of the glomeruli.

      Reviewer #3 (Public Review):

      Churgin et al. seek to understand the neural substrates of individual odor preference in the Drosophila antennal lobe, using paired behavioral testing and calcium imaging from ORNs and PNs in the same flies, and testing whether ORN and PN odor responses can predict behavioral preference. The manuscript's main claims are that ORN activity in response to a panel of odors is predictive of the individual's preference for 3-octanol (3-OCT) relative to clean air, and that activity in the projection neurons is predictive of both 3-OCT vs. air preference and 3-OCT vs. 4-methylcyclohexanol (MCH). They find that the difference in density of fluorescently-tagged brp (a presynaptic marker) in two glomeruli (DC2 and DM2) trends towards predicting behavioral preference between 3-OCT vs. MCH. Implementing a model of the antennal lobe based on the available connectome data, they find that glomerulus-level variation in response reminiscent of the variation that they observe can be generated by resampling variables associated with the glomeruli, such as ORN identity and glomerular synapse density.

      Strengths:

      The authors investigate a highly significant and impactful problem of interest to all experimental biologists, nearly all of whom must often conduct their measurements in many different individuals and so have a vested interest in understanding this problem. The manuscript represents a lot of work, with challenging paired behavioral and neural measurements.

      Weaknesses:

      The overall impression is that the authors are attempting to explain complex, highly variable behavioral output with a comparatively limited set of neural measurements.

      We would say that we are attempting to explain a simple, highly variable behavioral measure with a comparatively limited set of neural measurements, i.e. we make no claims to explain the complex behavioral components of odor choice, like locomotion, reversals at the odor boundary, etc.

      Given the degree of behavioral variability they observe within an individual (Figure 1- supp 1) which implies temporal/state/measurement variation in behavior, it's unclear that their degree of sampling can resolve true individual variability (what they call "idiosyncrasy") in neural responses, given the additional temporal/state/measurement variation in neural responses.

      We are confident that different Ca++ recordings are statistically different. This is borne out in the analysis of repeated Ca++ recordings in this study, which finds that the significant PCs of Ca++ variation contain 77% of the variation in that data. That this variation is persistent over time and across hemispheres was assessed in Honegger & Smith, et al., 2019. We are thus confident that there is true individuality in neural responses. (Note: we prefer not to call it “individual variability” as this could refer to variability within individuals, not variability across individuals.) It is a separate question whether individual differences in neural responses bear some relation to individual differences in behavioral biases. That was the focus of this study, and our finding of a robust correlation between PC 2 of Ca++ responses and OCT-MCH preference indicates a relation. Because behavior and Ca++ were collected with an hours-to-day long gap, this implies that there are latent versions of both behavioral bias and Ca++ response that are stable on timescales at least that long.

      The statistical analyses in the manuscript are underdeveloped, and it's unclear the degree to which the correlations reported have explanatory (causative) power in accounting for organismal behavior.

      With respect, we do not think our statistical analyses are underdeveloped. We do acknowledge the reviewer's helpful detailed suggestion to add confidence intervals around the point estimate of the strength of correlation between latent behavioral and Ca++ response states; we have added these for the PN PC2 linear model (lines 170-172).

      It is indeed a separate question whether the correlations we observed represent causal links from Ca++ to behavior (though our yoked experiment suggests there is not a behavior-to-Ca++ causal relationship, at least not one where odor experience through behavior is an upstream cause). We attempted to be precise in indicating that our observations are correlations. That is why we used that word in the title, for example. In the revision, we worked to ensure this is appropriately reflected in all word choice across the paper.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the Authors):

      Detailed comments: Many of the problems can be identified starting from Figure 4, which summarizes the main claims. I will focus on that figure and its tributaries.

      Acknowledging that several of our inferences are weak compared to what we consider the main result (the relationship between PC2 of Ca++ and OCT-MCH preference), we have removed Figure 4. This makes the focus of the paper much clearer and appropriately puts the emphasis on the results that have strong statistical support.

      (1) The process of "inferring" correlation among the unobserved latent states for neural sensitivity and behavioral bias is unconventional and risky. The larger the assumed noise linking the latent to the observed variables (i.e. the smaller r_b and r_c) the bigger the inferred correlation rho from a given observed correlation R^2_cb. In this situation, the value of the inferred rho becomes highly dependent on what model one assumes that links latent to observed states. But the specific model drawn in Fig 4 suppl 1 is just one of many possible guesses. For example, models with nonlinear interactions could produce different inference.

      We agree with the reviewer’s notes of caution. To be clear, we do not intend for this analysis to be the main takeaway of the paper and have revised it to make this clear. The signal we are most confident in is the simple correlation between measured Ca++ PC2 and measured behavior. We have added more careful language saying that the attempt to infer the correlation between latent signals is one attempt at describing the data generation process (lines 166-172), and one possible estimate of an “underlying” correlation.
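      The reviewer's point about sensitivity to the assumed noise can be illustrated with the classical Spearman correction for attenuation. This is only a minimal sketch under a simple linear model with independent measurement noise; it is not claimed to be the model used in the manuscript, and the function name, the observed correlation of 0.4, and the reliability values are all hypothetical.

```python
import math

def disattenuated_correlation(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation (an assumed stand-in model).

    r_observed: correlation between the two noisy measurements.
    reliability_x, reliability_y: repeat (test-retest) correlations of each
    measurement with itself, in (0, 1].
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    rho = r_observed / math.sqrt(reliability_x * reliability_y)
    # A correlation cannot exceed 1 in magnitude; clip estimates that overshoot.
    return max(-1.0, min(1.0, rho))

# Hypothetical observed correlation, held fixed while the assumed
# measurement reliability is lowered.
r_obs = 0.4
for reliability in (1.0, 0.8, 0.6, 0.4):
    rho = disattenuated_correlation(r_obs, reliability, reliability)
    print(f"reliability={reliability:.1f} -> inferred latent rho={rho:.3f}")
```

      Under these assumptions, the same observed correlation maps to a larger inferred latent correlation as the assumed reliability falls, which is why the inference depends so strongly on the model linking latent and observed variables.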

      (2) If one still wanted to go through with this inference process and set confidence bounds on rho, one needs to include all the uncertainties. Here the authors only include uncertainty in the value of R^2_c,b and they peg that at +/-20% (Line 1367). In addition there is plenty of uncertainty associated also with R^2_c,c and R^2_b,b. This will propagate into a wider confidence interval on rho.

      We have replaced the arbitrary +/- 20% window with a bootstrap over the pairs of (predicted preference by PN PC2, measured preference) points, which yields a bootstrap distribution of R^2_c,b that is, not surprisingly, considerably wider. Still, we think there is some value in this analysis, as the 90% CI of 𝜌signal under this model is 0.24-0.95. That is, including uncertainty about R^2_b,b and R^2_c,c in the model still implies a significant relationship between latent calcium and behavior signals.
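      A paired bootstrap of this kind can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis code from the manuscript: the function names, the number of resamples, the percentile method, and the simulated values are all assumptions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def bootstrap_r2_ci(predicted, measured, n_boot=2000, ci=0.90, seed=0):
    """Percentile bootstrap CI for R^2, resampling (predicted, measured) pairs."""
    rng = random.Random(seed)
    n = len(predicted)
    r2_values = []
    for _ in range(n_boot):
        # Resample flies with replacement, keeping each pair intact.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([predicted[i] for i in idx], [measured[i] for i in idx])
        if not math.isnan(r):
            r2_values.append(r * r)
    r2_values.sort()
    tail = (1.0 - ci) / 2.0
    low = r2_values[int(tail * (len(r2_values) - 1))]
    high = r2_values[int((1.0 - tail) * (len(r2_values) - 1))]
    return low, high

# Synthetic stand-in data: 50 hypothetical flies with a moderate true relationship.
data_rng = random.Random(1)
predicted = [data_rng.gauss(0, 1) for _ in range(50)]
measured = [0.5 * p + data_rng.gauss(0, 1) for p in predicted]
low, high = bootstrap_r2_ci(predicted, measured)
print(f"90% bootstrap CI for R^2: [{low:.3f}, {high:.3f}]")
```

      Resampling whole pairs, rather than each variable separately, preserves the relationship between predicted and measured preference within each resample.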

      (2.1) The uncertainty in R^2_cb is much greater than +/-20%. Take for example the highest correlation quoted in Fig 4: R^2=0.23 in the top row of panel A. This relationship refers to Fig 1L. Based on bootstrapping from this data set, I find a 90% confidence interval of CI=[0.002, 0.527]. That's an uncertainty of -100/+140%, not +/-20%. Moreover, this correlation is due entirely to the lone outlier on the bottom left. Removing that single fly abolishes any correlation in the data (R^2=0.04, p>0.3). With that the correlation of rho=0.64, the second-largest effect in Fig 4, disappears.

      We acknowledge that removal of the outlier in Fig 1L abolishes the correlation between predicted and measured OCT-AIR preference. We have thus moved that subfigure to the supplement (now Figure 1 – figure supplement 10B), note that we do not have robust statistical support of ORN PC1 predicting OCT-AIR preference in the results (lines 177-178), and place our emphasis on PN PC2’s capacity to predict OCT-MCH preference throughout the text.

      (2.2) Similarly with the bottom line of Fig 4A, which relies on Fig 1M. With the data as plotted, the confidence interval on R^2 is CI=[0.007, 0.201], again an uncertainty of -100/+140%. There are two clear outlier points, and if one removes those, the correlation disappears entirely (R^2=0.06, p=0.09).

      We acknowledge that removal of the two outliers in Fig 1M between predicted and measured OCT-AIR preference abolishes the correlation. We have also moved that subfigure to the supplement (now Figure 1 – figure supplement 10F) and do not claim to have robust statistical support of PN PC1 predicting OCT-AIR preference.

      (2.3) Similarly, the correlation R^2_bb of behavior with itself is weak and comes with great uncertainty (Fig 1 Suppl 1, panels B-E). For example, panel D figures prominently in computing the large inferred correlation of 0.75 between PN responses and OCT-MCH choice (Line 171ff). That correlation is weak and has a very wide confidence interval CI=[0.018, 0.329]. This uncertainty about R^2_bb should be taken into account when computing the likelihood of rho.

      We now include bootstrapping of the 3 hour OCT-MCH persistence data in our inference of 𝜌signal.

      (2.4) The correlation R^2_cc for the empirical repeatability of Ca signals seems to be obtained by a different method. Fig 4 suppl 1 focuses on the repeatability of calcium recording at two different time points. But Line 625ff suggests the correlation R^2_cc=0.77 all derives from one time point. It is unclear how these are related.

      Because our calcium model predictors utilize principal components of the glomerulus-odor responses (the mean Δf/f in the odor presentation window), we compute R^2_c,c by adding variance explained along the PCs, up to the point at which the component-wise variance explained does not exceed that of shuffled data (lines 609-620 in Materials and Methods). In this revision we now bootstrap the calcium data on the level of individual flies to get a bootstrap distribution of R^2_c,c, and propagate the uncertainty forward in the inference of 𝜌signal.
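      The idea of summing variance explained over PCs that exceed a shuffled baseline can be sketched as below. This is a hedged illustration using NumPy on synthetic data: the shuffling scheme (independently permuting each feature across flies), the function names, and the simulated matrix are assumptions and may differ from the procedure in the Materials and Methods.

```python
import numpy as np

def variance_explained(X):
    """Fraction of total variance captured by each principal component of X."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

def significant_variance_fraction(X, seed=0):
    """Sum variance explained over leading PCs that beat a shuffled baseline.

    Assumed baseline: each column (feature) is permuted independently across
    rows (flies), which removes correlations between features.
    """
    rng = np.random.default_rng(seed)
    shuffled = np.column_stack([rng.permutation(column) for column in X.T])
    real = variance_explained(X)
    null = variance_explained(shuffled)
    n_significant, total = 0, 0.0
    for real_k, null_k in zip(real, null):
        if real_k <= null_k:
            break  # stop at the first PC that does not exceed the shuffle
        n_significant += 1
        total += float(real_k)
    return n_significant, total

# Synthetic stand-in: 60 hypothetical flies x 20 odor-glomerulus features,
# generated from two shared latent factors plus independent noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(60, 2))
loadings = rng.normal(size=(2, 20))
responses = latent @ loadings + 0.5 * rng.normal(size=(60, 20))
n_sig, fraction = significant_variance_fraction(responses)
print(f"{n_sig} significant PCs capture {fraction:.2f} of the variance")
```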

      (2.5) To summarize, two of the key relationships in Fig 1 are due entirely to one or two outlier points. These should not even be used for further analysis, yet they underlie two of the claims in Fig 4. The other correlations are weak, and come with great uncertainty, as confirmed by resampling. Those uncertainties should be propagated through the inference procedure described in Fig 4. It seems possible that the result will be entirely uninformative, leaving rho with a confidence interval that spans the entire available range [0,1]. Until that analysis is done, the claims of neuron-to-behavior correlation in this manuscript are not convincing.

      It is important to note that we never thought our analysis of the relationship between latent behavior and calcium signals should be interpreted as the main finding. Instead, the observed correlation between measured behavior and calcium is the take-away result. Importantly, it is also conservative compared to the inferred latent relationship, which in our minds was always a “bonus” analysis. Our revisions are now focused on highlighting the correlations between measured signals that have strong statistical support.

      As a response to these specific concerns, we have propagated uncertainty in all R^2 values (calcium-calcium, behavior-behavior, calcium-behavior) in our new inference for 𝜌signal, yielding a new median estimate for PN PC 2 underlying OCT-MCH preference of 0.68, with a 90% CI of 0.24-0.95. (Lines 171-172 in results, Inference of correlation between latent calcium and behavior states section in Materials and Methods).

      (3) Other statistical methods:

      (3.1) The caption of Fig 4 refers to "model applied to train+test data". Does that mean the training data were included in the correlation measurement? Depending on the number of degrees of freedom in the model, this could have led to overfitting.

      We have removed Figure 4 and now emphasize the key result in Figures 1 and 2: we see a statistically robust signal of PN PC 2 explaining OCT-MCH preference variation in both a training set and a testing set of flies (Fig 2 – figure supplement 1C-D).

      (3.2) Line 180 describes a model that performed twice as well on test data (31% EV) as it did on training data (15%). What would explain such an outcome? And how does that affect one's confidence in the 31% number?

      The test set recordings were conducted several weeks after the training set recordings, which were used to establish PN PC 2 as a correlate of OCT-MCH preference. The fact that the test data had a higher R^2 likely reflects sampling error (these two correlation coefficients are not significantly different). Ultimately this gives us more confidence in our model, as the predictive capacity is maintained in a totally separate set of flies.

      (3.3) Multiple models get compared in performance before settling on one. For example, sometimes the first PC is used, sometimes the second. Different weighting schemes appear in Fig 2. Do the quoted p-values for the correlation plots reflect a correction for multiple hypothesis testing?

      For all calcium-behavior models, we restricted our analysis to 5 PCs, as the proportion of calcium variance explained by each of these PCs was higher than that explained by the respective PC of shuffled data — i.e., there were at most five significant PCs in that data. We thus performed at most 5 hypothesis tests for a given model. PN PC 2 explained 15% of OCT-MCH preference variation, with a p-value of 0.0063 – this p-value is robust to a conservative Bonferroni correction to the 5 hypotheses considered at alpha=0.05.
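      The Bonferroni arithmetic behind this statement can be checked in a few lines. The p-value of 0.0063, the 5 hypotheses, and alpha of 0.05 are taken from the response above; the function name is hypothetical.

```python
def bonferroni_adjust(p_value, n_tests, alpha=0.05):
    """Return the Bonferroni-adjusted p-value and whether it stays below alpha."""
    adjusted = min(1.0, p_value * n_tests)
    return adjusted, adjusted < alpha

# Values quoted in the response: nominal p = 0.0063, 5 PCs tested.
adjusted_p, survives = bonferroni_adjust(0.0063, n_tests=5)
print(f"adjusted p = {adjusted_p:.4f}, significant at 0.05: {survives}")
```

      Equivalently, the nominal p-value of 0.0063 is below the per-test threshold of 0.05 / 5 = 0.01.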

      The weight schemes in Figure 2 and Figure 1 – figure supplement 10 reflect our interpretations of the salient features of the PCs and are follow-up analyses of the single principal component hypothesis tests. Thus they do not constitute additional tests that should be corrected. We now state in the methods explicitly that all reported p-values are nominal (line 563).

      (3.4) Line 165 ff: Quoting rho without giving the confidence interval is misleading. For example, the rho for the presynaptic density model is quoted as 0.51, which would be a sizeable correlation. But in fact, the posterior on rho is almost flat, see caption of Fig 4 suppl 1, which lists the CI as [0.11, 0.85]. That means the experiments place virtually no constraint on rho. If the authors had taken no data at all, the posterior on rho would be uniform, and give a median of 0.5.

      We now provide a confidence interval around 𝜌signal for the PN PC 2 model (lines 170-172). But per above, and consistent with the new focus of this revision, we view the 𝜌signal inference as secondary to the simple, significant correlation between PN PC 2 and OCT-MCH preference.

      (4) As it stands now, this paper illustrates how difficult it is to come to a strong conclusion in this domain. This may be worth some discussion. This group is probably in a better position than any to identify what are the limiting factors for this kind of research.

      We thank the reviewer for this suggestion and have added discussion of the difficulties in detecting signals for this kind of problem. That said, we are confident in stating that there is a meaningful correlation between PC 2 of PN Ca++ responses and OCT-MCH behavior given our model’s performance in predicting preference in a test set of flies, and in the consistent signal in ORN Bruchpilot.

      Reviewer #3 (Recommendations for the Authors):

      Two major concerns, one experimental/technical and one conceptual:

      (1) I appreciate the difficulty of the experimental design and problem. However, the correlations reported throughout are based on neural measurements in only 5 glomeruli (~10% of the olfactory system) at early stages of olfactory processing.

      We acknowledge that only imaging 5 glomeruli is regrettable. We worked hard to develop image analysis pipelines that could reliably segment as many glomeruli as possible from almost all individual flies. In the end, we concluded that it was better to focus our analysis on a (small) core set of glomeruli for which we had high confidence in the segmentation. Increasing the number of analyzed glomeruli is high on the list of improvements for subsequent studies. Happily, we are confident that we are capturing a significant, biologically meaningful correlation between PC 2 of PN calcium (dominated by the responses in DC2 and DM2) and OCT-MCH preference.

      3-OCT and MCH activate many glomeruli in addition to the five studied, especially at the concentrations used. There is also limited odor-specificity in their response matrix: notably responses are more correlated in all glomeruli within an individual, compared to responses across individuals (they note this in lines 194-198, though I don't quite understand the specific point they make here). This is a sign of high experimental variability (typically the dynamic range of odor response within an individual is similar to the range across individuals) and makes it even more difficult to resolve underlying individual variation.

      We respectfully disagree with the reviewer’s interpretation here. There is substantial odor-specificity in our response matrix. This is evident in both the ORN and PN response matrices (and especially the PN matrix) as variation in the brightness across rows. Columns, which correspond to individuals, are more similar than rows, which correspond to odor-glomerulus pairs. The dynamic range within an individual (within a column, across rows) is indeed greater than the variation among individuals (within a row, across columns).

      As an (important) aside, the odor stimuli are very unusual in this study. Odors are delivered at extremely high concentrations (variably 10-25% sv, line 464, not exactly sure what "variably" means - is the stimulus intensity not constant?) as compared to even the highest concentrations used in >95% of other studies (usually <~0.1% sv delivered).

      We used these concentrations for a variety of reasons. First, following the protocol of Honegger and Smith (2020), we found that dilutions in this range produce a linear input-output relationship, i.e. doubling or halving one odorant yields proportionate changes in odor-choice behavior metrics. Second, such fold dilutions are standard for tunnel assays of the kind we used. Claridge-Chang et al. (2009) used 14% and 11% for MCH and OCT respectively, for instance. Finally, the specific dilution factor (i.e., within the range of 10-25%) was adjusted on a week-by-week basis to ensure that in an OCT-MCH choice, the mean preference was approximately 50%. This yields the greatest signal of individual odor preference. We have added this last point to the methods section where the range of dilutions is described (lines 442-445).

      A parsimonious interpretation of their results is that the strongest correlation they see (ORN PC1 predicts OCT v. air preference) arises because intensity/strength of ORN responses across all odors (e.g. overall excitability of ORNs) partially predicts behavioral avoidance of 3-OCT. However, the degree to which variation in odor-specific glomerular activation patterns can explain behavioral preference (3-OCT v. MCH) seems much less clear, and correspondingly the correlations are weaker and p-values larger for the 3-OCT v. MCH result.

      With respect, we disagree with this analysis. The correlation between ORN PC 1 and OCT v. air preference (R^2 = 0.23) is quite similar to that of PN PC 2 and OCT vs MCH preference (R^2 = 0.20). However, the former is dependent on a single outlying point, whereas the latter is not. The latter relationship is also backed up by the BRP imaging and modeling. Therefore in the revision we have de-emphasized the OCT v. air preference model and emphasized the OCT v. MCH preference models.

      (2) There is a broader conceptual concern about the degree of logical consistency in the authors' interpretation of how neural variability maps to behavioral variability. For instance, the two odors they focus on, 3-OCT and MCH, barely activate ORNs in 4 of the 5 glomeruli they study. Most of the correlation of ORN PC1 vs. behavioral choice for 3-OCT vs. air, then, must be driven by overall glomerular activation by other odors (but remains predictive since responses across odors appear correlated within an individual). This gives pause to the interpretation that 3-OCT-evoked ORN activity in these five glomeruli is the neural substrate for variability in the behavioral response to 3-OCT.

      Our interpretation of the ORN PC1 linear model is not that 3-OCT-evoked ORN activity is the neural substrate for variability – instead, it is the general responsiveness of an individual’s AL across multiple odors (this is our interpretation of the uniformly positive loadings in ORN PC1). It is true that OCT and MCH do not activate ORNs as strongly as other odorants – our analysis rests on the loadings of the PCs that capture all odor/glomerulus combinations available in our data. All that said, since a single outlier in Figure 1L dominates the relationship, we have de-emphasized these particular results in our revision.

      This leads to the most significant concern, which is that the paper does not provide strong evidence that odor-specific patterns of glomerular activation in ORNs and PNs underlie individual behavioral preference between different odors (that each drive significant levels of activity, e.g. 3-OCT v. MCH), or that the ORN-PN synapse is a major driver of individual behavioral variability. Lines 26-31 of the abstract are not well supported, and the language should be softened.

      We have modified the abstract to emphasize our confidence in PN calcium correlating with odor-vs-odor preference (removing the ORN & odor-vs-air language).

      Their conclusions come primarily from having correlated many parameters reduced from the ORN and PN response matrices against the behavioral data. Several claims are made that a given PC is predictive of an odor preference while others are not, however it does not appear that the statistical tests to support this are shown in the figures or text.

      For each linear model of calcium dynamics predicting preference, we restricted our analysis to the first 5 principal components. Thus, we do not feel that we correlated many parameters against the behavioral data. As mentioned below, the correlations identified by this approach comfortably survive a conservative Bonferroni correction. In this revision, a linear model with a single predictor – the projection onto PC 2 of PN calcium – is the result we emphasize in the text, and we report R^2 between measured and predicted preference for both a training set of flies and for a test set of flies (Figure 1M and Figure 2 – figure supplement 1).

      That is, it appears that the correlation of models based on each component is calculated, then the component with the highest correlation is selected, and a correlation and p-value computed based on that component alone, without a statistical comparison between the predictive values of each component, or to account for effectively performing multiple comparisons. (Figure 1, k l m n o p, Figure 3, d f, and associated analyses).

      To reiterate, this was our process: 1) Collect a training data set of paired Ca++ recordings and behavioral preference scores. 2) Compute the first five PCs of the Ca++ data, and measure the correlation of each to behavior. 3) Identify the PC with the best correlation. 4) Collect a test data set with new experimental recordings. 5) Apply the model identified in step 3. For some downstream analyses, we combined test and training data, but only after confirming the separate significance of the training and test correlations.
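      The five steps above can be sketched in code. This is a minimal illustration on synthetic data using NumPy, not the authors' pipeline: the generative model in `simulate`, the sample sizes, the noise levels, and all function names are hypothetical.

```python
import numpy as np

def fit_pcs(X, n_components=5):
    """Return the column means and leading principal axes of X (flies x features)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def r_squared(x, y):
    """Squared Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

def simulate(rng, axes, n_flies):
    """Hypothetical generative model: behavior depends on the second latent factor."""
    latent = rng.normal(size=(n_flies, 3)) * np.array([3.0, 2.0, 1.0])
    calcium = latent @ axes + 0.3 * rng.normal(size=(n_flies, axes.shape[1]))
    behavior = 0.6 * latent[:, 1] + rng.normal(size=n_flies)
    return calcium, behavior

rng = np.random.default_rng(0)
# Three orthonormal directions in a 20-dimensional response space.
axes = np.linalg.qr(rng.normal(size=(20, 3)))[0].T

# Steps 1-3: training data, first five PCs, pick the PC best correlated with behavior.
X_train, y_train = simulate(rng, axes, n_flies=50)
mean, pcs = fit_pcs(X_train, n_components=5)
scores_train = (X_train - mean) @ pcs.T
train_r2 = [r_squared(scores_train[:, k], y_train) for k in range(pcs.shape[0])]
best = int(np.argmax(train_r2))
slope, intercept = np.polyfit(scores_train[:, best], y_train, deg=1)

# Steps 4-5: new test recordings, apply the frozen model (same mean, PC, and fit).
X_test, y_test = simulate(rng, axes, n_flies=30)
scores_test = (X_test - mean) @ pcs.T
predicted = slope * scores_test[:, best] + intercept
test_r2 = r_squared(predicted, y_test)
print(f"best PC: {best + 1}, train R^2: {train_r2[best]:.2f}, test R^2: {test_r2:.2f}")
```

      The key design point is that the test flies are projected with the mean, principal axis, and regression coefficients fixed from the training set, so the test R^2 involves no further model selection.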

      The p-values associated with the PN PC 2 model predicting OCT-MCH preference are sufficiently low in each of the training and testing sets (0.0063 and 0.0069, respectively) to pass a conservative Bonferroni multiple hypothesis correction (one hypothesis for each of the 5 PCs) at an alpha of 0.05.

      Additionally, the statistical model presented in Figure 4 needs significantly more explanation or should be removed; it's unclear how they "infer" the correlation, and the conclusions appear inconsistent with Figure 3 - Figure Supplement 2.

      We have removed Figure 4 and have improved upon our approach of inferring the strength of the correlation between latent calcium and behavior in the Methods, incorporating bootstrapping of all sources of data used for the inference (lines 622-628). At the same time, we now emphasize that this analysis is a bonus of sorts, and that the simple correlation between Ca++ and behavior is the main result.

      Suggestions:

      (1) If the authors want to make the claim that individual variation in ORN or PN odor representations (e.g. glomerular activation patterns) underlie differences in odor preference (MCH v. OCT), they should generalize the weak correlation between ORN/PN activity and behavior to additional glomeruli and pairs of odors, where both odors drive significant activity. Otherwise, the claims in the abstract should be tempered.

      We have modified the abstract to focus on the effect we have the highest confidence in: contrasting PN calcium activation of DM2 and DC2 predicting OCT-MCH preference.

      (2) One of the most valuable contributions a study like this could provide is to carefully quantify the amount of measurement variation (across trials, across hemispheres) in neural responses relative to the amount of individual variation (across individuals). Beyond the degree of variation in the amplitude of odor responses, the rank ordering of odor response strength between repeated measurements (to try to establish conditions that account for adaptation, etc.), between hemispheres, and between individuals is important. Establishing this information is foundational to this entire field of study. The authors take a good first step towards this in Figure 1J and Figure 1, supplement 5C, but the plots do not directly show variance, and the comparison is flawed because more comparisons go into the individual-individual crunch (as evidenced by the consistently smaller range of quartiles). The proper way to do this is by resampling.

      We do not know what the reviewer means by “individual-individual crunch,” unfortunately. Thus, it is difficult to determine why they think the analysis is flawed. We are also uncertain about the role of resampling in this analysis. The medians, interquartile ranges and whiskers in the panels referenced by the reviewer are not confidence intervals as might be determined by bootstrap resampling. Rather, these are direct statistics on the coding distances as measured – the raw values associated with these plots are visualized in Figure 1H.

      In our revision we updated the heatmaps in Figure 1 – figure supplement 3 to include recordings across the lobes and trials of each individual fly, and we have added a new supplementary figure, Figure 1 – figure supplement 4, to show the correspondence between recordings across lobes or trials, with associated rank-order correlation coefficients. Since the focus of this study was whether measured individual differences predict individual behavioral preference, a full characterization of the statistics of variation in calcium responses was not the focus, though it was the focus of a previous study (Honegger & Smith et al., 2019).

      To help the reader understand the data, we would encourage displaying data prior to dimensionality reduction - why not show direct plots of the mean and variance of the neural responses in each glomerulus across repeats, hemispheres, individuals?

      We added a new supplementary figure, Figure 1 – figure supplement 4, to show the correspondence between recordings across lobes or trials.

      A careful analysis of this point would allow the authors to support their currently unfounded assertion that odor responses become more "idiosyncratic" farther from the periphery (line 135-36); presumably they mean beyond just noise introduced by synaptic transmission, e.g. "idiosyncrasy" is reproducible within an individual. This is a strong statement that is not well-supported at present - it requires showing the degree of similarity in the representation between hemispheres is more similar within a fly than between flies in PNs compared to ORNs (see Hige... Turner, 2015).

      Here are the lines in question: “PN responses were more variable within flies, as measured across the left and right hemisphere ALs, compared to ORN responses (Figure 1 – figure supplement 5C), consistent with the hypothesis that odor representations become more idiosyncratic farther from the sensory periphery.”

      That responses are more idiosyncratic farther from the periphery is therefore not an “unfounded assertion.” It is clearly laid out as a hypothesis for which we can assess consistency in the data. We stand by our original interpretation: that several observations are consistent with this finding, including greater distance in coding space in PNs compared to ORNs, particularly across lobes and across flies. In addition, higher accuracy in decoding individual identity from PN responses compared to ORN responses (now appearing as Figure 1 – figure supplement 6A) is also consistent with this hypothesis.

      Still, to make confusion at this sentence less likely, we have reworded it as “suggesting that odor representations become more divergent farther from the sensory periphery.” (lines 139-140)

      (3) Figure 3 is difficult to interpret. Again, the variability of the measurement itself within and across individuals is not established up front. Expression of exogenous tagged brp in ORNs is also not guaranteed to reflect endogenous brp levels, so there is an additional assumption at that level.

      Figure 3 – figure supplement 1 Panels A-C display the variability of measurements (Brp volume, total fluorescence and fluorescence density) both within (left/right lobes) and across individuals (the different data points). We agree that exogenous tagged Brp levels will not be identical to endogenous levels. The relationship appears significant despite this caveat.

      Again there are statistical concerns with the correlations. For instance, the claim that "Higher Brp in DM2 predicted stronger MCH preference... " on line 389 is not statistically supported with p<0.05 in the ms (see Figure 3 G as the closest test, but even that is a test of the difference of DM2 and DC2, not DM2 alone).

      We have changed the language to focus on the pattern of the loadings in PC 2 of Brp-Short density and replaced “predict.” (lines 366-369).

      Can the authors also discuss what additional information is gained from the expansion microscopy in the figure supplement, and how it compares to brp density in DC2 using conventional methods?

      The expansion microscopy analysis was an attempt to determine what specific aspect of Brp expression was predictive of behavior, on the level of individual Brp puncta, as a finer look compared to the glomerulus-wide fluorescence signal in the conventional microscopy approach. Since this method did not yield a large sample size, at best we can say it provided evidence consistent with the observation from confocal imaging that Brp fluorescent density was the best measure in terms of predicting behavior.

      I would prefer to see the calcium and behavioral datasets strengthened to better establish the relationship between ORN/PN responses and behavior, and to set aside the anatomical dataset for a future work that investigates mechanisms.

      We are satisfied that our revisions put appropriate emphasis on a robust result relating calcium and behavior measurements: the relationship between OCT-MCH preference and idiosyncratic PN calcium responses. Finding that idiosyncratic Brp density has similar PC 2 loadings that also significantly predict behavior is an important finding that increases confidence in the calcium-behavior finding. We agree with the reviewer that these anatomical findings are secondary to the calcium-behavior analyses, but think they warrant a place in the main findings of the study. As the reviewer suggests, we are conducting follow-on studies that focus on the relationship between neuroanatomical measures and odor preference.

      (4) The mean imputation of missing data may have an effect on the conclusions that it is possible to draw from this dataset. In particular, as shown in Figure 1, supplemental figure 3, there is a relatively large amount of missing data, which is unevenly distributed across glomeruli and between the cell types recorded from. Strikingly, DC2 is missing in a large fraction of ORN recordings, while it is present in nearly all the PN recordings. Because DC2 is one of the glomeruli implicated in predicting MCH-OCT preference, this lack of data may be particularly likely to affect the evaluation of whether this preference can be predicted from the ORN data. Overall, mean imputation of glomerulus activity prior to PCA will artificially reduce the amount of variance contributed by the glomerulus. It would be useful to see an evaluation of which results of this paper are robust to different treatments of this missing data.

      We confirmed that the linear model of predicted OCT-MCH using PN PC2 calcium was minimally altered when we performed imputation via alternating least squares using the pca function with option ‘als’ to infill missing values on the calcium matrix 1000 times and taking the mean infilled matrix (see MATLAB documentation and Figure 1 – figure supplement 5 of Werkhoven et al., 2021). Fitted slope using the mean-infilled data presented in the article: -0.0806 (SE = 0.028, model $R^2$ = 0.15); fitted slope using the ALS-imputed data: -0.0806 (SE = 0.026, model $R^2$ = 0.17).
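
      The reviewer's point that mean imputation shrinks the variance contributed by a glomerulus can be illustrated with a short sketch. The response values below are hypothetical, and the helper `mean_impute` is an illustrative name rather than part of the authors' pipeline; the sketch shows only the shrinkage effect, not the ALS procedure used in the response.

```python
from statistics import mean, variance

def mean_impute(column):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

# Hypothetical responses of one glomerulus across 8 flies; None marks a missing recording.
responses = [0.2, None, 0.9, None, 0.5, None, 0.1, 0.8]

observed = [v for v in responses if v is not None]
imputed = mean_impute(responses)

var_observed = variance(observed)  # sample variance over the 5 observed flies
var_imputed = variance(imputed)    # sample variance after filling 3 flies with the mean

# Filled entries sit exactly at the mean, so they add nothing to the sum of
# squares; the ratio is therefore (n_observed - 1) / (n_total - 1) = 4 / 7 here.
shrinkage = var_imputed / var_observed
```

      The more recordings are missing for a glomerulus, the smaller this ratio becomes, which is why a glomerulus such as DC2 in the ORN data could be down-weighted in a PCA performed after mean imputation.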

      Additional comments:

      (1) On line 255 there is an unnecessary condition: "non-negative positive".

      Thank you – non-negative has been removed.

      (2) In Figure 4 and the associated analysis, selection of +/- 20% interval around the observed $R^2$ appears arbitrary. This could be based on the actual confidence interval, or established by bootstrapping.

      We have replaced the +/- 20% rule by bootstrapping the calculation of behavior-behavior $R^2$, calcium-calcium $R^2$, and calcium-behavior $R^2$ and propagating the uncertainties forward (see the “Inference of correlation between latent calcium and behavior states” section in Materials and Methods).
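
      A percentile bootstrap of a single $R^2$ value can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and the sketch covers only the resampling of one $R^2$, not the propagation of uncertainty across the three $R^2$ estimates described in the Materials and Methods.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample with no variance in x or y
    return sxy * sxy / (sxx * syy)

def bootstrap_r2_interval(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for R^2, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(r_squared([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Synthetic stand-in for a predictor (e.g. a PC score) and a behavioral preference.
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(50)]
y = [-0.5 * a + data_rng.gauss(0, 1) for a in x]

r2 = r_squared(x, y)
low, high = bootstrap_r2_interval(x, y)
```

      Resampling whole (x, y) pairs keeps each individual's two measurements together, so the interval reflects sampling variability across individuals rather than an arbitrary fixed margin.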

      (3) On line 409 the claim is made "These sources of variation specifically implicate the ORN-PN synapse..." While the model recapitulates the glomerulus specific variation of activity under PN synapse density variation, it also occurs under ORN identity variation, which calls into question whether the synapse distribution itself is specifically implicated, or if any variation that is expected to be glomerulus specific would be equally implicated.

      We agree with this observation. We found that varying either the ORNs or the PNs that project to each glomerulus can produce patterns of PN response variation similar to what is measured experimentally. This is consistent with the idea that the ORN-PN synapse is a key site of behaviorally-relevant variation.

      (4) Line 214 "... we conclude that the relative responses of DM2 vs DC2 in PNs largely explains an individual's preference." is too strong of a claim, based on the fact that using the PC2 explains much more of the variance, while using the stated hypothesis noticeably decreases the predictive power ($R^2$ = 0.2 vs $R^2$ = 0.12).

      We have changed the wording here to “we conclude that the relative responses of DM2 vs DC2 in PNs compactly predict an individual’s preference.” (lines 192-193)

    1. Author response:

      Reviewer #1:

      We thank the reviewer for recognizing the impact of our work on the pivotal roles of N-glycan-dependent ERQC in cellular fitness and pathogenicity and providing valuable comments to be considered to improve the manuscript. As suggested, we will rearrange data, reduce text volume, and discuss the possibility of how ERQC mutation decreases EV secretion without significant defect in conventional secretion. Regarding the proteomics data, we have already initiated a comparative analysis of total intracellular and EV-associated proteins to determine whether the reduced cargo loading in the Ugg1 mutant is specific to EV-associated proteins. Additionally, we may extend the analysis to include total secretion, enabling a clearer comparison between classical secretion and EV-mediated secretion to better evaluate the extent of classical secretion defects in the Ugg1 mutant.

      Reviewer #2:

      We sincerely thank the reviewer for the positive evaluation of our work. As recommended, we will reduce the text and reorganize the data to enhance the manuscript's readability.

      Reviewer #3:

      We sincerely thank the reviewer for the high appreciation of our work. As recommended, we will provide a more detailed explanation of the results with improved interpretation, firmly grounded in the data obtained.

    2. eLife Assessment

      This important study confirms the molecular function of putative components of the N-glycan-dependent endoplasmic reticulum protein quality control (ERQC) system in the pathogen Cryptococcus neoformans. The study demonstrates an involvement in fitness, virulence, and the secretion and composition of extracellular vesicles, albeit in ways that are not yet fully understood. The evidence provided is largely convincing, with rigorous well-controlled assays and the use of complemented strains.

    3. Reviewer #1 (Public review):

      Summary:

      Using gene deletion analysis, the authors confirm the molecular function of putative components of an N-glycan-dependent endoplasmic reticulum protein quality control (ERQC) system (UGG1, MNS1, MNS101, MNL1, and MNL2), in the basidiomycetous fungal pathogen Cryptococcus neoformans. Specifically, they confirm the essential role of these components in the ERQC system and their role in ER stress which contributes to cellular fitness and pathogenicity.

      The second part of the study links the components to secretion, mainly EV biogenesis and composition. However, this part of the study is less convincing.

      Strengths:

      Although it is unclear why ER stress in the mutants would not manifest into a classical secretion defect, this is a rigorous, well-controlled study, with the use of complemented strains that demonstrate phenotypic restoration. The diagram in Figure 1 is very useful in orientating the reader to a complex subject matter, although the legend could be more descriptive.

      Weaknesses:

      A major weakness is the sheer volume of data presented (in the main text and supplement), which makes the results difficult to follow and retain: the work could essentially be two separate studies.

      Another major weakness is the lack of mechanistic insight into the role of the ERQC system in EV secretion and its disconnection from "classical" secretion, which is difficult to reconcile. Some insight into why EV secretion is decreased, and classical secretion is unaffected, would strengthen the significance of the findings. No mechanism is provided to explain why the ERQC mutants (Ugg1 mutant in particular) would have reduced and heterogeneously sized EVs. Furthermore, it is not convincing that the EV content changes would greatly impact fitness and virulence. The proteomics data showing reduced cargo in the Ugg1 mutant is not convincing and is difficult to follow.

    4. Reviewer #2 (Public review):

      Summary:

      This study investigates the molecular function of the N-glycan-dependent endoplasmic reticulum protein quality control system (ERQC) in Cryptococcus neoformans and correlates this pathway with key features of C. neoformans virulence, especially those mediated by extracellular vesicle transport. The findings provide valuable insights into the connection between this pathway and the biogenesis of C. neoformans extracellular vesicles.

      Strengths:

      The strength of this study lies primarily in the careful selection of appropriate and current methodologies, which provide a solid foundation for the authors' results and conclusions across all presented data. All experiments are supported by well-designed and established controls in the study of C. neoformans, further strengthening the validity of the results and conclusions drawn from them. The study presents novel data on this important pathway in C. neoformans, establishing its connection with C. neoformans virulence. Interestingly, the findings led the authors to understand the relationship between this pathway and the transport of key fungal virulence factors via extracellular vesicles. This was demonstrated in the study, paving the way for a deeper understanding of extracellular vesicle biogenesis, a field still filled with gaps but one to which this study contributes solid data, helping to clarify aspects of this process.

      Weaknesses:

      I do not see significant weaknesses in this study. The experiments are well-grounded, and the results are clearly presented. I believe the only weakness is that the paper could be condensed. Sections like the discussion, for instance, are extremely lengthy, which may make reading and, consequently, understanding more challenging for many readers. Regarding the presentation of the results, while clear, the figures contain a lot of information, and I believe that some of this content could be moved to supplementary figures.

    5. Reviewer #3 (Public review):

      Summary:

      Cryptococcus neoformans is a global critical threat pathogen and the manuscript by Mota et al demonstrates that the pathogen's N-glycan-dependent protein quality control system regulates the capacity of the fungus to cause disease. The system makes sure that glycoproteins are folded correctly. The system is involved in the fitness and virulence of the fungus by regulating aspects of cellular robustness and the trafficking of virulence-associated compounds outside of the cell via transport in extracellular vesicles.

      Strengths:

      The investigators use multiple modalities to demonstrate that the system is involved in cryptococcal pathogenesis. The investigators generated mutant C. neoformans to explore the role of genes involved in the protein folding system. Basic microbiology, genetic analyses, proteomics, fluorescence and transmission microscopy, nanotracking analyses, and murine studies were performed. The validity of the findings is thus very high. Hypotheses are robustly demonstrated.

      Weaknesses:

      Aspects of the results should be better explained. Some results are extrapolated in their meaning beyond the extent of the data.

    1. eLife Assessment

      This study provides valuable evidence for the mechanism underlying KCNC1-related developmental and epileptic encephalopathy. The authors have generated and characterized a new knock-in mouse with a pathogenic mutation found in patients to determine the synaptic and circuit mechanisms contributing to KCNC1-associated epilepsy. They provide convincing evidence for reduced excitability of parvalbumin-positive fast-spiking interneurons, but not in neighboring excitatory neurons, and suggest that this may contribute to seizures and premature death in the mice.

    2. Reviewer #1 (Public review):

      Summary:

      The authors have created a new model of KCNC1-related DEE in which a pathogenic patient variant (A421V) is knocked into a mouse in order to better understand the mechanisms through which KCNC1 variants lead to DEE.

      Strengths:

      (1) The creation of a new DEE model of KCNC1 dysfunction.

      (2) In vivo phenotyping demonstrates key features of the model such as early lethality and several types of electrographic seizures.

      (3) The ex vivo cellular electrophysiology is very strong and comprehensive including isolated patches to accurately measure K+ currents, paired recording to measure evoked synaptic transmission, and the measurement of membrane excitability at different time points and in two cell types.

      Weaknesses:

      (1) The assertion that membrane trafficking is impaired by this variant could be bolstered by additional data.

      (2) In some experiments details such as the age of the mice or cortical layer are emphasized, but in others, these details are omitted.

      (3) The impairments in PV neuron AP firing are quite large. This could be expected to lead to changes in PV neuron activity outside of the hypersynchronous discharges that could be detected in the 2-photon imaging experiments, however, a lack of an effect on PV neuron activity is only loosely alluded to in the text. A more formal analysis is lacking. An important question in trying to understand mechanisms underlying channelopathies like KCNC1 is how changes in membrane excitability recorded at the whole cell level manifest during ongoing activity in vivo. Thus, the significance of this work would be greatly improved if it could address this question.

      (4) Myoclonic jerks and other types of more subtle epileptiform activity have been observed in control mice, but there is no mention of littermate control analyzed by EEG.

    3. Reviewer #2 (Public review):

      Summary:

      Wengert et al. generated and thoroughly characterized the developmental epileptic encephalopathy phenotype of Kcnc1A421V/+ knock-in mice. The Kcnc1 gene encodes the Kv3.1 channel subunit. Analogous to the role of BK channels in excitatory neurons, Kv3 channels are important for the recurrent high-frequency discharge in interneurons by accelerating the downward hyperpolarization of the individual action potential. Various Kcnc1 mutations are associated with developmental epileptic encephalopathy, but the effect of a recurrent A421V mutation was somewhat controversial and its influence on neuronal excitability has not been fully established. In order to determine the neurological deficits and underlying disease mechanisms, the authors generated cre-dependent KI mice and characterized them using neonatal neurological examination, high-quality in vitro electrophysiology, and in vivo imaging/electrophysiology analyses. These analyses revealed excitability defects in the PV+ inhibitory neurons associated with the emergence of epilepsy and premature death. Overall, the experimental data convincingly support the conclusion.

      Strengths:

      The study is well-designed and conducted at high quality. The use of the Cre-dependent KI mouse is effective for maintaining the mutant mouse line with premature death phenotype, and may also minimize the drift of phenotypes which can occur due to the use of mutant mice with minor phenotype for breeding. The neonatal behavior analysis is thoroughly conducted, and the in vitro electrophysiology studies are of high quality.

      Weaknesses:

      While not critically influencing the conclusion of the study, there are several concerns.

      In some experiments, the age of the animal in each experiment is not clearly stated. For example, the experiments in Figure 2 demonstrate impaired K+ conductance and membrane localization, but it is not clear whether they correlated with the excitability and synaptic defects shown in subsequent figures. Similarly, it is unclear at what age the mice underwent EEG recordings, and whether non-epileptic mice are younger than those with seizures.

      The trafficking defect of mutant Kv3.1 proposed in this study is based only on the fluorescence density analysis which showed a minor change in membrane/cytosol ratio. It is not very clear how the membrane component was determined (any control staining?). In addition to fluorescence imaging, the addition of biochemical analysis would make the conclusion more convincing (while it might be challenging if Kv3.1 is expressed only in PV+ cells).

      While the study focused on the superficial layer because Kv3.1 is the major channel subunit, the PV+ cells in the deeper cortical layer also express Kv3.1 (Chow et al., 1999) and they may also contribute to the hyperexcitable phenotype via negative effect on Kv3.2; the mutant Kv3.1 may also block membrane trafficking of Kv3.1/Kv3.2 heteromers in the deeper layer PV cells and reduce their excitability. Such an additional effect on Kv3.2, if present, may explain why the heterozygous A421V KI mouse shows a more severe phenotype than the Kv3.1 KO mouse (and why they are more similar to Kv3.2 KO). Analyzing the membrane excitability differences in the deep-layer PV cells may address this possibility.

      In Table 1, the A421V PV+ cells show a more depolarized resting membrane potential than WT by ~5 mV, which seems a robust change and would influence the circuit excitability. The authors measured firing frequency after adjusting the membrane voltage to -65 mV, but are the excitability differences less significant if the resting potential is not adjusted? It is also interesting that such a membrane potential difference is not detected in young adult mice (Table 2). This loss of potential compensation may be important for developmental changes in the circuit excitability. These issues can be more explicitly discussed.

    4. Reviewer #3 (Public review):

      Summary:

      Here Wengert et al. establish a rodent model of KCNC1 (Kv3.1) epilepsy by introducing the A421V mutation. The authors perform video-EEG, slice electrophysiology, and in vivo 2P imaging of calcium activity to establish disease mechanisms involving impairment in the excitability of fast-spiking parvalbumin (PV) interneurons in the cortex and thalamic PV cells.

      Outside-out nucleated patch recordings were used to evaluate the biophysical consequence of the A421V mutation on potassium currents and showed a clear reduction in potassium currents. Similarly, action potential generation in cortical PV interneurons was severely reduced. Given that both potassium currents and action potential generation were found to be unaffected in excitatory pyramidal cells in the cortex the authors propose that loss of inhibition leads to hyperexcitability and seizure susceptibility in a mechanism similar to that of Dravet Syndrome.

      Strengths:

      This manuscript establishes a new rodent model of KCNC1-developmental and epileptic encephalopathy. The manuscript provides strong evidence that parvalbumin-type interneurons are impaired by the A421V Kv3.1 mutation and that cortical excitatory neurons are not impaired. Together these findings support the conclusion that seizure phenotypes are caused by reduced cortical inhibition.

      Weaknesses:

      The manuscript identifies a partial mechanism of disease that leaves several aspects unresolved including the possible role of the observed impairments in thalamic neurons in the seizure mechanism. Similarly, while the authors identify a reduction in potassium currents and a reduction in PV cell surface expression of Kv3.1 it is not clear why these impairments would lead to a more severe disease phenotype than other loss-of-function mutations which have been characterized previously. Lastly, additional analysis of video-EEG data would be helpful for interpreting the extent of the seizure burden and the nature of the seizure types caused by the mutation.

    5. Author response:

      Reviewer #1 (Public review):

      Weaknesses:

      (1) The assertion that membrane trafficking is impaired by this variant could be bolstered by additional data.

      We agree with this comment and will perform additional analysis and experiments to support the assertion that membrane trafficking is impaired. As noted by the Reviewers, standard biochemical approaches to obtain such data may be challenging due to the fact that Kv3.1 is expressed in only a subset of cells and that we do not have a Kv3.1-A421V specific antibody.

      (2) In some experiments details such as the age of the mice or cortical layer are emphasized, but in others, these details are omitted.

      We appreciate that the Reviewer has noted this omission. We will include such details in the resubmission.

      (3) The impairments in PV neuron AP firing are quite large. This could be expected to lead to changes in PV neuron activity outside of the hypersynchronous discharges that could be detected in the 2-photon imaging experiments, however, a lack of an effect on PV neuron activity is only loosely alluded to in the text. A more formal analysis is lacking. An important question in trying to understand mechanisms underlying channelopathies like KCNC1 is how changes in membrane excitability recorded at the whole cell level manifest during ongoing activity in vivo. Thus, the significance of this work would be greatly improved if it could address this question.

      Yes, the impairments in neocortical PV-IN excitability are more marked than any other PV interneuronopathy that we have studied. We will include a more extensive analysis of the 2-photon imaging data in the resubmission. However, there are limitations to the inferences that can be made as to firing patterns based on 2-photon calcium imaging data, particularly for interneurons.

      (4) Myoclonic jerks and other types of more subtle epileptiform activity have been observed in control mice, but there is no mention of littermate control analyzed by EEG.

      We did not observe myoclonic jerks in control mice. This data will be included in the resubmission.

      Reviewer #2 (Public review):

      Weaknesses:

      In some experiments, the age of the animal in each experiment is not clearly stated. For example, the experiments in Figure 2 demonstrate impaired K+ conductance and membrane localization, but it is not clear whether they correlated with the excitability and synaptic defects shown in subsequent figures. Similarly, it is unclear at what age the mice underwent EEG recordings, and whether non-epileptic mice are younger than those with seizures.

      We will include explicit information as to the age of the animals used for each experiment in the resubmission.

      The trafficking defect of mutant Kv3.1 proposed in this study is based only on the fluorescence density analysis which showed a minor change in membrane/cytosol ratio. It is not very clear how the membrane component was determined (any control staining?). In addition to fluorescence imaging, the addition of biochemical analysis would make the conclusion more convincing (while it might be challenging if Kv3.1 is expressed only in PV+ cells).

      We will include additional information in the Methods section as to how the membrane component was determined in a revised version of the manuscript. We agree with Reviewer #2 regarding the limitations in the ability to further evaluate this.

      While the study focused on the superficial layer because Kv3.1 is the major channel subunit, the PV+ cells in the deeper cortical layer also express Kv3.1 (Chow et al., 1999) and they may also contribute to the hyperexcitable phenotype via negative effect on Kv3.2; the mutant Kv3.1 may also block membrane trafficking of Kv3.1/Kv3.2 heteromers in the deeper layer PV cells and reduce their excitability. Such an additional effect on Kv3.2, if present, may explain why the heterozygous A421V KI mouse shows a more severe phenotype than the Kv3.1 KO mouse (and why they are more similar to Kv3.2 KO). Analyzing the membrane excitability differences in the deep-layer PV cells may address this possibility.

      We will include recordings from PV-INs in deeper layers of the neocortex in the revised version of the manuscript, as requested.

      In Table 1, the A421V PV+ cells show a more depolarized resting membrane potential than WT by ~5 mV, which seems a robust change and would influence the circuit excitability. The authors measured firing frequency after adjusting the membrane voltage to -65 mV, but are the excitability differences less significant if the resting potential is not adjusted? It is also interesting that such a membrane potential difference is not detected in young adult mice (Table 2). This loss of potential compensation may be important for developmental changes in the circuit excitability. These issues can be more explicitly discussed.

      We will include a more thorough discussion of this finding in the revised version of the manuscript. However, we do not completely understand this finding. It could be compensatory, as suggested by the Reviewer; however, it is transient and seems to be an isolated finding (i.e., there does not appear to be parallel “compensation” in other properties). Alternatively, it could be that impaired excitability of the Kcnc1-A421V/+ PV-INs may reflect impaired/delayed development, which itself is known to be activity-dependent.

      Reviewer #3 (Public review):

      Weaknesses:

      The manuscript identifies a partial mechanism of disease that leaves several aspects unresolved including the possible role of the observed impairments in thalamic neurons in the seizure mechanism. Similarly, while the authors identify a reduction in potassium currents and a reduction in PV cell surface expression of Kv3.1 it is not clear why these impairments would lead to a more severe disease phenotype than other loss-of-function mutations which have been characterized previously. Lastly, additional analysis of video-EEG data would be helpful for interpreting the extent of the seizure burden and the nature of the seizure types caused by the mutation.

      We agree with this comment. We studied neurons in the reticular thalamus as these cells are known to express Kv3.1 and are linked to epilepsy pathogenesis. Yet, we focused on neocortical PV-INs over other Kv3.1-expressing neurons such as neurons of the reticular thalamus because we evaluated the impairments of intrinsic excitability to be more profound in neocortical PV-INs. Crossing Kcnc1-Flox(A421V)/+ mice to a cerebral cortex interneuron-specific driver that would avoid recombination in thalamus – such as Ppp1r2-Cre (RRID:IMSR_JAX:012686) – could assist in determining the relative contribution of thalamic reticular nucleus dysfunction to the overall phenotype, as performed by Makinson et al (2017) to address a similar question. There are of course other Kv3.1-expressing neurons in the brain, including GABAergic interneurons in hippocampus and amygdala. We will include additional discussion in a revised version of the manuscript as to why we think there is more severe impairment in our Kcnc1-Flox(A421V)/+ mice relative to Kv3.1 and Kv3.2 knockout mice. We will include additional data on the epilepsy phenotype in the revised version of the manuscript, as requested.

    1. eLife Assessment

      This important study by Bi and colleagues employed a clever genetics screen to uncover the role of the GidB rRNA methylase in translation fidelity, under certain conditions, in Mycobacterium smegmatis. The findings are solid, supporting the conclusions, but the structural analyses lack the necessary rigor and depth to provide a clear mechanism. The work will be of interest to microbiologists.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Javid and colleagues worked to understand the molecular mechanisms involved in mistranslation in mycobacteria. They had previously discovered that mistranslation is an important mechanism underlying antibiotic tolerance in mycobacteria. Using a clever genetic screen they identify that deletion of gidB, a 16S ribosomal RNA methyltransferase, leads to lowered mistranslation (i.e. higher translational fidelity), but only in genetic backgrounds or environmental conditions that increase mistranslation rates.

      Strengths:

      The strengths of this manuscript are the clever genetic screen, the powerful mistranslation assays, and the clear writing and figures explaining a complex biological problem. Their identification of gidB as a factor important for mistranslation deepens our knowledge about this interesting phenomenon.

      Weaknesses:

      The structural work at the end feels like an afterthought in terms of both the science and the writing. I would suggest re-writing that section to be clearer about what the figure says and does not say. For example, the caption of Figure 6 appears to be more informative than the text and refers to concepts not present in the main text. In general, I found this section to be the most difficult to understand.

    3. Reviewer #2 (Public review):

      Summary:

      Protein synthesis - translation - involves repeated recognition and incorporation of amino-acyl-tRNAs by the ribosome. This process is a trade-off between the rate and accuracy of selection (for review see (Johansson et al, 2008; Wohlgemuth et al, 2011)). The ribosome does not just maximise the rate or the accuracy, it balances the two. Therefore, it is possible to select mutants that translate faster than the wt (but are sloppy) or that are very accurate (more than the wt) but translate slower. Slow translation is detrimental as it limits the rate of protein synthesis (and, therefore, growth), while error-prone mutants accumulate mistranslated proteins, which is detrimental for the cell.

      Bi and colleagues employ genetics, MIC measurements, reporter assays, and structural biology to characterise the role of GidB rRNA methylase in translational accuracy in Mycobacterium smegmatis.

      Strengths:

      The genetics and phenotypic assays are convincing and establish the biological role of the methylase. The authors use a powerful set of complementary assays that convincingly demonstrate that the loss of GidB affects mistranslation rates.

      Weaknesses:

      (1) It would be essential to provide information regarding the growth rate and, ideally, translation rates in the gidB KO and the isogenic WT. As translation balances accuracy and speed, only characterising the accuracy is not sufficient to understand the phenomenon.

      (2) Cryo-EM analysis of vacant 70S ribosomes is not sufficient for understanding the mechanisms underlying the accuracy defects in the gidB KO. One should assemble and solve structurally near-cognate and non-cognate complexes. I believe the authors are over-interpreting the scant structural data they have. Furthermore, current representation makes it impossible to assess the resolution of the structure, especially in the areas of interest.

      References:

      Johansson M, Lovmar M, Ehrenberg M (2008) Rate and accuracy of bacterial protein synthesis revisited. Curr Opin Microbiol 11: 141-147.

      Wohlgemuth I, Pohl C, Mittelstaet J, Konevega AL, Rodnina MV (2011) Evolutionary optimization of speed and accuracy of decoding on the ribosome. Philos Trans R Soc Lond B Biol Sci 366: 2979-2986.

    4. Author response:

      We thank Dr. Ealand and the Reviewers for their thoughtful comments on our submitted manuscript. We are in the process of revising our manuscript in light of the comments received, as outlined below.

      In addition to the requested revisions, we have new data with M. tuberculosis strain H37Rv +/- gidB deletion (and complementation), confirming that deletion of gidB sensitizes the strain to rifampicin, and extending our findings to pathogenic tuberculosis. This will also be incorporated into the revised manuscript.

      Reviewer #1:

      (1) The structural work at the end feels like an afterthought in terms of both the science and the writing. I would suggest re-writing that section to be clearer about what the figure says and does not say. For example, the caption of Figure 6 appears to be more informative than the text and refers to concepts not present in the main text. In general, I found this section to be the most difficult to understand.

      We are rewriting this section to make it more coherent with the rest of the manuscript.

      (2) "delta-gidB" is written out in the caption of Figure 6. Line 234: gidB not italics.

      Thank you, these changes will be incorporated in the revised manuscript.

      Reviewer #2:

      (1) It would be essential to provide information regarding the growth rate and, ideally, translation rates in the gidB KO and the isogenic WT. As translation balances accuracy and speed, only characterising the accuracy is not sufficient to understand the phenomenon.

      We are performing these assays and will incorporate them in the revised manuscript.

      (2) Cryo-EM analysis of vacant 70S ribosomes is not sufficient for understanding the mechanisms underlying the accuracy defects in the gidB KO. One should assemble and solve structurally near-cognate and non-cognate complexes. I believe the authors are over-interpreting the scant structural data they have. Furthermore, current representation makes it impossible to assess the resolution of the structure, especially in the areas of interest.

      While we agree with the Reviewer that structures of translating ribosomes will be most informative in elucidating the molecular mechanism(s) by which methylation (or not) by GidB contributes to mistranslation, those experiments are ongoing and beyond the scope of the current study. Unlike E. coli ribosomes, for which a plethora of mutant structures are available, there are very few structures of mycobacterial ribosomes beyond wild-type apo ribosomes. Therefore, we feel that the structures of apo mycobacterial ribosomes +/- GidB-mediated methylation are still of value, and a necessary “first step” for the mechanistic work alluded to above. Secondly, the apo ribosome structures still hint at potential mechanisms by which mistranslation and 16S rRNA methylation may impact each other. As in the comments to R#1 above, we are revising the text to increase the clarity and coherence of this section.

    1. eLife Assessment

      The study follows up on previous work suggesting that lower glucose concentrations are protective from sepsis but put the patient at risk for hypoglycemia. In this paper, the authors identify that a slightly higher dose of glucose is still protective but no longer puts the patients at risk for hypoglycemia. The study is important, supported by convincing data, and will be of interest to a broad audience.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors follow up on their published observation that providing a lower glucose parenteral nutrition (PN) reduces sepsis from a common pathogen [Staphylococcus epidermidis (SE)] in preterm piglets. Here they found that a slightly higher dose of glucose could thread the needle and get the protective effects of low glucose without incurring significant hypoglycemia. They then investigate whether the change to low glucose PN impacts metabolism to confer this benefit. The finding that lower glucose reduces sepsis is important as sepsis is a major cause of morbidity and mortality in preterm infants, and adjusting PN composition is a feasible intervention.

      Strengths:

      (1) They address a highly significant problem of neonatal sepsis in preterm infants using a preterm piglet model.

      (2) They have compelling data in this paper (and in a previous publication, ref 27) that low glucose PN confers a survival advantage. A downside of the low glucose PN is hypoglycemia, which they mitigate in this paper by using a slightly higher amount of glucose in the PN.

      (3) The experiment where they change PN from high to low glucose after infection is very important to determine if this approach might be used clinically. Unfortunately, this did not show an ability to reduce sepsis risk with this approach.

      (4) They produce an impressive multiomics data set from this model of preterm piglet sepsis which is likely to provide additional insights into the pathogenesis of preterm neonatal sepsis.

      Weaknesses:

      (1) Piglets on the low glucose PN had consistently lower density of SE (~1 log) across all timepoints. This may be due to changes in immune response leading to better clearance or it could be due to slower growth in lower glucose environment. These possibilities are not fully disentangled in this study.

      (2) Many differences in the different omics (transcriptomics, metabolomics, proteomics) were identified in the SE-LOW vs SE-HIGH comparison. Since the bacterial load is very different between these conditions, could the changes be due to bacterial load rather than metabolic reprogramming from the low glucose PN? The authors argue in supplementary figure 1F that the density of SE in blood does not correlate with sepsis, implying that bacterial load is not the driver of outcome. The authors recently published some additional analysis that may be helpful to reference in this manuscript.

      (3) Further, expanding upon a model to better understand the complex relationship between differences in supplemental glucose infusion, blood glucose levels, bacterial load, host responses and how they impact the development of sepsis would be helpful. These complex relationships are difficult to fully disentangle, but one could consider infusing the same quantity of heat-killed bacteria under different glucose conditions to see if the glucose levels drive outcomes independently of bacterial burden.

    3. Reviewer #2 (Public review):

      The authors demonstrate that a low parenteral glucose regimen can lead to improved bacterial clearance and survival from Staph epi sepsis in newborn pigs without inducing hypoglycemia, as compared to a high glucose regimen. Using RNA-seq, metabolomic, and proteomic data, the authors conclude that this is primarily mediated by altered hepatic metabolism.

      The authors have addressed the concerns raised by the reviewers in their revised manuscript and have added additional information in the results and discussion sections.

      Please address the following in Fig. 3: the genes PGM2 and GCK, which the authors report as downregulated in SE-Low compared to SE-High, are actually less downregulated in the SE groups than in the Control groups, where Con-Low shows an even greater decrease in these genes compared to Con-High. So, if anything, these genes are being upregulated by infection.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their published observation that providing a lower glucose parenteral nutrition (PN) reduces sepsis from a common pathogen [Staphylococcus epidermidis (SE)] in preterm piglets. Here they found that a higher dose of glucose could thread the needle and get the protective effects of low glucose without incurring significant hypoglycemia. They then investigate whether the change to low glucose PN impacts metabolism to confer this benefit. The finding that lower glucose reduces sepsis is important as sepsis is a major cause of morbidity and mortality in preterm infants, and adjusting PN composition is a feasible intervention.

      Strengths:

      (1) They address a highly significant problem of neonatal sepsis in preterm infants using a preterm piglet model.

      (2) They have compelling data in this paper (and in a previous publication, ref 27) that low glucose PN confers a survival advantage. A downside of the low glucose PN is hypoglycemia, which they mitigate in this paper by using a slightly higher amount of glucose in the PN.

      (3) The experiment where they change PN from high to low glucose after infection is very important to determine if this approach might be used clinically. Unfortunately, this did not show an ability to reduce sepsis risk with this approach. Perhaps this is due to the much lower mortality in the high glucose group (~20% vs 87% in the first figure).

      (4) They produce an impressive multiomics data set from this model of preterm piglet sepsis which is likely to provide additional insights into the pathogenesis of preterm neonatal sepsis.

      Weaknesses:

      (1) The high glucose control gives very high blood glucose levels (Figure 1C). Is this the best control for typical PN and glucose control in preterm neonates? Is the finding that low glucose is protective or high glucose is a risk factor for sepsis?

      This work is a follow-up to our previous work, where we explored different PN glucose regimens. Taken together, our experiments strongly imply that glucose provision is associated with severity in a seemingly linear manner. In the clinical setting, there is no fixed glucose provision, but guidelines specify ranges that are acceptable. However, these guidelines do not take possible infections into account and are designed to optimize growth outcomes. Increased provision of glucose to preterm neonates may therefore increase their infection risk, but parenteral glucose cannot be entirely avoided as this would lead to hypoglycaemia and associated brain damage. In the present paper, the reduced glucose PN reflects the lowest end of the recommended PN glucose intake. More work is needed to determine the best glucose provision for infected preterm newborns, balancing positive and negative factors.

      (2) In Figure 1B, preterm piglets provided the high glucose PN have 13% survival while preterm piglets on the same nutrition in Figure 6B have ~80% survival. Were the conditions indeed the same? If so, this indicates a large amount of variation in the outcome of this model from experiment to experiment.

      In the follow-up experiment outlined in Figure 6, we reduced the follow-up time to 12 hours in an effort to minimize the suffering of the animals. We did this because we could detect relevant differences in the immune response between high and low glucose infected pigs at 12 hours. If we had extended the follow-up experiment to 22 hours, we would likely have seen much higher mortality.

      (3) Piglets on the low glucose PN had consistently lower density of SE (~1 log) across all time points. This may be due to changes in immune response leading to better clearance or it could be due to slower growth in a lower glucose environment.

      We agree with this assessment and have adjusted our result section to reflect this.

      (4) Many differences in the different omics (transcriptomics, metabolomics, proteomics) were identified in the SE-LOW vs SE-HIGH comparison. Since the bacterial load is very different between these conditions, could the changes be due to bacterial load rather than metabolic reprogramming from the low glucose PN?

      We analyzed the relationship between bacterial burdens and mortality and found that they did not correlate within each of the treatment groups. We have now added this data to the results section as supplemental and report this fact in the section called “Reduced glucose supply increases hepatic OXPHOS and gluconeogenesis and attenuates inflammatory pathways”. This finding inspired us to further explore the relationship between bacterial burdens and infection responses in our model, which has resulted in our recent preprint: Wu et al. Regulation of host metabolism and defense strategies to survive neonatal infection. BioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate that a low parenteral glucose regimen can lead to improved bacterial clearance and survival from Staph epi sepsis in newborn pigs without inducing hypoglycemia, as compared to a high glucose regimen. Using RNA-seq, metabolomic, and proteomic data, the authors conclude that this is primarily mediated by altered hepatic metabolism.

      Strengths:

      Well-defined controls for every time point, with multiple time points and biological replicates. The authors used different experimental strategies to arrive at the same conclusion, which lends credibility to their findings. The authors have published the negative findings associated with their study, including the inability to reverse sepsis-related mortality after switching from SE-high to SE-low at 3h or 6h and after administration of hIAIP.

      Weaknesses:

      (1) The authors mention, and it is well-known, that Staph epi is primarily involved in late-onset sepsis. The model of S. epi sepsis used in this study clearly replicates early-onset sepsis, but S. epi is extremely rare in this time period. How do the authors justify the clinical relevance of this model?

      The distinction between early and late onset sepsis makes sense clinically because they are likely to be caused by different organisms and therefore require different empirical antibiotic regimens. Early onset sepsis is caused by organisms transferred perinatally, often following chorioamnionitis or urogenital maternal infections (Strep. agalactiae/E. coli), whereas late onset sepsis is likely caused by organisms from indwelling catheters or mucosal surfaces, most often coagulase-negative staphylococci. Timing of an infection after birth of course plays a role, but the virulence factors of the pathogen probably play a large role in shaping the immune response. Therefore, even though the infection in our model is initiated on the first day after birth, the organism that we use, Staph epidermidis, makes it a better model for the pathogenesis of late onset sepsis. However, it is also important to acknowledge that the pathophysiology of “sepsis” may be similar regardless of timing and pathogen, and depends on the degree of immune activation and downstream effects on organs.

      (2) The authors find that the neutrophil subset of the leukocyte population is diminished significantly in the SE-low and SE-high populations. However, they conclude on page 10 that "modulations of hepatic, but not circulating immune cell metabolism, by reduced glucose supply..." and this is possible because the authors have looked at the entire leukocyte transcriptome. I am curious about why the authors did not sequence the neutrophil-specific transcriptome.

      We collected the whole blood transcriptome during the experiments, which reflects the transcription profile of all the circulating leucocytes. Since we did not do single cell RNA sequencing during the experiment, there is no possibility of isolating the neutrophil transcriptome at this time. Your point, however, is valid and we will consider incorporating single cell transcriptomics in future experiments.

      (3) The authors use high (30g/k/d) and low (7.2g/k/d) glucose regimens. These translate into a GIR of 21 and 5 mg/k/min respectively. A normal GIR for a preterm infant is usually 5-8, and sometimes up to 10. Do the authors have a "safe GIR" or a threshold they think we cannot cross? Maybe a point where the metabolism switch takes place? They do not comment on this, especially as GIR and glucose levels are continuous variables and not categorical.
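      The conversion between the daily glucose doses and the GIR values quoted above can be checked with a short calculation. This is a minimal sketch that assumes the daily dose is delivered as a continuous infusion spread evenly over 24 hours; the function name `gir_mg_per_kg_min` is hypothetical and used only for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1440 minutes, assuming a continuous 24 h infusion


def gir_mg_per_kg_min(glucose_g_per_kg_day: float) -> float:
    """Convert a daily glucose dose (g/kg/day) to an infusion rate (mg/kg/min)."""
    return glucose_g_per_kg_day * 1000 / MINUTES_PER_DAY


# The two regimens named in the comment above.
for label, dose in [("high", 30.0), ("low", 7.2)]:
    print(f"{label}: {dose} g/kg/d -> {gir_mg_per_kg_min(dose):.1f} mg/kg/min")
```

      Under that assumption, the high regimen works out to roughly 21 mg/kg/min and the low regimen to 5 mg/kg/min, matching the figures given in the comment.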

      Our reduced glucose PN was chosen as it corresponded with the low end of recommended guidelines for PN glucose intake. There likely is not a “safe GIR”, as the clinical responses to glucose intake during infections do not seem binary but increase with glucose intake. It is also important to remember that the reduced glucose intervention still resulted in significant morbidity and a 25% mortality within 22 hours. There is therefore still vast room for improvement, but even though further reduction in PN glucose would probably provide further protection, it would entail dangerous hypoglycaemia (as described in our previous paper). The findings in this current paper have prompted us to explore several strategies to replace parenteral glucose with alternative macronutrients. Thus, the optimal PN for infected newborns would probably differ from standard PN in all macronutrients and will require much more preclinical and clinical research.

      (4) In Figures 2B and C the authors show that SE-high and SE-low animals have differences in the oxphos, TCA, and glycolytic pathways. The authors themselves comment in the Supplementary Table S1B, E-F that these same metabolic pathways are also different in the Con-Low and Con-high animals, it is just the inflammatory pathways that are not different in the non-infected animals. How can they then justify that it is these metabolic pathways specifically which lead to altered inflammatory pathways, and not just the presence of infection along with some other unfound mechanism?

      It is to be expected that the inflammatory pathways do not differ between the Con-Low and Con-High groups as there is no infection to induce these pathways. The identified metabolic pathways that differ between SE-High and SE-Low animals seem to us the best explanation of the differences in clinical phenotype.

      (5) The authors mention in Figure 1F that SE-low animals had lower bacterial burdens than SE-high animals, but then go on to infer that the inflammatory cytokine differences are attributed to a rewiring of the immune response. However, they have not normalized the cytokine levels to the bacterial loads, as the differences in the cytokines might be attributed purely to a difference in bacterial proliferation/clearing.

      Please see our response to reviewer #1

      (6) The authors mention that switching from SE-high to SE-low at 3 or 6 h time points does not reduce mortality. Have the authors considered the reverse? Does hyperglycemia after euglycemia initially, worsen mortality? That would really conclude that there is some metabolic reprogramming happening at the very onset of sepsis and it is a lost battle after that.

      This is a very good point that we have not explored yet. We have added this consideration to the discussion and slightly amended our conclusions regarding this follow-up experiment.

      Reviewer #3 (Public Review):

      Summary:

      Baek and colleagues present important follow-up work on the role of serum glucose in the management of neonatal sepsis. The authors previously showed high glucose administration exacerbated neonatal sepsis, while strict glucose control improved outcomes but caused hypoglycemia. In the current report they examined the effect of a more tailored glucose management approach on outcomes and examined hepatic gene expression, plasma metabolome/proteome, blood transcriptome, as well as the therapeutic impact of hIAIP. The authors leverage multiple powerful approaches to provide robust descriptive accounts of the physiologic changes that occur with this model of sepsis in these various conditions.

      Strengths:

      (1) Use of preterm piglet model.

      (2) Robust, multi-pronged approach to address both hepatic and systemic implications of sepsis and glucose management.

      (3) Trial of therapeutic intervention - glucose management (Figure 6), hIAIP (Figure 7).

      Weaknesses:

      (1) The translational role of the model is in question. CONS is rarely if ever a cause of EOS in preterm neonates. The model uses preterm pigs exposed at 2 hours of age. This model most likely replicates EOS.

      Please see our response to Reviewer #2

      (2) Throughout the manuscript it is difficult to tell from which animals the data are derived. Given the ~90% mortality in the experimental CONS group, and 25% mortality in the intervention group, how are the data from animals "at euthanasia" considered? Meaning - are data from survivors and those euthanized grouped together? This should be clarified as biologically these may be very different populations (ie, natural survivor vs death).

      This is a very valid point. For all endpoints that are analyzed “at euthanasia” the age of the animal will vary. Some will have been euthanized early due to clinical deterioration and some will have survived all the way to the end of the experiment. This needs to be kept in mind when interpreting the results. We have further highlighted this point in the discussion and made it clear to the reader at what time-point each analysis was performed.

      (3) With limited time points (at euthanasia) for hepatic transcriptomics (Figure 2), plasma metabolite (Figure 3), blood transcriptome (Figure 4), and plasma proteome (Figure 5), it is difficult to make conclusions regarding mechanisms preceding euthanasia. Per methods, animals were euthanized with acidosis or clinical decompensation. Are the reported findings demonstrative of end-organ failure and deterioration leading to death, or reflective of events prior?

      Yes, all organ-specific endpoints are snapshots of the state of the animals at the time of euthanasia, pooling together animals that succumbed to sepsis and those that survived to 22 hours post infection. These results therefore reflect the end-state of the infection, and we cannot be sure when the differences between groups manifested themselves. However, given the stark differences in plasma lactate at 12 hours post infection, it is likely that changes to metabolism occurred before most of the animals succumbed to sepsis.

      We agree this is a weakness in our model, but we have since published a pre-print where we have further explored how metabolic adaptations shape the fate of similarly infected preterm pigs: BioRxiv 2024.02.23.581534; doi: https://doi.org/10.1101/2024.02.23.581534

      (4) Data are descriptive without corresponding "omics" from interventions (glucose management and/or hIAIP) or at least targeted assessment of key differences.

      We only did in-depth analysis of the glucose intervention, as this showed the most promising clinical effects that warranted further investigation. It is possible that further insights could be gained from in-depth analysis of the other interventions, but given that there were no obvious clinical benefits we refrained from that.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I am intrigued that mortality was not correlated to bacterial burden. Please provide the "data not shown" as this would help the reader understand better whether the difference in bacterial burden is driving the phenotypes and findings of the low glucose group.

      We have added this data to supplementary figure 1.  

      Reviewer #2 (Recommendations For The Authors):

      (1) I would urge the authors to consider a neutrophil-specific transcriptomic analysis. I understand that this would add significantly to the resubmission process. If the authors wish to include that as a future direction instead, they need to specifically mention the limitations of whole blood transcriptomics and how different immune cell types react differently to bacterial antigens.

      We agree with your considerations, but we cannot include that data using the whole blood method applied in the experiment. We have added your consideration to the discussion.

      (2) I urge the authors to remove any impression that this is a model of late-onset sepsis, which is implied from the introduction, lines 3 and 4.

      Our intention was not to directly suggest that our model is a perfect reflection of late-onset sepsis but rather to highlight the relevance of using a pathogen commonly associated with LOS. We believe our model primarily captures the effects of intense pro-inflammatory immune activation, which may have parallels with various forms of sepsis, including LOS.

      Reviewer #3 (Recommendations For The Authors):

      Drawing on the robust nature of your "omics", identify key measures and test whether they are altered earlier in the development of clinical sepsis. Test whether these are altered by the intervention.

      This is a very valid point, but at the moment it is not possible for us to explore this within the confines of these experiments. However, building upon these findings and the ones in our recent preprint, we are confident that shifts in the hepatic ratio of oxidative phosphorylation and gluconeogenesis vs glycolysis shape the immune response to infections in neonates. In our upcoming experiments we are planning to incorporate plasma metabolomics at earlier timepoints to monitor when shifts in metabolism occur. However, given the heterogeneity of pigs, as opposed to inbred rodent models, data from animals sacrificed at fixed timepoints to gauge their organ function will be hard to interpret, as it is impossible to know what the end state of a particular animal would have been. Likewise, longitudinal sampling of liver tissue during the course of infection would be challenging.

    1. eLife Assessment

      In this important study, significant advancements are made in how cell division in Chlamydia trachomatis, lacking FtsZ, is mediated. With the careful use of fluorescence microscopy and genetic tools, the evidence identifying the DNA translocase, FtsK, as an early and essential component of the divisome, is convincing. As this role is distinct from what has been found in most other bacteria, this study will be of broad interest to microbiologists and molecular biologists.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, Harpring et al. investigated divisome assembly in Chlamydia trachomatis serovar L2 (Ct), an obligate intracellular bacterium that lacks FtsZ, the canonical master regulator of bacterial cell division. They find that divisome assembly is initiated by the protein FtsK in Ct by showing that it forms discrete foci at the septum and future division sites. Additionally, knocking down ftsK prevents divisome assembly and inhibits cell division, further supporting their hypothesis that FtsK regulates divisome assembly. Finally, they show that MreB is one of the last chlamydial divisome proteins to arrive at the site of division and is necessary for the formation of septal peptidoglycan rings but does not act as a scaffold for division assembly as previously proposed.

      Strengths:

      The authors use microscopy to clearly show that FtsK forms foci both at the septum as well as at the base of the progenitor cell where the next septum will form. They also show that the Ct proteins PBP2, PBP3, MreC, and MreB localize to these same sites suggesting they are involved in the divisome complex.

      Using CRISPRi the authors knock down ftsK and find that most cells are no longer able to divide and that PBP2 and PBP3 no longer localized to sites of division suggesting that FtsK is responsible for initiating divisome assembly. They also performed a knockdown of pbp2 using the same approach and found that this also mostly inhibited cell division. Additionally, FtsK was still able to localize in this strain, however PBP3 did not, suggesting that FtsK acts upstream of PBP2 in the divisome assembly process while PBP2 is responsible for the localization of PBP3.

      The authors also find that performing a knockdown of ftsK also prevents new PG synthesis further supporting the idea that FtsK regulates divisome assembly. They also find that inhibiting MreB filament formation using A22 results in diffuse PG, suggesting that MreB filament formation is necessary for proper PG synthesis to drive cell division.

      Overall the authors propose a new hypothesis for divisome assembly in an organism that lacks FtsZ and use a combination of microscopy and genetics to support their model that is rigorous and convincing. The finding that FtsK, rather than a cytoskeletal or "scaffolding" protein is the first division protein to localize to the incipient division site is unexpected and opens up a host of questions about its regulation. The findings will progress our understanding of how cell division is accomplished in bacteria with non-canonical cell wall structure and/or that lack FtsZ.

      Weaknesses:

      No major weaknesses were noted in the data supporting the main conclusions. However, there was a claim of novelty in showing that multiple divisome complexes can drive cell wall synthesis simultaneously that was not well-supported (i.e. this has been shown previously in other organisms). In addition, there were minor weaknesses in data presentation that do not substantially impact interpretation (e.g. presenting the number of cells rather than the percentage of the population when quantifying phenotypes and showing partial western blots instead of total western blots).

    3. Reviewer #2 (Public review):

      Summary:

      Chlamydial cell division is a peculiar event, whose mechanism was mysterious for many years. C. trachomatis division was shown to be polar and to involve a minimal divisome machinery composed of homologues of both divisome and elongasome components, in the absence of a homologue of the classical division organizer FtsZ. In this paper, Harpring et al. show that FtsK is required at an early stage of chlamydial divisome formation.

      Strengths:

      The manuscript is well-written and the results are convincing. Quantification of divisome component localization is well performed, number of replicas and number of cells assessed are sufficient to get convincing data. The use of a CRISPRi approach to knock down some divisome components is an asset and allows a mechanistic understanding of the hierarchy of divisome components.

      Weaknesses:

      The authors did not analyse the role of all potential chlamydial divisome components and did not show how FtsK may initiate the positioning of the divisome. Their conclusion that FtsK initiates the assembly of the divisome is an overinterpretation and is not backed by the data. However, the data show convincingly that FtsK, if perhaps not the initiator of chlamydial division, is definitely an early and essential component of the chlamydial divisome.

    4. Reviewer #3 (Public review):

      Summary:

      The obligate intracellular bacterium Chlamydia trachomatis (Ct) divides by binary fission. It lacks FtsZ, but still has many other proteins that regulate the synthesis of septal peptidoglycan, including FtsW and FtsI (PBP3) as well as divisome proteins that recruit and activate them, such as FtsK and FtsQLB. Interestingly, MreB is also required for the division of Ct cells, perhaps by polymerizing to form an FtsZ-like scaffold. Here, Harpring et al. show that MreB does not act early in division and instead is recruited to a protein complex that includes FtsK and PBP2/PBP3. This indicates that Ct cell division is organized by a chimera between conserved divisome and elongasome proteins. Their work also shows convincingly that FtsK is the earliest known step of divisome activity, potentially nucleating the divisome as a single protein complex at the future division site. This is reminiscent of the activity of FtsZ, yet fundamentally different.

      Strengths:

      The study is very well written and presented, and the data are convincing and rigorous. The proposed localization dependency order of the various cell division proteins is well justified by several different approaches using small molecule inhibitors, knockdowns, and fluorescent protein fusions. The proposed dependency pathway of divisome assembly is consistent with the data and with a novel mechanism for MreB in septum synthesis in Ct.

      Weaknesses:

      The paper could be improved by including more information about FtsK, the "focus" of this study. For example, if FtsK really is the FtsZ-like nucleator of the Ct divisome, how is the Ct FtsK different sequence-wise or structurally from FtsK of, e.g. E. coli? Is the N-terminal part of FtsK sufficient for cell division in Ct like it is in E. coli, or is the DNA translocase also involved in focus formation or localization? Addressing those questions would put the proposed initiator role of FtsK in Ct in a better context and make the conclusions more attractive to a wider readership.

      Another weakness is that the title of the paper implies that FtsK alone initiates divisome assembly. However, the data indicate only that FtsK is important at an early stage of divisome assembly, not that it is THE initiator. I suggest modifying the title to account for this--perhaps "FtsK is required to initiate....".

    1. eLife Assessment

      This valuable study uses steered molecular dynamics simulations to interrogate force transmission in the mechanosensitive NOMPC channel, which plays roles including soft-touch perception, auditory function, and locomotion. The finding that the ankyrin spring transmits force through torsional rather than compression forces may help understand the entire TRP channel family. The evidence is, however, considered to be still incomplete. It could be strengthened by testing how the channel responds to different twisting and compressional force magnitudes over longer simulation times to see a full gating motion, or to prove that the partial or initial motion observed relates to physiological gating. Experimental validation of reduced mechano-sensitivity through mutagenesis of proposed ankyrin/TRP domain coupling interactions would be best to enhance the manuscript.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript uses molecular dynamics simulations to understand how forces felt by the intracellular domain are coupled to the opening of the mechanosensitive ion channel NOMPC. The concept is interesting - as the only clearly defined example of an ion channel that opens due to forces on a tethered domain, the mechanism by which this occurs is yet to be fully elucidated. The main finding is that twisting of the transmembrane portion of the protein - specifically via the TRP domain that is conserved within the broad family of channels - is required to open the pore. That this could be a common mechanism utilised by a wide range of channels in the family, not just mechanically gated ones, makes the result significant. It is intriguing to consider how different activating stimuli can produce a similar activating motion within this family. However, the support for the finding can be strengthened, as the authors cannot yet exclude that other forces could open the channel if given longer or at different magnitudes. In addition, they do not see the full opening of the channel, only an initial dilation. Even if we accept that twist is essential for this, it may be that it is not sufficient for full opening, and other stimuli are required.

      Strengths:

      Demonstrating that rotation of the TRP domain is the essential requirement for channel opening would have significant implications for other members of this channel family.

      Weaknesses:

      The manuscript centres around 3 main computational experiments. In the first, a compression force is applied on a truncated intracellular domain and it is shown that this creates both a membrane normal (compression) and membrane parallel (twisting) force on the TRP domain. This is a point that was demonstrated in the authors' prior eLife paper - so the point here is to quantify these forces for the second experiment.

      The second experiment is the most important in the manuscript. In this, forces are applied directly to two residues on the TRP domain with either a membrane normal (compression) or membrane parallel (twisting) direction, with the magnitude and directions chosen to match those found in the first experiment. Only the twisting force is seen to widen the pore in the triplicate simulations, suggesting that twisting, but not compression, can open the pore. This result is intriguing and there appears to be a significant difference between the dilation of the pore with the two force directions. However, there are two caveats to this conclusion. The first is the magnitude of the forces: the twist force is larger than the applied normal force to match the result of experiment 1. However, it is possible that compression could also open the pore at the same magnitude or if given longer. It may be that twist acts faster or more easily, but I feel it is not yet possible to say it is the key and exclude the possibility that compression could do something similar. I also note that when force was applied to the AR domain in experiment 1, the pore widened more quickly than with the twisting force alone, suggesting that compression is doing something to assist with opening. Given that the forces are likely to be smaller under physiological conditions, it could still be critical to have both twist and compression present. As this is the central aspect of the study, I believe that examining how the channel responds to different force magnitudes could strengthen the conclusions, and I recommend that additional simulations be done to examine this.

      The second important consideration is that the study never sees a full pore opening, but rather a widening that is less than that seen in open-state structures of other TRP channels and insufficient for rapid ion currents. This is something the authors acknowledge in their prior manuscript in eLife 2021. While this may simply be due to the limited timescale of the simulations, it needs to be clearly stated as a caveat to the conclusions. Twist may be the key to getting this dilation, but we don't know if it is the key to full pore opening. To demonstrate that the observed dilation is a first step in pore opening, a structural comparison to open-state TRP channels would be beneficial, as it could provide evidence that this motion is along the expected pathway of channel gating.

      Experiment three considers the intracellular domain and determines the link between compression and twisting of the intracellular AR domain. In this case, the end of the domain is twisted and it is shown that the domain compresses, the converse to the similar study previously done by the authors in which compression of the domain was shown to generate torque. While some additional analysis is provided on the inter-residue links that help generate this, this is less significant than the critical second experiment.

    3. Reviewer #2 (Public review):

      This study uses all-atom MD simulation to explore the mechanics of channel opening for the NOMPC mechanosensitive channel. Previously the authors used MD to show that external forces directed along the long axis of the protein (normal to the membrane) result in AR domain compression and channel opening. This force causes two changes to the key TRP domains adjacent to the channel gate: 1) a compressive force pushes the TRP domain along the membrane normal, while 2) a twisting torque induces a clockwise rotation on the TRP domain helix when viewing the bottom of the channel from the cytoplasm. Here, the authors wanted to understand which of those two changes is responsible for increasing the inner pore radius, and they show that it is the torque. The simulations in Figure 2 probe this question with different forces, and we can see the pore open with parallel forces in the membrane, but not with the membrane-normal forces. I believe this result as it is reproducible, the timescales are reaching 1 microsecond, and the gate is clearly increasing in diameter to about 4 Å. This seems to be the most important finding in the paper, but the impact is limited since the authors have already shown how forces lead to channel opening, and this work further teases apart which forces and motions actually cause the opening.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript by Duan and Song interrogates the gating mechanisms, and specifically force transmission, in mechanosensitive NOMPC channels using steered molecular dynamics simulations. They propose that the ankyrin spring can transmit force to the gate through torsional forces, adding molecular detail to the force transduction pathways in this channel.

      Strengths:

      Detailed, rigorous simulations coupled with a novel model for force transduction.

      Weaknesses:

      Experimental validation of reduced mechanosensitivity through mutagenesis of proposed ankyrin/TRP domain coupling interactions would greatly enhance the manuscript. I have some additional questions documented below:

      (1) The membrane-parallel torsion force can open NOMPC
      How does the TRP domain interact with the S4-S5 linker? In the original structural studies, the coordination of lipids in this region seems important for gating. In this manner, do the TRP domain and S4-S5 linker combined act like an amphipathic helix, as suggested first for MscL (Bavi et al., 2016 Nature Communications) and later identified in many MS channels (Kefauver et al., 2020 Nature)?

      (2) Torsional forces on shorter ankyrin repeats of mammalian TRP channels
      Is it possible that torsional forces applied to the shorter ankyrin repeats of mammalian TRPs may also convey force in a similar manner?

      (3) Constant velocity or constant force
      For the SMD the authors write "and a constant velocity or constant force". It's unclear from this reviewer's perspective which is used to generate the simulation data.

    1. eLife Assessment

      This valuable study suggests that Naa10, an N-α-acetyltransferase with known mutations that disrupt neurodevelopment, acetylates Btbd3, which has been implicated in neurite outgrowth and obsessive-compulsive disorder, in a manner that regulates F-actin dynamics to facilitate neurite outgrowth. While the study provides promising insights and biochemical, co-immunoprecipitation, and proteomic data that enhance our understanding of protein N-acetylation in neuronal development, the evidence supporting larger claims is incomplete. Nonetheless, the implications of these findings are noteworthy, particularly regarding neurodevelopmental and psychiatric conditions tied to altered expression of Naa10 or Btbd3.

    2. Reviewer #1 (Public review):

      The manuscript examines the role of Naa10 in cKO animals, in immortalized neurons, and in primary neurons. Given that Naa10 mutations in humans produce defects in nervous system function, the authors used various strategies to try to find a relevant neuronal phenotype and its potential molecular mechanism.

      This work contains valuable findings that suggest that the depletion of Naa10 from CA1 neurons in mice exacerbates anxiety-like behaviors. Using neuron-derived cell lines, the authors establish a link between N-acetylase activity, Btbd3 binding to CapZb, and F-actin, ultimately impinging on neurite extension. The evidence demonstrating this is in most cases incomplete, either because some key controls are missing or not clearly described, or simply because claims are not supported by the data. The manuscript also contains biochemical, co-immunoprecipitation, and proteomic data that will certainly be of value to our knowledge of the effects of protein N-acetylation in neuronal development and function.

    3. Reviewer #2 (Public review):

      In this study, the authors sought to elucidate the neural mechanisms underlying the role of Naa10 in neurodevelopmental disruptions with a focus on its role in the hippocampus. The authors use an impressive array of techniques to identify a chain of events that occurs in the signaling pathway starting from Naa10 acetylating Btbd3 to regulation of F-actin dynamics that are fundamental to neurite outgrowth. They provide convincing evidence that Naa10 acetylates Btbd3, that Btbd3 facilitates CapZb binding to F-actin in a Naa10 acetylation-dependent manner, and that this CapZb binding to F-actin is key to neurite outgrowth. Besides establishing this signaling pathway, the authors contribute novel lists of Naa10 and Btbd3 interacting partners, which will be useful for future investigations into other mechanisms of action of Naa10 or Btbd3 through alternative cell signaling pathways. The evidence presented for an anxiety-like behavioral phenotype as a result of Naa10 dysfunction is mixed and tenuous, and assays for the primary behaviors known to be altered by Naa10 mutations in humans were not tested. As such, behavioral findings and their translational implications should be interpreted with caution. Finally, while not central to the main cell signaling pathway delineated, the characterization of brain region-specific and cell maturity of Naa10 expression patterns was presented in few to single animals and not quantified, and as such should also be interpreted with caution. On a broader level, these findings have implications for neurodevelopment and potentially, although not tested here, synaptic plasticity in adulthood, which means this novel pathway may be fundamental for brain health.

      Summarized list of minor concerns

      (1) The early claims of the manuscript are supported by very small sample sizes (often 1-3) and/or lack of quantification, particularly in Figures S1 and 1.

      (2) Evidence is insufficient for CA1-specific knockdown of Naa10.

      (3) The relationship between the behaviors measured, which centered around mood, and Ogden syndrome, was not clear, and likely other behavioral measures would be more translationally relevant for this study. Furthermore, the evidence for an anxiety-like phenotype was mixed.

      (4) Btbd3 is characterized by the authors as an OCD risk gene, but its status as such is not well supported by more recent genome-wide association studies, which are better powered than the one that originally implicated Btbd3. However, there is evidence that Btbd3 expression, including selectively in the hippocampus, is implicated in OCD-relevant behaviors in mice.

      (5) The reporting of the statistics lacks sufficient detail for the reader to deduce how experimental replicates were defined.

    1. eLife Assessment

      Maloney et al. offer an important contribution to understanding the potential ecological mechanisms behind individual behavioral variation. By providing compelling theoretical data and convincing experimental data, the study bridges the gap between individual, apparently stochastic behavior and its evolutionary purpose and consequences. The work further provides a testable and generalizable model framework to explore behavioral drift in other behaviors.

    2. Reviewer #1 (Public review):

      Summary:

      In "Drift in Individual Behavioral Phenotype as a Strategy for Unpredictable Worlds," Maloney et al. (2024) investigate changes in individual responses over time, referred to as behavioral drift within the lifespan of an animal. Drift, as defined in the paper, complements stable behavioral variation (animal individuality/personality within a lifetime) over shorter timeframes, which the authors associate with an underlying bet-hedging strategy. The third timeframe of behavioral variability that the authors discuss occurs within seasons (across several generations of some insects), termed "adaptive tracking." This division of "adaptive" behavioral variability over different timeframes is intuitively logical and adds valuable depth to the theoretical framework concerning the ecological role of individual behavioral differences in animals.

      Strengths:

      While the theoretical foundations of the study are strong, the connection between the experimental data (Figure 1) and the modeling work (Figures 2-4) is less convincing.

      Weaknesses:

      In the experimental data (Figure 1), the authors describe the changes in behavioral preferences over time. While generally plausible, I identify three significant issues with the experiments:

      (1) All of the subsequent theoretical/simulation data is based on changing environments, yet all the experiments are conducted in unchanging environments. While this may suffice to demonstrate the phenomenon of behavioral instability (drift) over time, it does not properly link to the theory-driven work in changing environments. An experiment conducted in a changing environment and its effects on behavioral drift would improve the manuscript's internal consistency and clarify some points related to (3) below.

      (2) The temporal aspect of behavioral instability. While the analysis demonstrates behavioral instability, the temporal dynamics remain unclear. It would be helpful for the authors to clarify (based on graphs and text) whether the behavioral changes occur randomly over time or follow a pattern (e.g., initially more right turns, then more left turns). A proper temporal analysis and clearer explanations are currently missing from the manuscript.

      (3) The temporal dimension leads directly into the third issue: distinguishing between drift and learning (e.g., line 56). In the neutral stimuli used in the experimental data, changes should either occur randomly (drift) or purposefully, as in a neutral environment, previous strategies do not yield a favorable outcome. For instance, the animal might initially employ strategy A, but if no improvement in the food situation occurs, it later adopts strategy B (learning). In changing environments, this distinction between drift and learning should be even more pronounced (e.g., if bananas are available, I prefer bananas; once they are gone, I either change my preference or face negative consequences). Alternatively, is my random choice of grapes the substrate for the learning process towards grapes in a changing environment? Further clarification is needed to resolve these potential conflicts.

    3. Reviewer #2 (Public review):

      Summary:

      This is an inspired study that merges the concept of individuality with evolutionary processes to uncover a new strategy that diversifies individual behavior that is also potentially evolutionarily adaptive.

      The authors use a time-resolved measurement of spontaneous, innate behavior, namely handedness or turn bias in individual, isogenic flies, across several genetic backgrounds.

      They find that an individual's behavior changes over time, or drifts. This has been observed before, but what is interesting here is that by looking at multiple genotypes, the authors find the amount of drift is consistent within genotype, i.e., genetically regulated, and thus not entirely stochastic. This is not in line with what is known about innate, spontaneous behaviors. Normally, fluctuations in behavior would be ascribed to a response to environmental noise. However, here, the authors go on to identify the pattern or rule that determines the rate of change of the behavior over time within individuals. Using modeling of behavior and environment in the context of evolutionarily important timeframes such as lifespan or reproductive age, they could show when drift is favored over bet-hedging and that there is an evolutionary purpose to behavioral drift. Namely, drift diversifies behaviors across individuals of the same genotype within the timescale of lifespan, so that the genotype's chance for expressing beneficial behavior is optimally matched with potential variation of environment experienced prior to reproduction. This ultimately increases the fitness of the genotype. Because they find that behavioral drift is genetically variable, they argue it can also evolve.

      Strengths:

      Unlike most studies of individuality, in this study, the authors consider the impact of individuality on evolution. This is enabled by the use of multiple natural genetic backgrounds and an appropriately large number of individuals to come to the conclusions presented in the study. I thought it was really creative to study how individual behavior evolves over multiple timescales. And indeed this approach yielded interesting and important insight into individuality. Unlike most studies so far, this one highlights that behavioral individuality is not a static property of an individual, but it dynamically changes. Also, placing these findings in the evolutionary context was beneficial. The conclusion that individual drift and bet-hedging are differently favored over different timescales is, I think, a significant and exciting finding.

      Overall, I think this study highlights how little we know about the fundamental, general concepts behind individuality and why behavioral individuality is an important trait. They also show that with simple but elegant behavioral experiments and appropriate modeling, we could uncover fundamental rules underlying the emergence of individual behavior. These rules may not at all be apparent using classical approaches to studying individuality, using individual variation within a single genotype or within a single timeframe.

      Weaknesses:

      I am unconvinced by the claim that serotonin neuron circuits regulate behavioral drift, especially because of its bidirectional effect and lack of relative results for other neuromodulators. Without testing other neuromodulators, it will remain unclear if serotonin intervention increases behavioral noise within individuals, or if any other pharmacological or genetic intervention would do the same. Another issue is that the amount of drugs that the individuals ingested was not tracked. Variable amounts can result in variable changes in behavior that are more consistent with the interpretation of environmental plasticity, rather than behavioral drift. With the current evidence presented, individual behavior may change upon serotonin perturbation, but this does not necessarily mean that it changes or regulates drift.

      However, I think for the scope of this study, finding out whether serotonin regulates drift or not is less important. I understand that today there is a strong push to find molecular and circuit mechanisms of any behavior, and other peers may have asked for such experiments, perhaps even simply out of habit. Fortunately, the main conclusions derived from behavioral data across multiple genetic backgrounds and the modeling are anyway novel, interesting, and in fact more fundamental than showing if it is serotonin that does it or not.

      To this point, one thing that was unclear from the methods section is whether the genotypes that were tested were raised in replicate vials and how replication was accounted for in the analyses. This is a crucial point - the conclusion that genotypes have different amounts of behavioral drift cannot be drawn without showing that the difference in behavioral drift does not stem from differences in developmental environment.

    4. Reviewer #3 (Public review):

      Summary:

      The paper begins by analyzing the drift in individual behavior over time. Specifically, it quantifies the circling direction of freely walking flies in an arena. The main takeaway from this dataset is that while flies exhibit an individual turning bias (when averaged over time), their preferences fluctuate over slow timescales.

      To understand whether genetic or neuromodulatory mechanisms influence the drift in individual preference, the authors test different fly strains concluding that both genetic background and the neuromodulator serotonin contribute to the degree of drift.

      Finally, the authors use theoretical approaches to identify the range of environmental conditions under which drift in individual bias supports population growth.

      Strengths:

      The model provides a clear prediction of the environmental fluctuations under which a drift in bias should be beneficial for population growth.

      The approach attempts to identify genetic and neurophysiological mechanisms underlying drift in bias.

      Weaknesses:

      Different behavioral assays are used and are differently analysed, with little discussion on how these behaviors and analyses compare to each other.

      Some of the model assumptions should be made more explicit to better understand which aspects of the behaviors are covered.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In "Drift in Individual Behavioral Phenotype as a Strategy for Unpredictable Worlds," Maloney et al. (2024) investigate changes in individual responses over time, referred to as behavioral drift within the lifespan of an animal. Drift, as defined in the paper, complements stable behavioral variation (animal individuality/personality within a lifetime) over shorter timeframes, which the authors associate with an underlying bet-hedging strategy. The third timeframe of behavioral variability that the authors discuss occurs within seasons (across several generations of some insects), termed "adaptive tracking." This division of "adaptive" behavioral variability over different timeframes is intuitively logical and adds valuable depth to the theoretical framework concerning the ecological role of individual behavioral differences in animals.

      Strengths:

      While the theoretical foundations of the study are strong, the connection between the experimental data (Figure 1) and the modeling work (Figures 2-4) is less convincing.

      Weaknesses:

      In the experimental data (Figure 1), the authors describe the changes in behavioral preferences over time. While generally plausible, I identify three significant issues with the experiments:

      (1) All of the subsequent theoretical/simulation data is based on changing environments, yet all the experiments are conducted in unchanging environments. While this may suffice to demonstrate the phenomenon of behavioral instability (drift) over time, it does not properly link to the theory-driven work in changing environments. An experiment conducted in a changing environment and its effects on behavioral drift would improve the manuscript's internal consistency and clarify some points related to (3) below.

      In our framework, we posit that the amount of drift has been shaped by evolution to maximize fitness in the environments that the population has experienced, and this drift is observed independent of environment. While we agree that exploring the role of changing environments on the measure of drift would be interesting, we would anticipate the effects may be nuanced and beyond the scope of the current paper (and the scope of our theoretical work, which assumes that the individual phenotype is unaffected by change of environment except as mediated by death due to fitness effects). For example, it would be difficult to differentiate drift from idiosyncratic differences in learning (Smith et al., 2022), and non-adaptive plasticity to unrelated cues has been posited as a method of producing diverse phenotypes (Maxwell and Magwene, 2017), so “learning” to uncorrelated stimuli could conceivably be a mechanism for drift. Given the scope of the current study, we prioritized eliminating potential confounds for measuring drift, but remain interested in the interaction between learning and drift.

      (2) The temporal aspect of behavioral instability. While the analysis demonstrates behavioral instability, the temporal dynamics remain unclear. It would be helpful for the authors to clarify (based on graphs and text) whether the behavioral changes occur randomly over time or follow a pattern (e.g., initially more right turns, then more left turns). A proper temporal analysis and clearer explanations are currently missing from the manuscript.

      We agree that it would be helpful to describe the dynamics over time in more detail, beyond the power spectrum and autoregressive model fits. We hope to address this in a revision.

      (3) The temporal dimension leads directly into the third issue: distinguishing between drift and learning (e.g., line 56). In the neutral stimuli used in the experimental data, changes should either occur randomly (drift) or purposefully, as in a neutral environment, previous strategies do not yield a favorable outcome. For instance, the animal might initially employ strategy A, but if no improvement in the food situation occurs, it later adopts strategy B (learning). In changing environments, this distinction between drift and learning should be even more pronounced (e.g., if bananas are available, I prefer bananas; once they are gone, I either change my preference or face negative consequences). Alternatively, is my random choice of grapes the substrate for the learning process towards grapes in a changing environment? Further clarification is needed to resolve these potential conflicts.

      As in our response to point 1, we believe this is a crucial distinction. In the revision, we intend to highlight it further in the discussion and to expand on how the two strategies may interact.

      Reviewer #2 (Public review):

      Summary:

      This is an inspired study that merges the concept of individuality with evolutionary processes to uncover a new strategy that diversifies individual behavior that is also potentially evolutionarily adaptive.

      The authors use a time-resolved measurement of spontaneous, innate behavior, namely handedness or turn bias in individual, isogenic flies, across several genetic backgrounds.

      They find that an individual's behavior changes over time, or drifts. This has been observed before, but what is interesting here is that by looking at multiple genotypes, the authors find the amount of drift is consistent within genotype, i.e., genetically regulated, and thus not entirely stochastic. This is not in line with what is known about innate, spontaneous behaviors. Normally, fluctuations in behavior would be ascribed to a response to environmental noise. However, here, the authors go on to identify the pattern or rule that determines the rate of change of the behavior over time within individuals. Using modeling of behavior and environment in the context of evolutionarily important timeframes such as lifespan or reproductive age, they could show when drift is favored over bet-hedging and that there is an evolutionary purpose to behavioral drift. Namely, drift diversifies behaviors across individuals of the same genotype within the timescale of lifespan, so that the genotype's chance for expressing beneficial behavior is optimally matched with potential variation of environment experienced prior to reproduction. This ultimately increases the fitness of the genotype. Because they find that behavioral drift is genetically variable, they argue it can also evolve.

      Strengths:

      Unlike most studies of individuality, in this study, the authors consider the impact of individuality on evolution. This is enabled by the use of multiple natural genetic backgrounds and an appropriately large number of individuals to come to the conclusions presented in the study. I thought it was really creative to study how individual behavior evolves over multiple timescales. And indeed this approach yielded interesting and important insight into individuality. Unlike most studies so far, this one highlights that behavioral individuality is not a static property of an individual, but it dynamically changes. Also, placing these findings in the evolutionary context was beneficial. The conclusion that individual drift and bet-hedging are differently favored over different timescales is, I think, a significant and exciting finding.

      Overall, I think this study highlights how little we know about the fundamental, general concepts behind individuality and why behavioral individuality is an important trait. They also show that with simple but elegant behavioral experiments and appropriate modeling, we could uncover fundamental rules underlying the emergence of individual behavior. These rules may not at all be apparent using classical approaches to studying individuality, using individual variation within a single genotype or within a single timeframe.

      Weaknesses:

      I am unconvinced by the claim that serotonin neuron circuits regulate behavioral drift, especially because of its bidirectional effect and lack of relative results for other neuromodulators. Without testing other neuromodulators, it will remain unclear if serotonin intervention increases behavioral noise within individuals, or if any other pharmacological or genetic intervention would do the same. Another issue is that the amount of drugs that the individuals ingested was not tracked. Variable amounts can result in variable changes in behavior that are more consistent with the interpretation of environmental plasticity, rather than behavioral drift. With the current evidence presented, individual behavior may change upon serotonin perturbation, but this does not necessarily mean that it changes or regulates drift.

      However, I think for the scope of this study, finding out whether serotonin regulates drift or not is less important. I understand that today there is a strong push to find molecular and circuit mechanisms of any behavior, and other peers may have asked for such experiments, perhaps even simply out of habit. Fortunately, the main conclusions derived from behavioral data across multiple genetic backgrounds and the modeling are anyway novel, interesting, and in fact more fundamental than showing if it is serotonin that does it or not.

      We agree that our data do not support a strong conclusion that serotonin plays a privileged role in regulating drift. Based on previous literature (e.g. Kain et al., 2014, where identical pharmacological manipulations had an effect on variability while dopaminergic and octopaminergic manipulations did not), we think it likely that the large global perturbations in serotonin that we observe influence plasticity that might be involved in drift (and thus find our results not particularly surprising). Nonetheless, we agree that the mechanism by which serotonin may affect drift could be indirect, and it is similarly plausible that many global perturbations could lead to some shift in the amount of drift. We intend to further discuss these issues in the revision.

      To this point, one thing that was unclear from the methods section is whether genotypes that were tested were raised in replicate vials and how was replication accounted for in the analyses. This is a crucial point - the conclusion that genotypes have different amounts of behavioral drift cannot be drawn without showing that the difference in behavioral drift does not stem from differences in developmental environment.

      While a cursory inspection suggests that batch effects between different replicates were small, we intend to clarify this and more explicitly address the effects of replicates in revision.

      Reviewer #3 (Public review):

      Summary:

      The paper begins by analyzing the drift in individual behavior over time. Specifically, it quantifies the circling direction of freely walking flies in an arena. The main takeaway from this dataset is that while flies exhibit an individual turning bias (when averaged over time), their preferences fluctuate over slow timescales.

      To understand whether genetic or neuromodulatory mechanisms influence the drift in individual preference, the authors test different fly strains concluding that both genetic background and the neuromodulator serotonin contribute to the degree of drift.

      Finally, the authors use theoretical approaches to identify the range of environmental conditions under which drift in individual bias supports population growth.

      Strengths:

      The model provides a clear prediction of the environmental fluctuations under which a drift in bias should be beneficial for population growth.

      The approach attempts to identify genetic and neurophysiological mechanisms underlying drift in bias.

      Weaknesses:

      Different behavioral assays are used and are differently analysed, with little discussion on how these behaviors and analyses compare to each other.

      We intend to address this in a revision of the discussion.

      Some of the model assumptions should be made more explicit to better understand which aspects of the behaviors are covered.

      We will further clarify the assumptions of the model in revision.

    1. eLife Assessment

      The results by Zhu et al. provide valuable insights into the representation of border ownership in area V1. They used Neuropixels recordings to demonstrate the clustering of border ownership, and compared cross-correlation functions between neurons in different layers to demonstrate that they depend on the type of stimulus. The strength of the evidence is solid but can be improved by performing additional analyses and accounting for the differences in classical and non-classical receptive field stimulation conditions.

    2. Reviewer #1 (Public review):

      Zhu and colleagues used high-density Neuropixels probes to perform laminar recordings in V1 while presenting either small stimuli that stimulated the classical receptive field (CRF) or large stimuli whose border straddled the RF to provide nonclassical RF (nCRF) stimulation. Their main question was to understand the relative contribution of feedforward (FF), feedback (FB), and horizontal circuits to border ownership (Bown), which they addressed by measuring cross-correlation across layers. They found differences in cross-correlation between feedback/horizontal (FH) and input layers during CRF and nCRF stimulation.

      Although the data looks high quality and analyses look mostly fine, I had a lot of difficulty understanding the logic in many places. Examples of my concerns are written below.

      (1) What is the main question? The authors refer to nCRF stimulation emerging from either feedback from higher areas or horizontal connections from within the same area (e.g. lines 136 to 138 and again lines 223-232). I initially thought that the study would aim to distinguish between the two. However, the way the authors have clubbed the layers in 3D, the main question seems to be whether Bown is FF or FH (i.e., feedback and horizontal are clubbed). Is this correct? If so, I don't see the logic, since I can't imagine Bown to be purely FF. Thus, just showing differences between CRF stimulation (which is mainly expected to be FF) and nCRF stimulation is not surprising to me.

      (2) Choice of layers for cross-correlation analysis: In the Introduction, and also in Figure 3C, it is mentioned that FF inputs arrive in 4C and 6, while FB/Horizontal inputs arrive at "superficial" and "deep", which I take as layer 2/3 and 5. So it is not clear to me why (i) layer 4A/B is chosen for analysis for Figure 3D (I would have thought layer 6 should have been chosen instead) and (ii) why Layers 5 and 6 are clubbed.

      (3) Addressing the main question using cross-correlation analysis: I think the nice peaks observed in Figure 3B for some pairs show how spiking in one neuron affects the spiking in another one, with the delay in cross-correlation function arising from the conduction delay. This is shown nicely during CRF stimulation in Figure 3D between 4C -> 2/3, for example. However, the delay (positive or negative) is constrained by anatomical connectivity. For example, unless there are projections from 2/3 back to 4C which causes firing in a 2/3 layer neuron to cause a spike in a layer 4 neuron, we cannot expect to get a negative delay no matter what kind of stimulation (CRF versus nCRF) is used.
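      The link between conduction delay and the position of the cross-correlogram peak in point (3) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' analysis pipeline: the 1 ms bins, the firing probabilities, the 3 ms delay, and the helper name `cross_correlogram` are hypothetical, and no jitter correction is applied.

```python
import numpy as np

def cross_correlogram(pre, post, max_lag):
    """Count coincident spikes at each lag (post bin minus pre bin)."""
    lags = np.arange(-max_lag, max_lag + 1)
    counts = np.zeros(lags.size, dtype=int)
    n = pre.size
    for i, lag in enumerate(lags):
        if lag >= 0:
            # post spike occurs `lag` bins after the pre spike
            counts[i] = np.sum(pre[: n - lag] & post[lag:])
        else:
            # post spike occurs `-lag` bins before the pre spike
            counts[i] = np.sum(pre[-lag:] & post[: n + lag])
    return lags, counts

rng = np.random.default_rng(0)
n_bins, delay = 200_000, 3            # hypothetical: 1 ms bins, 3 ms delay

# Binary spike trains: a "source" neuron and a "target" neuron that it drives.
pre = rng.random(n_bins) < 0.02
driven = np.roll(pre, delay) & (rng.random(n_bins) < 0.3)
driven[:delay] = False                # discard wrap-around from np.roll
post = driven | (rng.random(n_bins) < 0.01)   # add background firing

lags, counts = cross_correlogram(pre, post, max_lag=20)
peak_lag = int(lags[np.argmax(counts)])

# Swapping the roles of the two neurons mirrors the correlogram.
_, counts_rev = cross_correlogram(post, pre, max_lag=20)
peak_lag_rev = int(lags[np.argmax(counts_rev)])
```

      In this toy setting the peak sits at a positive lag only because the simulated connection runs from the first neuron to the second. A stimulus change cannot move the peak to a negative lag unless a connection in the opposite direction exists, which is the anatomical constraint raised above.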

    3. Reviewer #2 (Public review):

      Summary:

      The authors present a study of how modulatory activity from outside the classical receptive field (cRF) differs from cRF stimulation. They study neural activity across the different layers of V1 in two anesthetized monkeys using Neuropixels probes. The monkeys are presented with drifting gratings and border-ownership tuning stimuli. They find that border-ownership tuning is organized into columns within V1, which is unexpected and exciting, and that the flow of activity from cell-to-cell (as judged by cross-correlograms between single units) is influenced by the type of visual stimulus: border-ownership tuning stimuli vs. drifting-grating stimuli.

      Strengths:

      The questions addressed by the study are of high interest, and the use of Neuropixels probes yields extremely high numbers of single-units and cross-correlation histograms (CCHs) which makes the results robust. The study is well-described.

      Weaknesses:

      The weaknesses of the study are (a) the use of anesthetized animals, which raises questions about the nature of the modulatory signal being measured and the underlying logic of why a change in visual stimulus would produce a reversal in information flow through the cortical microcircuit and (b) the choice of visual stimuli, which do not uniquely isolate feedforward from feedback influences.

      (1) The modulation latency seems quite short in Figure 2C. Have the authors measured the latency of the effect in the manuscript and how it compares to the onset of the visually driven response? It would be surprising if the latency was much shorter than 70ms given previous measurements of BO and figure-ground modulation latency in V2 and V1. On the same note, it might be revealing to make laminar profiles of the modulation (i.e. preferred - non-preferred border orientation) as it develops over time. Does the modulation start in feedback recipient layers?

      (2) Can the authors show the average time course of the response elicited by preferred and non-preferred border ownership stimuli across all significant neurons?

      (3) The logic of assuming that cRF stimulation should produce the opposite signal flow to border-ownership tuning stimuli is worth discussing. I suspect the key difference between stimuli is that they used drifting gratings as the cRF stimulus: the movement of the stimulus continually refreshes the retinal image, leading to continuous feedforward dominance of the signals in V1. Had they used a static grating, the spiking during the sustained portion of the response might also show more influence of feedback/horizontal connections. Do the initial spikes fired in response to the border-ownership tuning stimuli show the feedforward pattern of responses? The authors state that they did not look at cross-correlations during the initial response, but if they do, do they see the feedforward-dominated pattern? The jitter CCH analysis might suffice in correcting for the response transient.

      (4) The term "nCRF stimulation" is not appropriate because the CRF is stimulated by the light/dark edge.

    4. Reviewer #3 (Public review):

      Summary:

      The paper by Zhu et al is on an important topic in visual neuroscience, the emergence in the visual cortex of signals about figures and ground. This topic also goes by the name border ownership. The paper utilizes modern recording techniques very skillfully to extend what is known about border ownership. It offers new evidence about the prevalence of border ownership signals across different cortical layers in V1 cortex. Also, it uses pairwise cross-correlation to study signal flow under different conditions of visual stimulation that include the border ownership paradigm.

      Strengths:

      The paper's strengths are its use of multi-electrode probes to study border ownership in many neurons simultaneously across the cortical layers in V1, and its innovation of using cross-correlation between cortical neurons -- when they are viewing border-ownership patterns or instead are viewing grating patterns restricted to the classical receptive field (CRF).

      Weaknesses:

      The paper's weaknesses are its largely incremental approach to the study of border ownership and the lack of a critical analysis of the cross-correlation data. The paper as it is now does not advance our understanding of border ownership; it mainly confirms prior work, and it does not challenge or revise consensus beliefs about mechanisms. However, it is possible that, in the rich dataset the authors have obtained, they do possess data that could be added to the paper to make it much stronger.

      Critique:

      The border ownership data on V1 offered in the paper replicates experimental results obtained by Zhou and von der Heydt (2000) and confirms the earlier results using the same analysis methods as Zhou. The incremental addition is that the authors found border ownership in all cortical layers, extending Zhou's results, which were only about layer 2/3.

      The cross-correlation results show that the pattern of the cross-correlogram (CCG) is influenced by the visual pattern being presented. However, the results are not analyzed mechanistically, and the interpretation is unclear. For instance, the authors show in Figure 3 (and in Figure S2) that the peak of the CCG can indicate layer 2/3 excites layer 4C when the visual stimulus is the border ownership test pattern, a large square 8 deg on a side. But how can layer 2/3 excite layer 4C? The authors do not raise or offer an answer to this question. Similar questions arise when considering the CCG of layer 4A/B with layer 2/3. What is the proposed pathway for layer 2/3 to excite 4A/B? Other similar questions arise for all the interlaminar CCG data that are presented. What known functional connections would account for the measured CCGs?

      The problems in understanding the CCG data are indirectly caused by the lack of a critical analysis of what is happening in the responses that reveal the border ownership signals, as in Figure 2. Let's put it bluntly - are border ownership signals excitatory or inhibitory? The reason I raise this question is that the present authors insightfully place border ownership as examples of the action of the non-classical receptive field (nCRF) of cortical cells. Most previous work on the nCRF (many papers cited by the authors) reveal the nCRF to be inhibitory or suppressive. In order to know whether nCRF signals are excitatory or inhibitory, one needs a baseline response from the CRF, so that when you introduce nCRF signals you can tell whether the change with respect to the CRF is up or down. As far as I know, prior work on border ownership has not addressed this question, and the present paper doesn't either. This is where the rich dataset that the present authors possess might be used to establish a fundamental property of border ownership.

      Then we must go back to consider what knowing the sign of the border ownership signal would mean for interpreting the CCG data. If the border ownership signals from extrastriate feedback or, alternatively, from horizontal intrinsic connections, are excitatory, they might provide a shared excitatory input to pairs of cells that would show up in the CCG as a peak at 0 delay. However, if the border ownership signals are inhibitory, they might work by exciting only inhibitory neurons in V1. This could have complicated consequences for the CCG. The interpretation of the CCG data in the present version of the manuscript is unclear (see above). Perhaps a clearer interpretation could be developed once the authors know better what the border ownership signals are.

      My critique of the CCG analysis applies to Figure 5 also. I cannot comprehend the point of showing a very weak correlation of CCG asymmetry with Border Ownership Index, especially when what CCG asymmetry means is unclear mechanistically. Figure 5 does not make the paper stronger in my opinion.

      In Figure 3, the authors show two CCGs that involve 4C--4C pairs. It would be nice to know more about such pairs. If there are any 6--6 pairs, what they look like also would be interesting. The authors also show in Figure 3 CCGs of two 4C--4A/B pairs, and it would be quite interesting to know how such CCGs behave when CRF and nCRF stimuli are compared. In other words, the authors have shown us they have a great deal of data but have chosen not to analyze them further or to explain why they chose not to analyze them. It might help the paper if the authors would present all the CCG types they have. This suggestion would be helpful when the authors know more about the sign of border ownership signals, as discussed at length above.

    1. eLife Assessment

      This study provides valuable insights into the differential impact of intrinsic and synaptic conductances on circuit robustness, emphasizing intrinsic plasticity as a crucial but often overlooked factor in neural dynamics. Although the findings are solid and underscore the significance of intrinsic factors, they are limited by the simplified model and the potential confounding effects of drastic intrinsic perturbations on single-neuron activity. Further refinements would help validate the generality of these conclusions across diverse networks and functions.

    2. Reviewer #1 (Public review):

      The paper by Fournier et al. investigates the sensitivity of neural circuits to changes in intrinsic and synaptic conductances. The authors use models of the stomatogastric ganglion (STG) to compare how perturbations to intrinsic and synaptic parameters impact network robustness. Their main finding is that changes to intrinsic conductances tend to have a larger impact on network function than changes to synaptic conductances, suggesting that intrinsic parameters are more critical for maintaining circuit function.

      The paper is well-written and the results are compelling, but I have several concerns that need to be addressed to strengthen the manuscript. Specifically, I have two main concerns:
      (1) It is not clear from the paper what the mechanism is that leads to the importance of intrinsic parameters over synaptic parameters.
      (2) It is not clear how general the result is, both within the framework of the STG network and its function, and across other functions and networks. This is crucial, as the title of the paper appears very general.

      I believe these two elements are missing in the current manuscript, and addressing them would significantly strengthen the conclusions. Without a clear understanding of the mechanism, it is difficult to determine whether the results are merely anecdotal or if they depend on specific details such as how the network is trained, the particular function being studied, or the circuit itself. Additionally, understanding how general the findings are is vital, especially since the authors claim in the title that "Circuit function is more robust to changes in synaptic than intrinsic conductances," which suggests a broad applicability.

      I do not wish to discourage the authors from their interesting result, but the more we understand the mechanism and the generality of the findings, the more insightful the result will be for the neuroscience community.

      Major comments

      (1) Mechanism
      While the authors did a nice job of describing their results, they did not provide any mechanism for why synaptic parameters are more resilient to changes than intrinsic parameters. For example, from Figure 5, it seems that there is mainly a shift in the sensitivity curves. What is the source of this shift? Can something be changed in the network, the training, or the function to control it? This is just one possible way to investigate the mechanism, which is lacking in the paper.

      (2) Generality of the results within the framework of the STG circuit
      (a) The authors did show that their results extend to multiple networks with different parameters (the 100 networks). However, I am still concerned about the generality of the results with respect to the way the models were trained. Could it be that something in the training procedure makes the synaptic parameters more robust than intrinsic parameters? For example, the fact that duty cycle error is weighted as it is in the cost function (large beta) could potentially affect the parameters that are more important for yielding low error on the duty cycle.
      (b) Related to (a), I can think of a training scheme that could potentially improve the resilience of the network to perturbations in the intrinsic parameters rather than the synaptic parameters. For example, in machine learning, methods like dropout can be used to make the network find solutions that are robust to changes in parameters. Thus, in principle, the results could change if the training procedure for fitting the models were different, or by using a different optimization algorithm. It would be helpful to at least mention this limitation in the discussion.

      (3) Generality of the function
      The authors test their hypothesis based on the specific function of the STG. It would be valuable to see if their results generalize to other functions as well. For example, the authors could generate non-oscillatory activity in the STG circuit, or choose a different, artificial function, maybe with different duty cycles or network cycles. It could be that this is beyond the scope of this paper, but it would be very interesting to characterize which functions are more resilient to changes in synapses, rather than intrinsic parameters. In other words, the authors might consider testing their hypothesis on at least another 'function' and also discussing the generality of their results to other functions in the discussion.

      (4) Generality of the circuit
      The authors have studied the STG for many years and are pioneers in their approach, demonstrating that there is redundancy even in this simple circuit. This approach is insightful, but it is important to show that similar conclusions also hold for more general network architectures, and if not, why. In other words, it is not clear if their claim generalizes to other network architectures, particularly larger networks. For example, one might expect that the number of parameters (synaptic vs intrinsic) might play a role in how resilient the function is with respect to changes in the two sets of parameters. In larger models, the number of synaptic parameters grows as the square of the number of neurons, while the number of intrinsic parameters increases only linearly with the number of neurons. Could that affect the authors' conclusions when we examine larger models?

      In addition, how do the authors' conclusions depend on the "complexity" of the non-linear equations governing the intrinsic parameters? Would the same conclusions hold if the intrinsic parameters only consisted of fewer intrinsic parameters or simplified ion channels? All of these are interesting questions that the authors should at least address in the discussion.
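      The scaling argument in point (4) can be made concrete with a short count. This is only a sketch under stated assumptions: all-to-all directed connectivity without autapses, one maximal conductance per synapse, and a hypothetical 8 intrinsic conductances per neuron. None of these numbers are taken from the manuscript.

```python
def parameter_counts(n_neurons, conductances_per_neuron=8):
    """Return (synaptic, intrinsic) parameter counts for a fully connected net.

    Assumes one parameter per directed synapse (no autapses) and a fixed,
    hypothetical number of intrinsic conductances per neuron.
    """
    synaptic = n_neurons * (n_neurons - 1)            # grows as N^2
    intrinsic = n_neurons * conductances_per_neuron   # grows as N
    return synaptic, intrinsic

# Compare a small circuit with progressively larger networks.
for n in (3, 10, 100):
    syn, intr = parameter_counts(n)
    print(f"N={n:>3}: synaptic={syn:>5}, intrinsic={intr:>4}, ratio={syn / intr:.2f}")
```

      Under these assumptions, intrinsic parameters outnumber synaptic ones in a three-neuron circuit, while the balance reverses once the network exceeds nine neurons. This is one reason the relative robustness could differ in larger models.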

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript presents an important exploration of how intrinsic and synaptic conductances affect the robustness of neural circuits. This is a well-deserved question, and overall, the manuscript is written well and has a logical progression.

      The focus on intrinsic plasticity as a potentially overlooked factor in network dynamics is valuable. However, while the stomatogastric ganglion (STG) serves as a well-characterized and valuable model for studying network dynamics, its simplified structure and specific dynamics limit the generalizability of these findings to more complex systems, such as mammalian cortical microcircuits.

      Strengths:

      Clean and simple model. Simulations are carefully carried out and parameter space is searched exhaustively.

      Weaknesses:

      (1) Scope and Generalizability:
      The study's emphasis on intrinsic conductance is timely, but with its minimalistic and unique dynamics, the STG model poses challenges when attempting to generalize findings to other neural systems. This raises questions regarding the applicability of the results to more complex circuits, especially those found in mammalian brains and those where the dynamics are not necessarily oscillating. This is even more so (as the authors mention) because synaptic conductances in this study are inhibitory, and changes to their synaptic conductances are limited (as the driving force for the current is relatively low).

      (2) Challenges in Comparison:
      A significant challenge in the study is the comparison method used to evaluate the robustness of intrinsic versus synaptic perturbations. Perturbations to intrinsic conductances often drastically affect individual neurons' dynamics, as seen in Figure 1, where such changes result in single spikes or even the absence of spikes instead of the expected bursting behavior. This affects the input to downstream neurons, leading to circuit breakdowns. For a fair comparison, it would be essential to constrain the intrinsic perturbations so that each neuron remains within a particular functional range (e.g., maintaining a set number of spikes). This could be done by setting minimal behavioral criteria for neurons and testing how different perturbation limits impact circuit function.

      (3) Comparative Metrics for Perturbation:
      Another notable issue lies in the evaluation metrics for intrinsic and synaptic perturbations. Synaptic perturbations are straightforward to quantify in terms of conductance, but intrinsic perturbations involve more complexity, as changes in maximal conductance result in variable, nonlinear effects depending on the gating states of ion channels. Furthermore, synaptic perturbations focus on individual conductances, while intrinsic perturbations involve multiple conductance changes simultaneously. To improve fairness in comparison, the authors could, for example, adjust the x-axis to reflect actual changes in conductance or scale the data post hoc based on the real impact of each perturbation on conductance. For example, in Figure 6, the scale of the panels for the intrinsic conductances (e.g., g_na-bar) is 500 times larger than that of the synaptic conductances (a row below), but the maximal sodium conductance is reached perhaps only for a brief moment during each spike, and then most of the time it is close to null. Moreover, changing the sodium conductance over the range of 0-250 for such a nonlinear current is, in many ways, unthinkable. Did you ever measure two neurons with such a difference in sodium conductance? So, how can we tell that the ranges of the perturbations make a meaningful comparison?
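      The gap between maximal and effective sodium conductance raised in point (3) can be sketched numerically. The example below uses the classic Hodgkin-Huxley squid-axon steady-state gating functions as a stand-in. This is an assumption for illustration only: the STG model in the manuscript uses different channel kinetics, and the resting potential of -65 mV and the value of 250 for the maximal conductance are hypothetical.

```python
import math

def m_inf(v):
    """Steady-state Na+ activation (classic Hodgkin-Huxley rates, v in mV).

    Note: the alpha expression is singular at v = -40 mV exactly.
    """
    alpha = 0.1 * (v + 40.0) / (1.0 - math.exp(-(v + 40.0) / 10.0))
    beta = 4.0 * math.exp(-(v + 65.0) / 18.0)
    return alpha / (alpha + beta)

def h_inf(v):
    """Steady-state Na+ inactivation (classic Hodgkin-Huxley rates, v in mV)."""
    alpha = 0.07 * math.exp(-(v + 65.0) / 20.0)
    beta = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    return alpha / (alpha + beta)

def open_fraction(v):
    """Fraction of the maximal Na+ conductance that is open: m^3 * h."""
    return m_inf(v) ** 3 * h_inf(v)

g_na_bar = 250.0                 # hypothetical maximal conductance (arbitrary units)
v_rest = -65.0                   # hypothetical resting potential in mV
effective_g = g_na_bar * open_fraction(v_rest)
print(f"open fraction at rest: {open_fraction(v_rest):.2e}")
print(f"effective conductance at rest: {effective_g:.3f}")
```

      With these textbook kinetics, only a tiny fraction of the maximal conductance is open at rest. A large change in the maximal value therefore translates into a much smaller change in the conductance actually expressed for most of the cycle, which is why rescaling the perturbation axis by effective conductance might give a fairer comparison.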

    1. eLife Assessment

      This important study highlights a critical challenge to a great many studies of the neural correlates of consciousness that were based on post hoc sorting of reported awareness experience. The evidence supporting this criticism is convincing, based on simulations and decoding analysis of EEG data. The results will be of interest not only to psychologists and neuroscientists but also to philosophers who work on addressing mind-body relationships.

    2. Reviewer #1 (Public review):

      Summary:

      The paper proposes that the placement of criteria for determining whether a stimulus is 'seen' or 'unseen' can significantly impact the validity of neural measures of consciousness. The authors found that conservative criteria, which require stronger evidence to classify a stimulus as 'seen,' tend to inflate effect sizes in neural measures, making conscious processing appear more pronounced than it is. Conversely, liberal criteria, which require less evidence, reduce these effect sizes, potentially underestimating conscious processing. This variability in effect sizes due to criterion placement can lead to misleading conclusions about the nature of conscious and unconscious processing.

      Furthermore, the study highlights that the Perceptual Awareness Scale (PAS), a commonly used tool in consciousness research, does not effectively mitigate these criterion-related confounds. This means that even with PAS, the validity of neural measures can still be compromised by how criteria are set. The authors emphasize the need for careful consideration and standardization of criterion placement in experimental designs to ensure that neural measures accurately reflect the underlying cognitive processes. By addressing this issue, the paper aims to improve the reliability and validity of findings in the field of consciousness research.
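      The effect of criterion placement on post hoc sorted trials can be illustrated with textbook equal-variance signal detection theory. The sketch below is not the authors' simulation. It assumes that internal evidence on target trials is Gaussian with mean d' and unit variance, that a trial is reported "seen" when the evidence exceeds the criterion, and that the mean evidence in each sorted bin stands in for a neural measure. The values of d' and of the two criteria are hypothetical.

```python
import math

def pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sorted_target_means(d_prime, criterion):
    """Hit rate and mean evidence of target trials sorted as seen or unseen.

    Uses the closed-form means of a truncated normal distribution.
    """
    z = criterion - d_prime
    hit_rate = 1.0 - cdf(z)
    mean_seen = d_prime + pdf(z) / hit_rate      # E[x | x > criterion]
    mean_unseen = d_prime - pdf(z) / cdf(z)      # E[x | x < criterion]
    return hit_rate, mean_seen, mean_unseen

d_prime = 1.0                                    # hypothetical sensitivity
liberal = sorted_target_means(d_prime, criterion=0.0)
conservative = sorted_target_means(d_prime, criterion=1.5)

for name, (hit, seen, unseen) in (("liberal", liberal), ("conservative", conservative)):
    print(f"{name:>12}: hit rate={hit:.2f}, seen mean={seen:.2f}, unseen mean={unseen:.2f}")
```

      Sensitivity is identical in both conditions, yet moving the criterion upward raises the mean evidence in both the "seen" and the "unseen" bins while lowering the hit rate. Under these assumptions, a conservative criterion makes both sorted categories look more strongly driven by the stimulus, without any change in the underlying processing.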

      Strengths:

      (1) This research provides a fresh perspective on how criterion placement can significantly impact the validity of neural measures in consciousness research.

      (2) The study employs robust simulations and EEG experiments to demonstrate the effects of criterion placement, ensuring that the findings are well-supported by empirical evidence.

      (3) By highlighting the limitations of the PAS and the impact of criterion placement, the study offers practical recommendations for improving experimental designs in consciousness research.

      Weaknesses:

      The study focuses primarily on the PAS, which is a commonly used tool, but other measures of consciousness were not evaluated and might be subject to similar or different criterion limitations. A simulation could be applied to these metrics to show how generalizable the conclusion of the study is.

    3. Reviewer #2 (Public review):

      Summary:

      The study investigates the potential influence of the response criterion on neural decoding accuracy for conscious and unconscious processing, using both simulated data and a reanalysis of experimental data with post hoc sorting.

      Strengths:

      When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated the subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.

      Weaknesses:

      (1) The response criterion plays a crucial role in influencing neural decoding because a subject's report may not always align with the actual stimulus presented. This discrepancy can occur in cases of false alarms, where a subject reports seeing a target that was not actually there, or in cases where a target is present but not reported. Some may argue that only using data from consistent trials (those with correct responses) would not be affected by the response criterion. However, the authors' analysis suggests that a conservative response criterion not only reduces false alarms but also impacts hit rates. It is important for the authors to further investigate how the response criterion affects neural decoding even when considering only correct trials.

      (2) The authors have used decoding of target vs. nontarget as the neural measure of unconscious and/or conscious processing. However, it is important to note that this is just one of the many neural measures used in the field. There are an increasing number of studies that focus on decoding the conscious content, such as target location or target category. If the authors were to include results on decoding target orientation and how it may be influenced by response criterion, the field would greatly benefit from this paper.

    4. Reviewer #3 (Public review):

      Summary:

      Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Participants identified instances of "seen" or "unseen" in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participants' reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.
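
      The criterion effect summarized above can be illustrated with a minimal equal-variance Gaussian signal detection sketch. This is not the authors' simulation: the function name `sdt_summary`, the unit-variance assumption, and the example values of d' and criterion are all illustrative assumptions. It shows that moving the criterion in the conservative direction lowers both hit and false alarm rates while raising the average internal signal on target trials sorted as "unseen" and as "seen".

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def sdt_summary(d_prime, criterion):
    """Equal-variance Gaussian SDT summary for a single criterion.

    Assumed model: noise trials ~ N(0, 1), target trials ~ N(d_prime, 1);
    a trial is reported "seen" when the internal signal exceeds `criterion`.
    """
    z = criterion - d_prime  # criterion expressed relative to the target mean
    hit_rate = 1.0 - STD_NORMAL.cdf(z)
    false_alarm_rate = 1.0 - STD_NORMAL.cdf(criterion)
    # Truncated-normal means: average signal on target trials that are
    # sorted post hoc as "unseen" (below criterion) or "seen" (above it).
    mean_unseen = d_prime - STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)
    mean_seen = d_prime + STD_NORMAL.pdf(z) / (1.0 - STD_NORMAL.cdf(z))
    return {
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
        "mean_unseen": mean_unseen,
        "mean_seen": mean_seen,
    }


# Illustrative values only: same sensitivity, two criterion placements.
liberal = sdt_summary(d_prime=1.0, criterion=0.0)
conservative = sdt_summary(d_prime=1.0, criterion=1.0)
```

      Under these assumptions, the "unseen" target trials carry more residual signal with the conservative criterion than with the liberal one, which is one way to see why post hoc sorting can inflate apparent unconscious effects.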

      Strengths and Weaknesses:

      One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.

      A potential area for improvement in this study is the use of single time-points from peak decoding accuracy to generate current source density topography maps. While we recognize that the decoding analysis employed here differs from traditional ERP approaches, the robustness of the findings could be enhanced by exploring current source density over relevant time windows. Event-related peaks, both in terms of timing and amplitude, can sometimes be influenced by noise or variability in trial-averaged EEG data, and a time-window analysis might provide a more comprehensive and stable representation of the underlying neural dynamics.

      It is helpful that the authors show the standard error of the mean for the classifier performance over time. A similar indication of a measure of variance in other figures could improve clarity and transparency. That said, the paper appears solid regarding technical issues overall. The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.

      Impact of the Work:

      This study effectively demonstrates a phenomenon that has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.

  2. Nov 2024
    1. eLife Assessment

      In this important study, Li et al. identify estrogen receptor 1-expressing neurons (ESR1+) in Barrington's nucleus as key regulators coordinating both bladder contraction and the relaxation of the external urethral sphincter. Using appropriate and validated methodologies aligned with the current state of the art, the data are convincing and of generally high quality.

    2. Reviewer #1 (Public review):

      Summary:

      Urination requires precise coordination between the bladder and external urethral sphincter (EUS), while the neural substrates controlling this coordination remain poorly understood. In this study, Li et al. identify estrogen receptor 1-expressing neurons (ESR1+) in Barrington's nucleus as key regulators that faithfully initiate or suspend urination. Results from peripheral nerve lesions suggest that BarEsr1 neurons play independent roles in controlling bladder contraction and relaxation of the EUS. Finally, the authors performed region-specific retrograde tracing, claiming that distinct populations of BarEsr1 neurons target specific spinal nuclei involved in regulating the bladder and EUS, respectively.

      Strength:

      Overall, the work is of high quality. The authors integrate several cutting-edge technologies and sophisticated, thorough analyses, including opto-tagged single unit recordings, combined optogenetics, and urodynamics, particularly those following distinct peripheral nerve lesions.

      Weakness:

      (1) My major concern is the novelty of this study. Keller et al. 2018 have shown that BarEsr1 neurons are active during urination and play an essential role in relaxing the external urethral sphincter (EUS). Minimally, substantial content that merely confirms previous findings (e.g. Figures 1A-E; Figures 3A-E) should be moved to the supplementary datasets.

      (2) I also have concerns regarding the results showing that the inactivation of BarEsr1 neurons led to the cessation of EUS muscle firing (Figures 2G and S5C). As shown in the cartoon illustration of Figure 8, spinal projections of BarEsr1 neurons contact interneurons (presumably inhibitory) that innervate motor neurons, which in turn excite the EUS. I would therefore expect that the inactivation of BarEsr1 should shift the EUS firing pattern from phasic (as relaxation) to tonic (removal of relaxation), rather than stopping their firing entirely. Could the authors comment on this and provide potential reasons or mechanisms for this finding?

      (3) Current evidence is insufficient to support the claim that the majority of BarEsr1 neurons innervate the SPN but not DGC. The current spinal images are uninformative, as the fluorescence reflects the distribution of Esr1- or Crh-expressing neurons in the spinal cord, along with descending BarEsr1 or BarCrh axons. Given the close anatomical proximity of these two nuclei, a more thorough histological analysis is required to demonstrate that the spinal injections were accurately confined to either the SPN or the DGC.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have performed a rigorous study to assess the role of ESR1+ neurons in the PMC to control the coordination of bladder and sphincter muscles during urination. This is an important extension of previous work defining the role of these brainstem neurons, and convincingly adds to the understanding of their role as master regulators of urination. This is a thorough, well-done study that clarifies how the Pontine micturition center coordinates different muscle groups for efficient urination, but there are some questions and considerations that remain.

      Strengths:

      These data are thorough and convincing in showing that ESR1+ PMC neurons exert coordinated control over both the bladder and sphincter activity, which is essential for efficient urination. The anatomical distinctions in pelvic versus pudendal control are clear, and it's an advance to understand how this coordination occurs. This work offers a clearer picture of how micturition is driven.

      Weaknesses:

      The dynamics of how this population of ESR1+ neurons is engaged in natural urination events remain unclear. Not all ESR1+ neurons are always engaged, and it is not measured whether this is simply variation in population activity, or if more neurons are engaged during more intense starting bladder pressures, for instance. In particular, the response dynamics of single and doubly-projecting neurons are not defined. Additionally, the model for how these neurons coordinate with CRH+ neuron activity in the PMC is not addressed, although these cell types seem to be engaged at the same time. Lastly, it would be interesting to know how sensory input can modulate the activity of these neurons, but this is perhaps a future direction.

    4. Reviewer #3 (Public review):

      Summary:

      The paper by Li et al explored the role of Estrogen receptor 1 (Esr1) expressing neurons in the pontine micturition center (PMC), a brainstem region also known as Barrington's nucleus (Hou et al 2016, Keller et al 2018). First, the authors conducted bulk Ca2+ imaging/unit recording from PMCESR1 to investigate the correlations of PMCESR1 neural activity to voiding behavior in conscious mice and bladder pressure/external urethral muscle activity in urethane anesthetized mice. Next, the authors conducted optogenetic inactivation/activation of PMCESR1 to confirm the contribution to the voiding behavior, and also conducted peripheral nerve transection together with optogenetic activation to confirm the independent control of bladder pressure and urethral sphincter muscle.

      Weaknesses:

      (1) The study demonstrates that pelvic nerve transection reduces urinary volume triggered by PMCESR1+ cell photoactivation in freely moving mice. Could the role of pudendal nerve transection also be examined in awake mice to provide a more comprehensive understanding of neural involvement?

      (2) While the paper primarily focuses on PMCESR1+ cells in bladder-sphincter coordination, the analysis of PMCESR1+-DGC/SPN neural circuits - given their distinct anatomical projections in the sacral spinal cord - feels underexplored. How do these circuits influence bladder and sphincter function when activated or inhibited? Also, do you have any tracing data to confirm whether bladder-sphincter innervation comes from distinct spinal nuclei?

      (3) Although the paper successfully identifies the physiological role of PMCESR1+ cells in bladder-sphincter coordination, the study falls short in examining the electrophysiological properties of PMCESR1+-DGC/SPN cells. A deeper investigation here would strengthen the findings.

      (4) The parameters for photoactivation (blue light pulses delivered at 25 Hz for 15 ms, every 30 s) and photoinhibition (pulses at 50 Hz for 20 ms) vary. What drove the selection of these specific parameters? Moreover, for photoactivation experiments, the change in pressure (ΔP = P5 sec - P0 sec) is calculated differently from photoinhibition (Δpressure = Ppeak - Pmin). Can you clarify the reasoning behind these differing approaches?

      (5) The discussion could further emphasize how PMCESR1+ cells coordinate bladder contraction and sphincter relaxation to control urination, highlighting their central role in the initiation and suspension of this process.

      (6) In Figure 8, the authors analyze the temporal sequence of bladder pressure and EUS bursting during natural voiding and PMC activation-induced voiding. It would be acceptable to consider the existence of a lower spinal reflex circuit; however, the interpretation of the data contains speculation. It is hard to say that bladder pressure measurement reflects efferent pelvic nerve activity in real time. (As a biological system, bladder contraction is mediated by smooth muscle, and does not reflect real-time efferent pelvic nerve activity. As an experimental set-up, bladder pressure measurement has some delays to reflect bladder pressure because of tubing, but EUS bursting has no delay.) Especially for the inactivation experiment, these factors would contribute to the interpretation of data. This reviewer recommends a rewrite of the section considering these limitations. Most of the section is suitable for the results.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      Urination requires precise coordination between the bladder and external urethral sphincter (EUS), while the neural substrates controlling this coordination remain poorly understood. In this study, Li et al. identify estrogen receptor 1-expressing neurons (ESR1+) in Barrington's nucleus as key regulators that faithfully initiate or suspend urination. Results from peripheral nerve lesions suggest that BarEsr1 neurons play independent roles in controlling bladder contraction and relaxation of the EUS. Finally, the authors performed region-specific retrograde tracing, claiming that distinct populations of BarEsr1 neurons target specific spinal nuclei involved in regulating the bladder and EUS, respectively.

      Strength:

      Overall, the work is of high quality. The authors integrate several cutting-edge technologies and sophisticated, thorough analyses, including opto-tagged single unit recordings, combined optogenetics, and urodynamics, particularly those following distinct peripheral nerve lesions.

      Weakness:

      (1) My major concern is the novelty of this study. Keller et al. 2018 have shown that BarEsr1 neurons are active during urination and play an essential role in relaxing the external urethral sphincter (EUS). Minimally, substantial content that merely confirms previous findings (e.g. Figures 1A-E; Figures 3A-E) should be moved to the supplementary datasets.

      Indeed, we are aware of and have carefully studied the work of Keller et al. Our manuscript presents novel experiments beyond the scope of that paper. Thanks to this comment, we will substantially revise our manuscript to enhance the visibility of the novel data while moving the confirmatory data to the supplementary materials.

      (2) I also have concerns regarding the results showing that the inactivation of BarEsr1 neurons led to the cessation of EUS muscle firing (Figures 2G and S5C). As shown in the cartoon illustration of Figure 8, spinal projections of BarEsr1 neurons contact interneurons (presumably inhibitory) that innervate motor neurons, which in turn excite the EUS. I would therefore expect that the inactivation of BarEsr1 should shift the EUS firing pattern from phasic (as relaxation) to tonic (removal of relaxation), rather than stopping their firing entirely. Could the authors comment on this and provide potential reasons or mechanisms for this finding?

      We agree with this point. We meant that the phasic bursting pattern of the EUS was rapidly stopped upon BarEsr1 photoinhibition, not that all firing stopped instantaneously. According to previous studies (Chang et al., 2007, de Groat, 2009, de Groat and Yoshimura, 2015, Kadekawa et al., 2016), the voiding physiology of rodents is probably different from that of humans: in rodents, urine is pumped out step-wise in the gaps between multiple consecutive EUS phasic bursting epochs, whereas in humans, urine is pumped out continuously once EUS firing is almost fully inhibited for a period of time. Namely, in mice, the EUS displays sustained tonic activity following phasic bursting, while, in contrast, in humans the EUS keeps firing tonically until the moment of voiding onset (complete inhibition, muscle relaxed). Despite these prominent differences in basic physiological properties, our assumption is that the logic of the circuits from the brainstem to the urethra in this pathway is evolutionarily conserved in both species; thus the logic of brainstem coordination of voiding could also be the same in both species, which is the main interest of our study (using an animal model to address concerns of human health). To interpret our data for a broader audience, we therefore used a simplified and inaccurate expression. We apologize for the inaccuracy and will correct the description in the revised manuscript.

      (3) Current evidence is insufficient to support the claim that the majority of BarEsr1 neurons innervate the SPN but not DGC. The current spinal images are uninformative, as the fluorescence reflects the distribution of Esr1- or Crh-expressing neurons in the spinal cord, along with descending BarEsr1 or BarCrh axons. Given the close anatomical proximity of these two nuclei, a more thorough histological analysis is required to demonstrate that the spinal injections were accurately confined to either the SPN or the DGC.

      We agree that current evidence is insufficient to support the current claim. To address this concern and strengthen our claim, we will repeat the retrograde viral tracing experiments, combined with CTB647 injections to label the injection site, to validate specific targeting of SPN or DGC populations. We will also add higher-magnification imaging to distinguish BarESR1 axonal projections targeting SPN versus DGC. Results from these ongoing experiments will be incorporated into the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      The authors have performed a rigorous study to assess the role of ESR1+ neurons in the PMC to control the coordination of bladder and sphincter muscles during urination. This is an important extension of previous work defining the role of these brainstem neurons, and convincingly adds to the understanding of their role as master regulators of urination. This is a thorough, well-done study that clarifies how the Pontine micturition center coordinates different muscle groups for efficient urination, but there are some questions and considerations that remain.

      Strengths:

      These data are thorough and convincing in showing that ESR1+ PMC neurons exert coordinated control over both the bladder and sphincter activity, which is essential for efficient urination. The anatomical distinctions in pelvic versus pudendal control are clear, and it's an advance to understand how this coordination occurs. This work offers a clearer picture of how micturition is driven.

      Weaknesses:

      The dynamics of how this population of ESR1+ neurons is engaged in natural urination events remain unclear. Not all ESR1+ neurons are always engaged, and it is not measured whether this is simply variation in population activity, or if more neurons are engaged during more intense starting bladder pressures, for instance. In particular, the response dynamics of single and doubly-projecting neurons are not defined. Additionally, the model for how these neurons coordinate with CRH+ neuron activity in the PMC is not addressed, although these cell types seem to be engaged at the same time. Lastly, it would be interesting to know how sensory input can modulate the activity of these neurons, but this is perhaps a future direction.

      In response to the reviewer’s comments, we will attempt to perform the following revisions for this round:

      (1) Engagement of ESR1+ neurons in natural urination events:

      We agree that probably not all ESR1+ neurons are consistently engaged during urination. To address this, we will perform a detailed analysis of the opto-tagged single unit recordings data.

      (2) Response dynamics of single- and doubly-projecting neurons:

      (a) We will use retrograde labelling combined with Ca2+ photometry recordings to differentiate the response dynamics of SPN- and DGC-projecting neurons during urination.

      (b) We will perform functional validations to assess the specific roles of single- and doubly-projecting neurons in coordinating bladder and EUS activity.

      (3) Coordination with CRH+ neurons in the PMC:

      We appreciate the suggestion to include CRH+ neurons in our model. We will expand our model to incorporate CRH+ neurons and their potential interactions with ESR1+ neurons.

      (4) Sensory modulation of ESR1+ neurons:

      The reviewer raises an excellent point regarding sensory input modulation of ESR1+ neuron activity. Although this is beyond the scope of our current study, we recognize its importance and propose to include this as a future direction.

      Reviewer #3 (Public review):

      Summary:

      The paper by Li et al explored the role of Estrogen receptor 1 (Esr1) expressing neurons in the pontine micturition center (PMC), a brainstem region also known as Barrington's nucleus (Hou et al 2016, Keller et al 2018). First, the authors conducted bulk Ca2+ imaging/unit recording from PMCESR1 to investigate the correlations of PMCESR1 neural activity to voiding behavior in conscious mice and bladder pressure/external urethral muscle activity in urethane anesthetized mice. Next, the authors conducted optogenetic inactivation/activation of PMCESR1 to confirm the contribution to the voiding behavior, and also conducted peripheral nerve transection together with optogenetic activation to confirm the independent control of bladder pressure and urethral sphincter muscle.

      Weaknesses:

      (1) The study demonstrates that pelvic nerve transection reduces urinary volume triggered by PMCESR1+ cell photoactivation in freely moving mice. Could the role of pudendal nerve transection also be examined in awake mice to provide a more comprehensive understanding of neural involvement?

      Thank you for the suggestion. Pudendal nerve transection in awake mice is indeed a challenging experiment that we have not yet performed. We will attempt it for the revision.

      (2) While the paper primarily focuses on PMCESR1+ cells in bladder-sphincter coordination, the analysis of PMCESR1+-DGC/SPN neural circuits - given their distinct anatomical projections in the sacral spinal cord - feels underexplored. How do these circuits influence bladder and sphincter function when activated or inhibited? Also, do you have any tracing data to confirm whether bladder-sphincter innervation comes from distinct spinal nuclei?

      Thank you for this great comment. The projection-specific neuronal function analysis is, as also suggested by Reviewer 2 in a similar comment (#8), missing in our first submission. These experiments are so challenging that we omitted them in the first round of tests, but we have decided to pursue this goal again. Namely, we will perform photometry recordings of PMC neurons projecting to the DGC/SPN while measuring bladder pressure and urethral sphincter EMG activity. Additionally, while our study does not include direct tracing data to confirm distinct spinal nuclei for bladder and sphincter innervation, this has been well-documented in classic literature (Yao et al., 2018, Karnup and De Groat, 2020, Karnup, 2021). Specifically, anatomical studies have shown that the SPN primarily innervates the bladder, while the DGC is associated with the innervation of the urethral sphincter. We will cite these references to provide context and support for our interpretations.

      (3) Although the paper successfully identifies the physiological role of PMCESR1+ cells in bladder-sphincter coordination, the study falls short in examining the electrophysiological properties of PMCESR1+-DGC/SPN cells. A deeper investigation here would strengthen the findings.

      While our study primarily focuses on the functional role of PMCESR1+ neurons in bladder-sphincter coordination, we acknowledge that understanding their intrinsic electrophysiological characteristics could further strengthen our findings. However, this aspect falls beyond the scope of the current study. Nevertheless, we recognize the significance of this direction and are excited to pursue it in future research. We appreciate the reviewer’s suggestion, as it highlights an important avenue for expanding upon our current findings.

      (4) The parameters for photoactivation (blue light pulses delivered at 25 Hz for 15 ms, every 30 s) and photoinhibition (pulses at 50 Hz for 20 ms) vary. What drove the selection of these specific parameters? Moreover, for photoactivation experiments, the change in pressure (ΔP = P5 sec - P0 sec) is calculated differently from photoinhibition (Δpressure = Ppeak - Pmin). Can you clarify the reasoning behind these differing approaches?

      We sincerely thank the reviewer for raising these important points and for the opportunity to clarify our experimental design and data analysis methods.

      Photoactivation versus photoinhibition parameters: The differences in photoactivation (25 Hz, 15 ms pulses) and photoinhibition (50 Hz, 20 ms pulses) protocols are based on the distinct physiological and technical requirements for activating versus inhibiting PMCESR1+ neurons. For photoactivation, 25 Hz stimulation aligns with the natural firing patterns of central neurons, allowing for intermittent activation without exceeding the neuronal refractory period. The shorter pulse duration (15 ms) minimizes phototoxicity and avoids overstimulation, as performed in previous studies (Keller et al., 2018). In contrast, photoinhibition requires sustained suppression of neuronal activity, achieved through higher frequencies (50 Hz) and longer pulses (20 ms) to ensure continuous coverage of neuronal activity.
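
      As a small worked check of the two protocols described above, the sketch below computes the duty cycle (the fraction of each period with light on), reading the stated values as pulse frequency and pulse width. The function name `duty_cycle` is hypothetical and this reading of the parameters is an assumption rather than something the response states explicitly.

```python
def duty_cycle(frequency_hz, pulse_width_ms):
    """Fraction of each stimulation period during which the light is on."""
    period_ms = 1000.0 / frequency_hz  # duration of one pulse cycle
    # Cap at 1.0: a pulse cannot be on for longer than its own period.
    return min(pulse_width_ms / period_ms, 1.0)


# Values taken from the response; the interpretation is assumed.
activation_duty = duty_cycle(frequency_hz=25, pulse_width_ms=15)
inhibition_duty = duty_cycle(frequency_hz=50, pulse_width_ms=20)
```

      Under this reading, 25 Hz with 15 ms pulses gives intermittent light (37.5% of each 40 ms period), whereas 50 Hz with 20 ms pulses fills each 20 ms period entirely, which would be consistent with the stated aim of continuous coverage during photoinhibition.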

      Calculation of pressure changes (ΔP) for photoactivation and photoinhibition: The differing methods for calculating pressure changes reflect the distinct physiological effects we aimed to capture. In photoactivation experiments (ΔP = P5 sec - P0 sec), the pressures before (P0 sec) and 5 seconds after (P5 sec) light delivery were compared to capture the immediate effect of light activation on bladder pressure, focusing on the onset and early dynamics of activation. In contrast, photoinhibition experiments assessed the immediate impact of light-induced suppression on bladder pressure during an ongoing voiding event. Here, Δpressure was calculated as Ppeak – Pmin to measure the rapid drop in pressure directly attributable to neuronal inhibition.

      We will expand these details in the methods section of the revised manuscript to provide greater transparency.
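
      The two pressure measures described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names, the nearest-sample lookup, and the choice to take Ppeak and Pmin within a single user-supplied window are all hypothetical.

```python
def delta_p_activation(times_s, pressures, onset_s, window_s=5.0):
    """Photoactivation measure: P(onset + window) - P(onset).

    Uses the sample nearest to each requested time point.
    """
    def nearest_pressure(t):
        idx = min(range(len(times_s)), key=lambda i: abs(times_s[i] - t))
        return pressures[idx]

    return nearest_pressure(onset_s + window_s) - nearest_pressure(onset_s)


def delta_p_inhibition(times_s, pressures, start_s, end_s):
    """Photoinhibition measure: Ppeak - Pmin within [start_s, end_s]."""
    segment = [p for t, p in zip(times_s, pressures) if start_s <= t <= end_s]
    if not segment:
        raise ValueError("no samples in the requested window")
    return max(segment) - min(segment)


# Made-up trace for illustration: pressure rises, peaks at 6 s, then drops.
times = list(range(11))
trace = [5, 5, 6, 8, 11, 15, 20, 18, 12, 9, 7]
rise = delta_p_activation(times, trace, onset_s=0)            # 15 - 5
drop = delta_p_inhibition(times, trace, start_s=6, end_s=10)  # 20 - 7
```

      The first measure depends on a fixed time point after light onset, while the second depends only on the extremes within the window, so the two are not directly comparable in magnitude.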

      (5) The discussion could further emphasize how PMCESR1+ cells coordinate bladder contraction and sphincter relaxation to control urination, highlighting their central role in the initiation and suspension of this process.

      We fully agree with this point. Additionally, in response to your and other reviewers’ suggestions, we are preparing a new round of experiments with projection-specific recording, and thus our discussion and conclusion will also be updated according to the newly obtained data.

      (6) In Figure 8, the authors analyze the temporal sequence of bladder pressure and EUS bursting during natural voiding and PMC activation-induced voiding. It would be acceptable to consider the existence of a lower spinal reflex circuit; however, the interpretation of the data contains speculation. It is hard to say that bladder pressure measurement reflects efferent pelvic nerve activity in real time. (As a biological system, bladder contraction is mediated by smooth muscle, and does not reflect real-time efferent pelvic nerve activity. As an experimental set-up, bladder pressure measurement has some delays to reflect bladder pressure because of tubing, but EUS bursting has no delay.) Especially for the inactivation experiment, these factors would contribute to the interpretation of data. This reviewer recommends a rewrite of the section considering these limitations. Most of the section is suitable for the results.

      Thank you for mentioning the possibility of bladder pressure measurement delay. We would prefer to perform a physical control test to quantify how much delay this measurement has under our experimental conditions. We will use a small balloon to mimic the bladder and two identical pressure sensors, one with a very short tube inserted into the balloon and one with an extended tube identical to that used in our animal experiments. We will then mimic both contraction initiation and halting, and quantify the delay between the two sensors.
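
      One possible way to quantify the delay between the two sensor traces in such a control test is to find the lag that maximizes their cross-correlation. The sketch below is only an illustration: the response does not specify an analysis method, and the function name `estimate_delay`, the sampling rate, and the synthetic traces are all assumptions.

```python
def estimate_delay(reference, delayed, sample_rate_hz, max_lag_samples):
    """Return the lag (seconds) at which `delayed` best matches `reference`.

    Searches non-negative lags only, i.e. assumes the long-tube sensor
    can lag behind, but not lead, the short-tube sensor.
    """
    n = min(len(reference), len(delayed))
    ref_mean = sum(reference[:n]) / n
    dly_mean = sum(delayed[:n]) / n
    ref = [x - ref_mean for x in reference[:n]]
    dly = [x - dly_mean for x in delayed[:n]]

    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag_samples + 1):
        # Unnormalized cross-correlation of ref[i] with dly[i + lag].
        score = sum(ref[i] * dly[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate_hz


# Synthetic example: a pressure bump, and the same bump 7 samples later.
short_tube = [0.0] * 20 + [1.0, 2.0, 3.0, 2.0, 1.0] + [0.0] * 35
long_tube = [0.0] * 7 + short_tube[:-7]
delay_s = estimate_delay(short_tube, long_tube, sample_rate_hz=100,
                         max_lag_samples=20)
```

      A real tubing system may also smooth the pressure waveform rather than simply shifting it, in which case a single lag value is only a first approximation of the measurement delay.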

      References

      • Chang HY, Cheng CL, Chen JJJ, de Groat WC. 2007. Serotonergic drugs and spinal cord transections indicate that different spinal circuits are involved in external urethral sphincter activity in rats. American Journal of Physiology-Renal Physiology 292: F1044-F1053. DOI: 10.1152/ajprenal.00175.2006

      • de Groat WC. 2009. Integrative control of the lower urinary tract: preclinical perspective. British Journal of Pharmacology 147. DOI: 10.1038/sj.bjp.0706604

      • de Groat WC, Yoshimura N. 2015. Anatomy and physiology of the lower urinary tract. Handb Clin Neurol 130: 61-108. DOI: 10.1016/B978-0-444-63247-0.00005-5

      • Kadekawa K, Yoshimura N, Majima T, Wada N, Shimizu T, Birder LA, Kanai AJ, de Groat WC, Sugaya K, Yoshiyama M. 2016. Characterization of bladder and external urethral activity in mice with or without spinal cord injury—a comparison study with rats. American Journal of Physiology-Regulatory, Integrative and Comparative Physiology 310: R752-R758. DOI: 10.1152/ajpregu.00450.2015

      • Karnup S. 2021. Spinal interneurons of the lower urinary tract circuits. Autonomic Neuroscience 235. DOI: 10.1016/j.autneu.2021.102861

      • Karnup SV, De Groat WC. 2020. Mapping of spinal interneurons involved in regulation of the lower urinary tract in juvenile male rats. IBRO Rep 9: 115-131. DOI: 10.1016/j.ibror.2020.07.002

      • Keller JA, Chen J, Simpson S, Wang EH-J, Lilascharoen V, George O, Lim BK, Stowers L. 2018. Voluntary urination control by brainstem neurons that relax the urethral sphincter. Nature Neuroscience 21: 1229-1238. DOI: 10.1038/s41593-018-0204-3

      • Yao J, Zhang Q, Liao X, Li Q, Liang S, Li X, Zhang Y, Li X, Wang H, Qin H, Wang M, Li J, Zhang J, He W, Zhang W, Li T, Xu F, Gong H, Jia H, Xu X, Yan J, Chen X. 2018. A corticopontine circuit for initiation of urination. Nature Neuroscience 21: 1541-1550. DOI: 10.1038/s41593-018-0256-4

    1. eLife Assessment

      This work presents important findings regarding the interaction of the monkeypox virus (MPXV) attachment H3 protein with the cellular receptor heparan sulfate and the use of this information to develop antivirals potentially effective against all orthopoxviruses. Using a combination of state-of-the-art computational and wet-lab experiments, the authors present convincing evidence to sustain their claims. These results will interest those working on basic orthopoxvirus biology and antiviral development.

    2. Reviewer #1 (Public Review):

      Summary:

      The study aimed to better understand the role of the H3 protein of the Monkeypox virus (MPXV) in host cell adhesion, identifying a crucial α-helical domain for interaction with heparan sulfate (HS). Using a combination of advanced computational simulations and experimental validations, the authors discovered that this domain is essential for viral adhesion and potentially a new target for developing antiviral therapies.

      Strengths:

      The study's main strengths include the use of cutting-edge computational tools such as AlphaFold2 and molecular dynamics simulations, combined with robust experimental techniques like single-molecule force spectroscopy and flow cytometry. These methods provided a detailed and reliable view of the interactions between the H3 protein and HS. The study also highlighted the importance of the α-helical domain's electric charge and the influence of the Mg(II) ion in stabilizing this interaction. The work's impact on the field is significant, offering new perspectives for developing antiviral treatments for MPXV and potentially other viruses with similar adhesion mechanisms. The provided methods and data are highly useful for researchers working with viral proteins and protein-polysaccharide interactions, offering a solid foundation for future investigations and therapeutic innovations.

      Comments on revised version:

      The authors have successfully addressed the questions raised in my review.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript presenting the discovery of a heparan-sulfate (HS) binding domain in monkeypox virus (MPXV) H3 protein as a new anti-poxviral drug target, presented by Bin Zhen and co-workers, is of interest, given that it offers a potentially broad antiviral substance to be used against poxviruses. Using new computational biology techniques, the authors identified a new alpha-helical domain in the H3 protein, which interacts with cell surface HS, and this domain seems to be crucial for H3-HS interaction. Given that this domain is conserved across orthopoxviruses, authors designed protein inhibitors. One of these inhibitors, AI-PoxBlock723, effectively disrupted the H3-HS interaction and inhibited infection with Monkeypox virus and Vaccinia virus. The presented data should be of interest to a diverse audience, given the possibility of an effective anti-poxviral drug.

      Strengths:

      In my opinion, the experiments done in this work were well-planned and executed. The authors combined several computational methods to design poxvirus inhibitor molecules and then tested these molecules for infection inhibition.

      Comments on revised version:

      The authors have addressed the comments I made in my review.

    4. Reviewer #3 (Public Review):

      Summary:

      The article is an interesting approach to determining the MPOX receptor using "in silico" tools. The results show the presence of two regions of the H3 protein with a high probability of being involved in the interaction with the HS cell receptor. However, the α-helical region seems to be the most probable, since modifications in this region affect the virus binding to the HS receptor.

      Strengths:

      In my opinion, it is an informative article with interesting results, generated by a combination of "in silico" and wet science to test the theoretical results. This is a strong point of the article.

      Comments on revised version:

      After a review of the changes to the manuscript and the authors' responses, no further changes are needed.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study aimed to better understand the role of the H3 protein of the Monkeypox virus (MPXV) in host cell adhesion, identifying a crucial α-helical domain for interaction with heparan sulfate (HS). Using a combination of advanced computational simulations and experimental validations, the authors discovered that this domain is essential for viral adhesion and potentially a new target for developing antiviral therapies.

      Strengths:

      The study's main strengths include the use of cutting-edge computational tools such as AlphaFold2 and molecular dynamics simulations, combined with robust experimental techniques like single-molecule force spectroscopy and flow cytometry. These methods provided a detailed and reliable view of the interactions between the H3 protein and HS. The study also highlighted the importance of the α-helical domain's electric charge and the influence of the Mg(II) ion in stabilizing this interaction. The work's impact on the field is significant, offering new perspectives for developing antiviral treatments for MPXV and potentially other viruses with similar adhesion mechanisms. The provided methods and data are highly useful for researchers working with viral proteins and protein-polysaccharide interactions, offering a solid foundation for future investigations and therapeutic innovations.

      Weaknesses:

      However, some limitations are notable. Despite the robust use of computational methodologies, the limitations of this approach are not discussed, such as potential sources of error, standard deviation rates, and known controls for the H3 protein to justify the claims. Additionally, validations with methodologies like X-ray crystallography would further benefit the visualization of the H3 and HS interaction.

      Thank you very much for the evaluation and appreciation of our work. In response to the identified weakness, we have conducted additional analyses to further assess the limitations of the computational methodologies used. Specifically, we predicted the MPXV H3 structure using two other AI-based protein structure prediction models, ESMFold and RoseTTAFold2. Both models also predicted an α-helical structure, which supports our conclusion. However, they yielded lower pLDDT scores (Figure S1A-C in the revised SI), indicating that some error may be present.

      We agree with this reviewer, as well as the other reviewers, that X-ray crystallography data for the H3 structure would be highly valuable. Unfortunately, we lack the expertise in structural biology to obtain these results at this stage. To complement this, we performed molecular dynamics (MD) simulations, which suggest that the helical domain is connected to the main domain via a flexible linker. This flexibility may help explain the challenges in obtaining a high-resolution X-ray structure. In fact, to date, the only structural data available for H3 are from VACV and exclude the helical domain (the helical domain was cleaved off for the X-ray studies). We have added this point to the discussion and hope that experts in structural biology will be able to resolve the structure of this domain in the future.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript presenting the discovery of a heparan-sulfate (HS) binding domain in monkeypox virus (MPXV) H3 protein as a new anti-poxviral drug target, presented by Bin Zhen and co-workers, is of interest, given that it offers a potentially broad antiviral substance to be used against poxviruses. Using new computational biology techniques, the authors identified a new alpha-helical domain in the H3 protein, which interacts with cell surface HS, and this domain seems to be crucial for H3-HS interaction. Given that this domain is conserved across orthopoxviruses, authors designed protein inhibitors. One of these inhibitors, AI-PoxBlock723, effectively disrupted the H3-HS interaction and inhibited infection with Monkeypox virus and Vaccinia virus. The presented data should be of interest to a diverse audience, given the possibility of an effective anti-poxviral drug.

      Strengths:

      In my opinion, the experiments done in this work were well-planned and executed. The authors combined several computational methods to design poxvirus inhibitor molecules and then tested these molecules for infection inhibition.

      Weaknesses:

      One thing that could be improved is the presentation of results, to make them more easily understandable to readers who may not be experts in protein modeling programs. For example, figures should be self-explanatory and understandable on their own, without the need to refer back to the text. Therefore, the figure legends should be more informative as to how the experiments were done.

      Thank you very much for your appreciation of our work and your support. In response to the identified weakness, we have carefully reviewed all the figure legends to ensure they are more informative.

      Reviewer #3 (Public Review):

      Summary:

      The article is an interesting approach to determining the MPOX receptor using "in silico" tools. The results show the presence of two regions of the H3 protein with a high probability of being involved in the interaction with the HS cell receptor. However, the α-helical region seems to be the most probable, since modifications in this region affect the virus binding to the HS receptor.

      Strengths:

      In my opinion, it is an informative article with interesting results, generated by a combination of "in silico" and wet science to test the theoretical results. This is a strong point of the article.

      Weaknesses:

      Has a crystal structure of the H3 protein been reported?

      The following text is in line 104: "which may represent a novel binding site for HS". It is unclear whether this means this "new binding site" is an alternative site to an old one or whether it is the true binding site that had not been previously elucidated.

      Thank you very much for your thoughtful evaluation and appreciation of our work.

      We agree with this reviewer, as well as the other reviewers, that X-ray crystallography data for the H3 structure would be highly valuable. Unfortunately, we are not experts in structural biology, and we have not yet been able to obtain these structural results. To date, the only structure available for H3 is the one from VACV, which does not include the helical domain. We have included this point in the discussion and hope that experts in structural biology will be able to resolve the structure of this domain in the future.

      Regarding the "novel binding site," this term refers to "the true binding site that had not been previously elucidated." Previous research identified that H3 binds to heparan sulfate (HS), but the exact binding site had not been determined.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Validation of Results with Other Experimental Methods: While single-molecule force spectroscopy and flow cytometry provide valuable data, including complementary methods such as X-ray crystallography could offer additional insights into the H3-HS interaction and the effectiveness of the inhibitors.

      Discussion of Computational Model Limitations: Although the use of AlphaFold2 and other advanced tools is a strength, it is important to discuss the limitations of these models in more detail, including potential sources of error and how they may impact the interpretation of the results.

      During the manuscript evaluation, the protein localization was not clear (transmembrane?), since the protein's end is very close to the virus membrane surface. All experiments used the protein without a membrane anchor, so the interaction site was always exposed. If the protein is linked to the membrane, how would the site be exposed, given the limited space between it and the virus structure?

      Thank you for these insightful comments. As you pointed out, the H3 protein, particularly the helical domain at the C-terminal, is indeed located close to the membrane, which could limit the available space for H3 binding. To investigate this further, we modeled the full-length H3 protein in the context of the membrane and performed molecular dynamics (MD) simulations to assess the available space. Our results show that there is more than 1 nm of space between the helical domain and the membrane, which should be sufficient for potential heparan sulfate (HS) binding (see Figure 1E, and Figure S1D&E in the revised manuscript).

      Minor corrections:

      Line 31: "is an emerging zoonotic pathogen" should be revised to reflect that Mpox is a re-emerging virus, given its history of causing outbreaks, such as in 2003.

      Line 71 and Line 75: Adding an explanation of "Mg binding sites" and "GAG motifs" would enhance reader understanding, as these represent important points in the study. The current positioning of Figure 1 causes some confusion for the reader.

      Line 111: High score? What controls were used for the protein? Are there known inhibitors of H3? If so, why weren't they tested for structure comparison? Additionally, what about other molecules that H3 binds to, such as UDP-Glucose, as demonstrated in the base article for the Vaccinia virus H3 protein available in the PDB?

      Figure 2B: Improve the legend, as the colors of the lines are not clear.

      Thank you for your instructive comments. We have addressed most of them in the revised manuscript.

      Regarding the "high score," AlphaFold2 provides a confidence score for its protein structure predictions, with a maximum score of 100. A score above 80 indicates a high level of confidence in the prediction.

      There are known inhibitors (such as antibodies) of H3, and while the sequence is available, no structure has been reported so far. Previous NMR titration measurements have shown that UDP-glucose binds to H3, but no structural data for the complex exist. To date, the only available crystal structure is of a truncated H3 from VACV, which does not include the helical domain we identified.

      Reviewer #2 (Recommendations For The Authors):

      The text in the Results section does not match the text presented in the Figures, so it is not easy to see what the authors are referring to when they mention a Figure. For example, the text referring to Figure S8 mentions the GB1 domain and the Cohesin module, but these are not mentioned in Figure S8.

      I do not understand the results presented in Figure 5B. It is not clear to me, from the Figure legend nor after reading the Material and Methods, how this experiment was done. Specifically, what is plotted on X, is it the amount of inhibitor or the amount of protein? These things have to be checked through the manuscript.

      It would be interesting to confirm if the inhibition of infection is based on the inhibition of viral binding to the cells. This should not be complicated to realize, and it could provide evidence for the mechanism of action.

      Extensive use of terms like "this domain" is not good in this type of article, as in lines 207 and 211. It is not always clear which domain the authors are referring to, so it would be much better to mention the domain in question by its exact name.

      Line 337: If I am not mistaken, dilutions are "serial", not "series".

      Line 613, in methods. Please use g force instead of rpm, it is more informative. Even if it is just to pellet cells.

      Thank you very much for your instructive comments. We have addressed most of them in the revised manuscript. For instance, the immobilization of the GB1 domain and the cohesin module is now mentioned in Figure S9. Additionally, in the previous Figure 5B, the "x" represents the concentration of the inhibitor. The term "serial" and the g-force values have been updated.

      Reviewer #3 (Recommendations For The Authors):

      Line 190

      Did you mutate all the amino acids at the same time? What was the impact of all these mutations on the structure of the helical region? Or if you modeled the protein again after replacing these 7 amino acids, did you find that there was no difference? Regardless of your answer, you must include a superposition of the mutated structure and the wt.

      Thank you for the insightful comment. We have now also predicted the structure of the serine mutant using AlphaFold2 (AF2). As expected, the helical domain structure remains largely preserved with only minor differences. We have included these results in Figure S6, as suggested.

      Figure 2D

      In this graph, the authors should indicate the ΔG as a negative value. In fact, the graph does not match the text.

      Thanks for the reminder; this has been corrected in the graph.

      Figure 4B

      Is the difference in binding force significant? 28.8 vs 33.7 pN

      The absolute difference in binding force is not large (~5 pN). However, for a system with a relatively low binding force, this difference is significant. Specifically, the 5 pN difference accounts for approximately a 14% reduction in binding force. We have included this percentage in the revised manuscript.
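      The percentage quoted above can be checked with a short calculation. This is a minimal sketch that assumes 33.7 pN is the reference force and 28.8 pN the reduced one (the response does not state which is which); the function name is hypothetical.

```python
def percent_reduction(reference_pn: float, reduced_pn: float) -> float:
    """Relative drop from reference_pn to reduced_pn, in percent."""
    if reference_pn <= 0:
        raise ValueError("reference force must be positive")
    return (reference_pn - reduced_pn) / reference_pn * 100.0


# Assumption: 33.7 pN is the reference value and 28.8 pN the reduced value.
reduction = percent_reduction(33.7, 28.8)
print(f"{reduction:.1f}% reduction")
```

      Under this assumption the result falls between 14% and 15%, consistent with the "approximately 14%" stated in the response.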

      Figure 5

      If AI-PoxBlock723 was the only peptide effective in inhibiting viral infection of MPOX and other related viruses, but not with 100% effectiveness, do you think this could be a consequence of a low interaction efficiency or the existence of a different receptor? Or a secondary binding region in H3? Can you comment on this?

      It has been proposed that there are other adhesion proteins for MPXV, such as D8, in addition to H3. We believe this accounts for the observed less-than-100% effectiveness.

      The use of peptides as "inhibitory tools" could have an interesting effect in vitro; however, in vivo, the immunological response against the peptide will reduce or eliminate it. How might you optimize "drug" development with this system, as you state in line 387?

      Thank you for your thoughtful comment. You are correct that the use of peptides as inhibitory tools could induce an immune response in vivo, which might limit their effectiveness over time. To optimize this approach for drug development, the peptides could be conjugated with carrier molecules, such as liposomes, nanoparticles, or dendrimers, which can protect the peptides from immune detection and improve their delivery to target cells. This could allow for more controlled and sustained release of the peptide in vivo, reducing the chances of immune clearance. We have added this discussion in the revised manuscript.

    1. eLife Assessment

      This fundamental work provides evidence that glutamate and GABA are released from different synaptic vesicles at supramammillary axon terminals onto granule cells of the dentate gyrus. The study uses complementary electrophysiological and anatomical experimental approaches. Together, these provide convincing evidence that the co-release of glutamate and GABA from different vesicles within the same terminal could modulate granule cell firing in a frequency-dependent manner, although thorough elimination of alternative mechanisms would have strengthened the study. The work will be of interest to neuroscientists investigating co-release of neurotransmitters in various synapses in the brain and those interested in subcortical control of hippocampal function.

    2. Reviewer #1 (Public review):

      This study of mixed glutamate/GABA transmission from axons of the supramammillary nucleus to dentate gyrus seeks to sort out whether the two transmitters are released from the same or different synaptic vesicles. This conundrum has been examined in other dual-transmission cases and even in this particular pathway there are different views. The authors use a variety of electrophysiological and immunohistochemical methods to reach the surprising (to me) conclusion that glutamate and GABA filled vesicles are distinct yet released from the same nerve terminals. While the strength of the conclusion rests on the abundance of data (approaches) rather than the decisiveness of any one approach, I came away believing that the boutons may indeed produce and release distinct types of vesicles. Accepting the conclusion, one is now left with another conundrum: how can a single bouton sort out VGLUTs and VIAATs to different vesicles, position them in distinct locations with nm precision and recycle them without mixing? And why do it this way instead of with single vesicles having mixed chemical content? For example, could a quantitative argument be made that separate vesicles allow for higher transmitter concentrations? Hopefully, future studies will probe these issues.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors investigated the release properties of glutamate/GABA co-transmission at the supramammillary nucleus (SuM)-dentate granule cell (DGC) synapses using state-of-the-art in vitro electrophysiology and anatomical approaches at the light and electron microscopy level. They found that SuM to dentate granule cell synapses, which co-release glutamate and GABA, exhibit distinct differences in paired-pulse ratio, Ca2+ sensitivity, presynaptic receptor modulation, and Ca2+ channel-vesicle coupling configuration for each neurotransmitter. The study shows that glutamate/GABA co-release produces independent glutamatergic and GABAergic synaptic responses, with postsynaptic targets segregated. They show that most SuM boutons form distinct glutamatergic and GABAergic synapses in close proximity, characterized by GluN1 and GABAAα1 receptor labeling, respectively. Furthermore, they demonstrate that glutamate/GABA co-transmission exhibits distinct short-term plasticity, with glutamate showing frequency-dependent depression and GABA showing frequency-independent stable depression. The authors provide compelling evidence at the anatomical and physiological levels that glutamate and GABA are co-released from different synaptic vesicles within the same synaptic terminal at the SuM-DGC synapses and that the distinct transmission modes of glutamate and GABA release serve as frequency-dependent filters of SuM inputs on GC outputs.

      This is a fundamental work that significantly advances our understanding of the mechanism by which the two fast-acting and functionally opposing neurotransmitters glutamate and GABA are co-transmitted at the SuM-DGC synapses and the functional role of this type of glutamate/GABA co-transmission.

      Strengths:

      The conclusions of this paper are supported by a large body of compelling data.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Hirai et al investigated the release properties of glutamate/GABA co-transmission at SuM-GC synapses and reported that glutamate/GABA co-transmission exhibits distinct short-term plasticity with segregated postsynaptic targets. Using optogenetics, whole-cell patch-clamp recordings, and immunohistochemistry, the authors reveal distinct transmission modes of glutamate/GABA co-release as frequency-dependent filters of incoming SuM inputs.

      Strengths:

      Overall, this study is well-designed and executed; conclusions are supported by the results. This study addressed a long-standing question of whether GABA and glutamate are packaged in the same vesicles and co-released in response to the same stimuli in the SuM-GC synapses (Pedersen et al., 2017; Hashimotodani et al., 2018; Billwiller et al., 2020; Chen et al., 2020; Li et al., 2020; Ajibola et al., 2021). Knowledge gained from this study advances our understanding of neurotransmitter co-release mechanisms and their functional roles in the hippocampal circuits.

      Comments on revisions:

      The authors have addressed my comments, and the manuscript is now in good form as it currently stands.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      This study of mixed glutamate/GABA transmission from axons of the supramammillary nucleus to dentate gyrus seeks to sort out whether the two transmitters are released from the same or different synaptic vesicles. This conundrum has been examined in other dual-transmission cases and even in this particular pathway, there are different views. The authors use a variety of electrophysiological and immunohistochemical methods to reach the surprising (to me) conclusion that glutamate- and GABA-filled vesicles are distinct yet released from the same nerve terminals. The strength of the conclusion rests on the abundance of data (approaches) rather than the decisiveness of any one approach, and I came away believing that the boutons may indeed produce and release distinct types of vesicles, but have reservations.

      We thank the reviewer for his/her evaluation of our work. At present, several studies have reported that a variety of combinations of two transmitters are co-released from different synaptic vesicles in the central nervous system. In this regard, we think the co-transmission of glutamate/GABA from different synaptic vesicles is not surprising. To better explain to the reader how much is known about co-release of dual transmitters in the brain, we have now added new sentences describing segregated co-release of two neurotransmitters in other synapses in the Introduction (line 63-80).

      Accepting the conclusion, one is now left with another conundrum, not addressed even in the discussion: how can a single bouton sort out VGLUTs and VIAATs to different vesicles, position them in distinct locations with nm precision, and recycle them without mixing? And why do it this way instead of with single vesicles having mixed chemical content? For example, could a quantitative argument be made that separate vesicles allow for higher transmitter concentrations? I feel the paper needs to address these problems with some coherent discussion, at minimum. 

      Although these questions are very important and interesting to address, little is known about the molecular mechanisms by which VGluT2 and VIAAT are sorted to different vesicles and by which each type of synaptic vesicle is segregated. That is why we had not mentioned the sorting mechanisms in the original manuscript. Nevertheless, in response to the reviewer’s suggestion, we have now added new sentences describing possible mechanisms for the sorting and segregation of VGluT2 and VIAAT in the Discussion (line 439-462).

      As for the question regarding why glutamate and GABA are released from different synaptic vesicles, we mentioned the functional roles of separate release of two transmitters over release from single vesicles several times in the Introduction (line 94-100), Results (line 300-302), and Discussion (line 406-408, 521-522). Although it seems to be an interesting point to think about transmitter concentrations in the vesicles, we think this issue is beyond the scope of the present study. Given that manipulation of vesicular transmitter contents is technically possible (Hori and Takamori, 2021), this issue awaits further investigation.

      Major concerns: 

      (1) Throughout the paper, the authors use repetitive optogenetic stimulation to activate SuM fibers and co-release glutamate and GABA. There are several issues here: first, can the authors definitively assure the reader that all the short-term plasticity is presynaptic and not due to ChR2 desensitization? This has not been addressed. Second, can the authors also say that all the activated fibers release both transmitters? If for example 20% of the fibers retained a one-transmitter identity and had distinct physiological properties, could that account for some of the physiological findings?

      Thank you for raising this important point. To examine whether repetitive light illumination induces ChR2 desensitization, the fiber volley was extracellularly recorded. We found that paired-pulse or 10 stimuli at 5, 10, and 20 Hz reliably evoked similar amplitudes of fiber volley during light stimulation. These results clearly indicate that repetitive light stimulation can reliably activate ChR2 and elicit action potentials in the SuM axons. These new findings are now included in Figure 1-figure supplement 2 and Figure 5-figure supplement 2. We also previously demonstrated, by direct patch-clamp recordings from ChR2-expressing hippocampal mossy fiber terminals, that 125 light stimuli at 25 Hz reliably elicited action potentials (Fig. S1: Fukaya et al., 2023). Therefore, we believe that if the expression level of ChR2 is high, activation of ChR2 induces action potentials in response to repetitive light stimulation and mediates synaptic transmission with high efficiency.

      We found that most of the SuM terminals (95%) have both VGluT2 and VIAAT (Figure 1E). This anatomical evidence strongly indicates that most of the SuM terminals have the ability to release both glutamate and GABA, and the SuM fibers having one transmitter identity should be minor populations.

      (2) PPR differences in Figures 1F-I are statistically significant but still quite small. You could say they are more similar than different in fact, and residual differences are accounted for by secondary factors like differential receptor saturation. 

      In this experiment, the light intensity was adjusted to yield less than 80% of the maximum response, as described in the Methods section of the original and revised manuscripts, minimizing the possibility of receptor saturation. We also excluded the possibility that PPR differences could be attributed to differential receptor saturation and desensitization by using a low-affinity AMPA receptor antagonist and a low-affinity GABAA receptor antagonist (Figure 5-figure supplement 3). These results indicate that the PPR differences are of presynaptic origin.

      (3) The logic of the GPCR experiments needs a better setup. I could imagine different fibers released different transmitters and had different numbers of mGluRs, so that one would get different modulations. On the assumption that all the release is from a single population of boutons, then either the mGluRs are differentially segregated within the bouton, or the vesicles have differential responsiveness to the same modulatory signal (presumably a reduced Ca current). This is not developed in the paper. 

      Based on our minimal stimulation results and anatomical analysis, we believe that many SuM terminals contain both glutamate and GABA. Therefore, both transmissions are able to be modulated by mGluRs and GABAB receptors within the same terminals. As the reviewer pointed out, differential responsiveness of glutamate-containing and GABA-containing vesicles to the GPCR signal could be one of the molecular mechanisms for differential effects of GPCRs on EPSCs and IPSCs. In addition, the spatial coupling between GPCRs and active zones for glutamate and GABA in the same SuM terminals may be different, which may give rise to differential modulation of glutamate and GABA release. These possible mechanisms are now described in the Discussion (line 469-476).

      (4) The biphasic events of Figures 3 and S3: I find these (unaveraged) events a bit ambiguous. Another way to look at them is that they are not biphasic per se but rather are not categorizable. Moreover, these events are really tiny, perhaps generated by only a few receptors whose open probability is variable, thus introducing noise into the small currents. 

      We agree with the reviewer that some events are tiny and some small currents could be masked by background noise. We understand that detecting the biphasic events by minimal stimulation has technical limitations. Because we automatically detected biphasic events, which were defined as an EPSC-IPSC sequence, only if an outward peak current following an inward current appeared within 20 ms of light illumination as described in the Methods section, we cannot exclude the possibility that the biphasic events we detected might include false biphasic responses. To compensate for these technical issues, we also used strontium-induced asynchronous release as another approach and found results similar to those of the minimal stimulation experiments (Figures 3E and 3F). Furthermore, we confirmed that the amplitudes and kinetics of minimal light stimulation-evoked EPSCs or IPSCs were not altered by blockade of their counterpart currents (Figure 3-figure supplement 2). Even if false biphasic responses were accidentally included in the analysis, biphasic events are ultimately a minor population, and we successfully detected discernible independent EPSCs and IPSCs, which were the major population of uniquantal release-mediated synaptic responses. Thus, multiple pieces of evidence support distinct release of glutamate and GABA from SuM terminals.
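      The detection criterion described above (an outward peak following an inward current within 20 ms of light illumination) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name, the amplitude threshold, and the assumption of a baseline-subtracted sweep in which EPSCs are inward (negative) and IPSCs outward (positive) are all hypothetical choices made for illustration.

```python
def classify_event(trace_pa, dt_ms, stim_index, threshold_pa=5.0, window_ms=20.0):
    """Label one baseline-subtracted sweep as 'biphasic', 'EPSC', 'IPSC' or 'failure'.

    Assumptions (hypothetical, not from the manuscript): inward current is
    negative, outward current is positive, and threshold_pa separates an
    event from noise.
    """
    n_window = int(round(window_ms / dt_ms))
    window = trace_pa[stim_index:stim_index + n_window]
    if not window:
        return "failure"
    # Locate the largest inward (most negative) deflection in the window.
    inward_idx = min(range(len(window)), key=window.__getitem__)
    if window[inward_idx] <= -threshold_pa:
        # Biphasic only if an outward peak FOLLOWS the inward peak.
        after = window[inward_idx + 1:]
        if after and max(after) >= threshold_pa:
            return "biphasic"
        return "EPSC"
    if max(window) >= threshold_pa:
        return "IPSC"
    return "failure"


# Synthetic sweeps sampled every 0.1 ms, light delivered at sample 40.
biphasic = [0.0] * 50 + [-10.0] * 20 + [8.0] * 30 + [0.0] * 200
epsc_only = [0.0] * 50 + [-10.0] * 20 + [0.0] * 230
print(classify_event(biphasic, 0.1, 40), classify_event(epsc_only, 0.1, 40))
```

      A fixed threshold like this one makes the reviewer's concern concrete: noise that crosses the threshold after a genuine EPSC would be scored as biphasic, which is why the response leans on complementary approaches.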

      (5) Figure 4 indicates that the immunohistochemical analysis is done on SuM terminals, but I do not see how the authors know that these terminals come from SuM vs other inputs that converge in DG. 

      We thank the reviewer for raising an important point. As shown in Figure 4A, B, almost all VGluT2-positive terminals in the GC layer co-expressed VIAAT. We are aware that VTA neurons reportedly project to the GC layer of the DG and co-release glutamate and GABA (Ntamati and Luscher, 2016). Contrary to this report, our retrograde tracing analysis did not reveal direct projections from the VTA to the DG. These new data are now included in Figure 4-figure supplement 1. We also added pre-embedding immunogold EM analysis, in which SuM terminals were virally labeled with eYFP, confirming that they form both asymmetric and symmetric synapses (revised Figure 4F). Together with these new data, our results clearly demonstrate that SuM terminals in the GC layer form both asymmetric and symmetric synapses. While our results strongly suggest that VGluT2-positive terminals and SuM terminals in the GC layer are nearly identical, we cannot fully exclude the possibility that other inputs originating from unidentified brain regions may co-express VGluT2 and VIAAT in the GC layer. Therefore, in Figure 4 of the revised manuscript, we described “VGluT2-positive terminals” instead of “SuM terminals”.

      (6) Figure 4E also shows many GluN1 terminals not associated with anything, not even Vglut, and the apparent numbers do not mesh with the statistics. Why? 

      In triple immunofluorescence for VGluT2, VIAAT, and GluN1, free GluN1 puncta were predominantly observed in the molecular layer. Given that VGluT2-positive terminals are sparse in the molecular layer, these GluN1 puncta are primarily associated with VGluT1, the dominant subtype. In this study, we focused the analysis of GluN1 puncta specifically on the GC layer, excluding the molecular layer. To avoid miscommunication, we changed the original Figure 4E to the new Figure 4G, which focuses on the GC layer and aligns with the quantitative analysis. Additionally, we used ultrathin sections (100-nm-thick) to enhance spatial resolution, which limits the detection of co-localization events within this confined spatial range, as noted in the Discussion (line 485-488).

      (7) Do the conclusions based on the fluorescence immuno mesh with the apparent dimensions of the EM active zones and the apparent intermixing of labeled vesicles in immuno EM? 

      To further support our immunofluorescence results, we performed an EM study and found that a single SuM terminal formed both asymmetric and symmetric synapses on a GC soma (revised Figures 4E and 4F). These new data and our immunofluorescence results clearly indicate that a single SuM terminal forms both glutamatergic and GABAergic synapses on a GC and co-releases glutamate and GABA. 

      As the reviewer pointed out, our immuno EM shows that VGluT2 and VIAAT labeled vesicles appear to intermix in asymmetric and symmetric synapses. Accordingly, in the revised manuscript, Figure 7 has been modified to show the intermixing of glutamate and GABA-containing vesicles in the SuM terminal. It should be noted that because of low labeling efficiency, our immuno-EM images don’t represent the whole picture of synaptic vesicles for glutamate and GABA. There could be biased distribution of vesicles close to their release site (more VGluT2-containing vesicles close to asymmetric synapses and more VIAAT-containing vesicles close to symmetric synapses) as reported previously (Root et al., 2018). Additionally, our results could be explained by other mechanisms: co-release of glutamate and GABA from the same vesicles, with one transmitter undetected due to the absence of its postsynaptic receptor. This possibility is now mentioned in the Discussion (line 512-520). More detailed vesicle configuration in a single SuM terminal will have to be investigated in future studies.

      (8) Figure 6 is not so interesting to me and could be removed. It seems to test the obvious: EPSPs promote firing and IPSPs oppose it. 

      We believe these results are necessary for the following two reasons. First, we showed that the glutamate/GABA co-transmission balance changes dynamically in a frequency-dependent manner (Figure 5). In terms of physiological significance, it is important to demonstrate how these frequency-dependent dynamic changes affect GC firing. Therefore, we believe that Figure 6, which shows how SuM inputs modulate GC firing by repetitive SuM stimulation, is necessary for this paper. Second, we previously reported the excitatory effects of the SuM inputs on GC firing, suggesting the important roles of glutamatergic transmission of the SuM inputs in synaptic plasticity (Hashimotodani et al., 2018; Hirai et al., 2022; Tabuchi et al., 2022). In contrast, how GABAergic co-transmission contributes to SuM-GC synaptic plasticity and DG information processing was not well understood. Our results in Figure 6, which demonstrate the inhibitory effects of GABAergic co-transmission on GC firing by high-frequency repetitive SuM input activity, clearly show the contribution of GABAergic co-transmission to short-term plasticity at SuM-GC synapses. For these reasons, we would like to keep Figure 6. We hope that our explanations convince the reviewer. 

      Reviewer #2:

      Summary:

      In this study, the authors investigated the release properties of glutamate/GABA co-transmission at the supramammillary nucleus (SuM)-granule cell (GC) synapses using in vitro electrophysiology and anatomical approaches at the light and electron microscopy level. They found that SuM to dentate granule cell synapses, which co-release glutamate and GABA, exhibit distinct differences in paired-pulse ratio, Ca2+ sensitivity, presynaptic receptor modulation, and Ca2+ channel-vesicle coupling configuration for each neurotransmitter. The study shows that glutamate/GABA co-release produces independent glutamatergic and GABAergic synaptic responses, with postsynaptic targets segregated. They show that most SuM boutons form distinct glutamatergic and GABAergic synapses in close proximity, characterized by GluN1 and GABAAα1 receptor labeling, respectively. Furthermore, they demonstrate that glutamate/GABA co-transmission exhibits distinct short-term plasticity, with glutamate showing frequency-dependent depression and GABA showing frequency-independent stable depression. 

      Their findings suggest that these distinct modes of glutamate/GABA co-release by SuM terminals serve as frequency-dependent filters of SuM inputs. 

      Strengths:

      The conclusions of this paper are mostly well supported by the data. 

      We thank the reviewer for their positive and constructive comments on our manuscript.

      Weaknesses: 

      Some aspects of Supplementary Figure 1A and the table need clarification. Specifically, the claim that the authors have stimulated an axon fiber rather than axon terminals is not convincingly supported by the diagram of the experimental setup. Additionally, the antibody listed in the primary antibodies section recognizes the gamma2 subunit of the GABAA receptor, not the alpha1 subunit mentioned in the results and Figure 4. 

      We have now answered these questions in the recommendations section below.

      Reviewer #3:

      Summary: 

      In this manuscript, Hirai et al. investigated the release properties of glutamate/GABA co-transmission at SuM-GC synapses and reported that glutamate/GABA co-transmission exhibits distinct short-term plasticity with segregated postsynaptic targets. Using optogenetics, whole-cell patch-clamp recordings, and immunohistochemistry, the authors reveal distinct transmission modes of glutamate/GABA co-release as frequency-dependent filters of incoming SuM inputs. 

      Strengths: 

      Overall, this study is well-designed and executed; conclusions are supported by the results. This study addressed a long-standing question of whether GABA and glutamate are packaged in the same vesicles and co-released in response to the same stimuli in the SuM-GC synapses (Pedersen et al., 2017; Hashimotodani et al., 2018; Billwiller et al., 2020; Chen et al., 2020; Li et al., 2020; Ajibola et al., 2021). Knowledge gained from this study advances our understanding of neurotransmitter co-release mechanisms and their functional roles in the hippocampal circuits. 

      Weaknesses:

      No major issues are noted. Some minor issues related to data presentation and experimental details are listed below. 

      We appreciate the reviewer’s positive view of our study. We have responded in more detail in the recommendations section below.

      Recommendations for the authors:

      Reviewer #1:

      (1) The blue color for VIAAT in panel 1C is extremely hard to see. 

      Thank you for pointing this out. We have changed the color for VIAAT to cyan in Figure 1C and D in the revised manuscript.

      (2) Line 329 "perforant" not "perfomant".  

      We appreciate the reviewer’s careful attention. In the revised manuscript, we corrected this typo.

      Reviewer #2:

      To convincingly demonstrate that the authors stimulated SuM axon fiber instead of SuM terminals (Supplementary Figure 1A), they should provide an image showing the distribution of SuM-labeled fibers and axon terminals reaching the dentate gyrus (DG) and the trace of the optic fiber, rather than providing a diagram of the experimental setup. 

      We appreciate the reviewer’s suggestion. We have now provided a new experimental setup image (Figure 1-figure supplement 1A) showing a single GC, the distribution of SuM fibers in the GC layer, and the illumination area at each location. As SuM inputs make synapses onto the GC soma and dendrite close to the GC cell body, SuM-GC synapses in the recorded GCs exist in a very limited area. This characteristic synaptic localization allowed us to control the illumination area without applying light to the SuM terminals on the recorded GCs. Delayed onsets of EPSCs/IPSCs by over-axon stimulation (Figure 1-figure supplement 1C, D) also support that SuM terminals on the recorded GCs were outside the illumination area.

      Additionally, the authors should clarify the discrepancy between the antibody mentioned in the list of primary antibodies, which recognizes the gamma2 subunit of the GABAA receptor, and the alpha1 subunit of the GABAA receptor mentioned in the results and Figure 4. 

      We apologize for this mistake. As described in the main text and figure, we used the antibody for the α1 subunit of the GABAA receptor. Table S1 has been corrected in the revised version of the paper.

      Reviewer #3:

      (1) In Figure 1, the authors used two [Ca2+]o concentrations to study the EPSC and IPSC amplitudes. How does the Ca2+ concentration affect the PPR in the EPSC and IPSC, respectively? 

      Given that lowering the extracellular Ca2+ concentration reduces the release probability, it is expected that 1 mM extracellular Ca2+ increases the PPR compared to 2.5 mM. Indeed, we observed that lowering the extracellular Ca2+ concentration increased the synaptic responses from the 2nd to the 10th stimulus (both EPSC and IPSC) during train stimulation (Figure 5).
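
      For reference, the paired-pulse ratio (PPR) discussed here is conventionally the amplitude of the second response divided by that of the first, and train responses are often expressed relative to the first response. The sketch below illustrates these two calculations only; the function names are hypothetical, and whether the train data in Figure 5 were normalized this way is an assumption rather than something stated in this response.

```python
def paired_pulse_ratio(amplitudes):
    """Return |2nd response| / |1st response| (hypothetical helper)."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two responses")
    first = abs(amplitudes[0])
    if first == 0:
        raise ValueError("first response amplitude is zero")
    return abs(amplitudes[1]) / first


def normalize_train(amplitudes):
    """Express every response in a train relative to the first response."""
    if not amplitudes:
        raise ValueError("empty train")
    first = abs(amplitudes[0])
    if first == 0:
        raise ValueError("first response amplitude is zero")
    # Absolute values make the ratio independent of current direction,
    # so inward EPSCs and outward IPSCs are treated alike.
    return [abs(a) / first for a in amplitudes]
```

      Under this definition, a lower release probability (as expected in 1 mM Ca2+) would appear as a PPR closer to or above 1 and as larger normalized 2nd to 10th responses.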

      (2) In Figure 2D, does baclofen also have a dose-dependent effect on the inhibition of the EPSC and IPSC similar to the DCG-IV in Figure 2C? 

      Thank you for your question. Because we aimed to demonstrate the differential inhibitory effects of baclofen at a certain concentration on glutamatergic and GABAergic co-transmission, we did not go into detail regarding a dose-dependent effect. In response to the reviewer’s comment, we tested the effects of a higher concentration of baclofen on EPSCs and IPSCs. As shown in the figure below, 50 µM baclofen inhibited EPSCs and IPSCs to a similar extent. Therefore, by comparing the inhibitory effects of two different concentrations of baclofen (5 and 50 µM), we believe that baclofen also has a dose-dependent inhibitory effect on both EPSCs and IPSCs, similar to DCG-IV.

      Author response image 1.

      (3) In Figure 2E, statistical labels, such as "*" or "n.s." (not significant), should be provided on the plots to facilitate the reading of figures. 

      In response to the reviewer’s comment, we have provided statistical labels in Figure 2E.

      (4) In Figure 3A, the latency of the evoked EPSC for the lower light stimulation groups seems to be much slower than the one shown on the left or other figures in the paper, such as Figure 1F.

      Please double-check if the blue light stimulation label is placed in the right location. 

      Corrected, thanks.

      (5) The use of minimal light stimulation in optogenetic experiments is not appropriately justified or described. More detailed information should be provided, such as whether the optogenetic stimulation is performed on the axon or the terminals of the SuM. 

      We appreciate the reviewer’s suggestion. To effectively detect stochastic synaptic responses, the light stimulation was applied to the terminals of the SuM. We have now stated this information (line 212). We have also further described the justification for using minimal light stimulation in the revised manuscript (lines 207-209). 

      References

      Fukaya R, Hirai H, Sakamoto H, Hashimotodani Y, Hirose K, Sakaba T (2023) Increased vesicle fusion competence underlies long-term potentiation at hippocampal mossy fiber synapses. Sci Adv 9:eadd3616.

      Hashimotodani Y, Karube F, Yanagawa Y, Fujiyama F, Kano M (2018) Supramammillary Nucleus Afferents to the Dentate Gyrus Co-release Glutamate and GABA and Potentiate Granule Cell Output. Cell Rep 25:2704-2715 e2704.

      Hirai H, Sakaba T, Hashimotodani Y (2022) Subcortical glutamatergic inputs exhibit a Hebbian form of long-term potentiation in the dentate gyrus. Cell Rep 41:111871.

      Hori T, Takamori S (2021) Physiological Perspectives on Molecular Mechanisms and Regulation of Vesicular Glutamate Transport: Lessons From Calyx of Held Synapses. Front Cell Neurosci 15:811892.

      Ntamati NR, Luscher C (2016) VTA Projection Neurons Releasing GABA and Glutamate in the Dentate Gyrus. eNeuro 3.

      Root DH, Zhang S, Barker DJ, Miranda-Barrientos J, Liu B, Wang HL, Morales M (2018) Selective Brain Distribution and Distinctive Synaptic Architecture of Dual Glutamatergic-GABAergic Neurons. Cell Rep 23:3465-3479.

      Tabuchi E, Sakaba T, Hashimotodani Y (2022) Excitatory selective LTP of supra-mammillary glutamatergic/GABAergic co-transmission potentiates dentate granule cell firing. Proc Natl Acad Sci U S A 119:e2119636119.

    1. eLife Assessment

      The presented soft tissue data of pterosaur tail vanes represent a valuable contribution to ongoing research efforts to decipher the flight abilities of pterosaurs in the fields of paleontology, comparative biomechanics, and bioinspired design. The new methods are compelling and give new detail on tail morphology, with a potential to resolve how pterosaurs were able to control and maintain tail stiffness to furnish flight control.

    2. Reviewer #1 (Public review):

      This paper reports fossil soft-tissue structures (tail vanes) of pterosaurs, and attempts to relate this to flight performance and other proposed functions for the tail.

      The paper presents new evidence for soft-tissue strengthening of vanes using exciting new methods.

      There is now some discussion of bias in the sample selection method as well as some theory to show how the lattice could have functioned, other than a narrative description.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper reports fossil soft-tissue structures (tail vanes) of pterosaurs, and attempts to relate this to flight performance and other proposed functions for the tail.

      Strengths:

      The paper presents new evidence for soft-tissue strengthening of vanes using exciting new methods.

      We thank Reviewer #1 for the positive assessment of our work.

      Weaknesses:

      There seems to be no discussion of bias in the sample selection method - even a simple consideration of whether discarded specimens were likely not to have had the cross-linking lattice, or if it was not visible.

      There seems to be no supporting evidence or theory to show how the lattice could have functioned, other than a narrative description. Moreover, there is no comparison to extant organisms where a comparison of function might be drawn.

      We note these weaknesses and have addressed them as part of the consensus of suggested edits given below (‘first option’). We thank the reviewer for this feedback.

      Reviewer #2 (Public review):

      Summary:

      The authors have set out to investigate and explain how early members of the Pterosauria were able to maintain stiffness in the vane of their tails. This stiffness, it is said, was crucial for flight in early members of this clade. Through the use of Laser-Stimulated Fluorescence imaging, the authors have revealed that certain pterosaurs had a sophisticated dynamic tensioning system that has previously been unappreciated.

      Strengths:

      The choice of method of investigation for the key question is sound enough, and the execution of the same is excellent. Overall the paper is well written and well presented, and provides a very succinct, accessible and clear conclusion.

      We thank Reviewer #2 for their positive assessment of our work.

      Weaknesses:

      None

      We thank Reviewer #2 for their positive assessment of our work.

      Recommendations for the authors:

      The consensus between the reviewers and reviewing board is that this manuscript can be substantially strengthened and this can be achieved in two ways that are presented in order of preference.

      First option; resolve the following weaknesses:

      - Include a rigorous discussion of possible bias in the sample selection method with consideration of discarded specimens in relation to cross-linking lattice observation.

      - Include published biomechanics theory, supported by citations or a self-derived biomechanical model, to show how the lattice could have functioned biomechanically.

      - Discuss whether you found similar mechanisms in extant organisms for comparative functional interpretation.

      We thank the reviewers and reviewing board for taking the time to discuss the review and propose two consensus options for how to substantially strengthen the manuscript. We carefully considered both proposed options and decided to implement the first option in full. We have therefore made main text edits relating to all three points of the first option. The marked up article file shows exactly which parts of the text were edited in relation to the points.

      Second option; rewrite the manuscript so no mechanistic claims are made that are not supported by the information presented:

      - Accept the possibility of sampling bias and its limitation in the presentation of cross-linking lattice observation, outlining future work needed to address this.

      - Discuss the biomechanics theory that needs to be developed to show how the lattice could have functioned biomechanically, and remove unsupported speculation about this. It is acceptable to present a new hypothesis, clearly outline the motivation for the hypothesis and how it can be tested with future biomechanical and comparative studies. Remove all current speculative sections and phrasing accordingly, and replace them with the framework supporting the idea of a new hypothesis.

      The first option was implemented instead of the second option.

    1. eLife Assessment

      In this important work, Lodhiya et al. provide evidence that excessive ATP underlies the killing of the model organism Mycobacterium smegmatis by two mechanistically-distinct antibiotics. The data are generally solid as the authors deploy multiple, orthogonal readouts and methods for manipulating reactive oxygen species and ATP. The work will be of interest to those studying antibiotic mechanisms of action.

    2. Reviewer #1 (Public review):

      Summary:

      Lodhiya et al. demonstrate that antibiotics with distinct mechanisms of action, norfloxacin and streptomycin, cause similar metabolic dysfunction in the model organism Mycobacterium smegmatis. This includes enhanced flux through the TCA cycle and respiration as well as a build-up of reactive oxygen species (ROS) and ATP. Genetic and/or pharmacologic depression of ROS or ATP levels protect M. smegmatis from norfloxacin and streptomycin killing. Because ATP depression is protective, but in some cases does not depress ROS, the authors surmise that excessive ATP is the primary mechanism by which norfloxacin and streptomycin kill M. smegmatis. In general, the experiments are carefully executed; alternative hypotheses are discussed and considered; the data are contextualized within the existing literature.

      Strengths:

      The authors tackle a problem that is both biologically interesting and medically impactful, namely, the mechanism of antibiotic-induced cell death.

      Experiments are carefully executed, for example, numerous dose- and time-dependency studies; multiple, orthogonal readouts for ROS; and several methods for pharmacological and genetic depletion of ATP.

      There has been a lot of excitement and controversy in the field, and the authors do a nice job of situating their work in this larger context.

      Inherent limitations to some of their approaches are acknowledged and discussed e.g., normalizing ATP levels to viable counts of bacteria.

      Weaknesses:

      All of the experiments performed here were in the model organism M. smegmatis. As the authors point out, the extent to which these findings apply to other organisms (most notably, slow-growing pathogens like M. tuberculosis) is to be determined. To avoid the perception of overreach, I would recommend substituting "M. smegmatis" for Mycobacteria (especially in the title and abstract).

      At first glance, a few of the results in the manuscript seem to conflict with what has been previously reported in the (referenced) literature. In their response to reviewers, the authors addressed my concerns. It would also be ideal to include a few lines in the manuscript briefly addressing these points. (Other readers may have similar concerns)

      In the first round of review, I suggested that the authors consider removing Figs. 9 and 10A-B as I believe they distract from the main point of the paper and appear to be the beginning of a new story rather than the end of the current one. I still hold this opinion. However, one of the strengths of the eLife model is that we can agree to disagree.

    3. Reviewer #2 (Public review):

      Summary:

      The authors are trying to test the hypothesis that ATP bursts are the predominant driver of antibiotic lethality of Mycobacteria.

      Strengths:

      No significant strengths in the current state as it is written.

      Weaknesses:

      A major weakness is that M. smegmatis has a doubling time of three hours and the authors are trying to conclude that their data would reflect the physiology of M. tuberculosis, which has a doubling time of 24 hours. Moreover, the authors try to compare OD measurements with CFU counts and thus observe great variabilities.

      Comments on revisions:

      I am surprised that the authors simply did not repeat the study in figure one with CFU counts and repeated in triplicate. Since this is M. smegmatis, it would take no longer than two weeks to repeat this experiment and replace the figure. I understand that obtaining CFU counts is much more laborious than OD measurements but it is necessary. Your graph still says that there is 0 bacteria at time 0, yet in your legend it says you started with 600,000 CFU/ml. I don't understand why this experiment was not repeated with CFU counts measured throughout. This is not a big ask since this is M. smegmatis but it appears that the authors do not want to repeat this experiment. Minimally, fix the graph to represent the CFU.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Lodhiya et al. demonstrate that antibiotics with distinct mechanisms of action, norfloxacin, and streptomycin, cause similar metabolic dysfunction in the model organism Mycobacterium smegmatis. This includes enhanced flux through the TCA cycle and respiration as well as a build-up of reactive oxygen species (ROS) and ATP. Genetic and/or pharmacologic depression of ROS or ATP levels protect M. smegmatis from norfloxacin and streptomycin killing. Because ATP depression is protective, but in some cases does not depress ROS, the authors surmise that excessive ATP is the primary mechanism by which norfloxacin and streptomycin kill M. smegmatis. In general, the experiments are carefully executed; alternative hypotheses are discussed and considered; the data are contextualized within the existing literature. Clarification of the effect of 1) ROS depression on ATP levels and 2) ADP vs. ATP on divalent metal chelation would strengthen the paper, as would discussion of points of difference with the existing literature. The authors might also consider removing Figures 9 and 10A-B as they distract from the main point of the paper and appear to be the beginning of a new story rather than the end of the current one. Finally, statistics need some attention.

      Strengths:

      The authors tackle a problem that is both biologically interesting and medically impactful, namely, the mechanism of antibiotic-induced cell death.

      Experiments are carefully executed, for example, numerous dose- and time-dependency studies; multiple, orthogonal readouts for ROS; and several methods for pharmacological and genetic depletion of ATP.

      There has been a lot of excitement and controversy in the field, and the authors do a nice job of situating their work in this larger context.

      Inherent limitations to some of their approaches are acknowledged and discussed e.g., normalizing ATP levels to viable counts of bacteria.

      We sincerely appreciate the reviewer’s encouraging feedback.

      Weaknesses:

      The authors have shown that treatments that depress ATP do not necessarily repress ROS, and therefore conclude that ATP is the primary cause of norfloxacin and streptomycin lethality for M. smegmatis. Indeed, this is the most impactful claim of the paper. However, GSH and dipyridyl beautifully rescue viability. Do these and other ROS-repressing treatments impact ATP levels? If not, the authors should consider a more nuanced model and revise the title, abstract, and text accordingly.

      We thank the reviewer for asking this question. In the revised version of the manuscript, we have included data on the impact of the antioxidant GSH on antibiotic-induced ATP levels as a supplementary figure (S9C).

      Does ADP chelate divalent metal ions to the same extent as ATP? If so, it is difficult to understand how conversion of ADP to ATP by ATP synthase would alter metal sequestration without concomitant burst in ADP levels.

      We sincerely thank the reviewer for raising this insightful question. Indeed, ADP and AMP can also form complexes with divalent metal ions; however, these complexes tend to be less stable. According to the existing literature, ATP-metal ion complexes exhibit a higher formation constant compared to ADP or AMP complexes. This has been attributed to the polyphosphate chain of ATP, which acts as an active site, forming a highly stable tridentate structure (Khan et al., 1962; Distefano et al., 1953). An antibiotic-induced increase in ATP levels, irrespective of any changes in ADP levels or the total pool size of purine nucleotides, could still result in the formation of more stable complexes with metal ions, potentially leading to metal ion depletion. Recent studies indicate that antibiotic treatment stimulates purine biosynthesis (Lobritz MA et al., 2022; Yang JH et al., 2019), thereby imposing energy demands and enhancing ATP production; the possibility of a corresponding increase in total purine nucleotide levels (ADP+ATP) therefore exists and is mentioned in the Discussion section. However, this hypothesis requires further investigation.

      Khan MMT, Martell AE. Metal Chelates of Adenosine Triphosphate. Journal of Physical Chemistry (US). 1962 Jan 1;66(1):10–5.

      Distefano V, Neuman WF. Calcium complexes of adenosinetriphosphate and adenosinediphosphate and their significance in calcification in vitro. Journal of Biological Chemistry. 1953 Feb 1;200(2):759–63.

      Lobritz MA, Andrews IW, Braff D, Porter CBM, Gutierrez A, Furuta Y, et al. Increased energy demand from anabolic-catabolic processes drives β-lactam antibiotic lethality. Cell Chem Biol [Internet]. 2022 Feb 17.

      Yang JH, Wright SN, Hamblin M, McCloskey D, Alcantar MA, Schrübbers L, et al. A White-Box Machine Learning Approach for Revealing Antibiotic Mechanisms of Action. Cell [Internet]. 2019 May 30

      Reviewer #1 (Recommendations for the authors):

      (1) Some of the results in the paper diverge from what has been previously reported by some of the referenced literature. These discrepancies should be clarified.

      We apologize for any confusion, but we are uncertain about the specific discrepancies the reviewer is referring to. In the discussion section, we have addressed and analysed our results within the broader context of the existing literature, regardless of whether our findings align with or differ from previous studies.

      (a) CCCP, nigericin, BDQ, and the atpD mutant all appear to affect M. smegmatis growth (Figures S6C, S7C, S7D-E, and Figure 1B from reference 41). Could depressed growth contribute to the rescue effects of these compounds?

      We concur with the reviewer that the reagents we used (CCCP, Nigericin, and BDQ) to suppress the ATP burst in the presence of antibiotics do affect bacterial growth. This sub-inhibitory effect on growth is expected given their roles in either uncoupling the electron transport chain from oxidative phosphorylation or directly inhibiting ATP synthase, leading to reduced ATP production compared to the untreated control. However, we chose concentrations that reduce the antibiotic-induced surge in ATP levels without significantly depriving the bacteria of the ATP essential for their survival, thereby avoiding cell death.

      Consequently, all three reagents (as shown in Figures S6C, S7C, and S7D-E) were employed at non-lethal concentrations. We would like to emphasize, however, that it was not feasible to select a reagent concentration that had no impact on growth yet still suppressed the antibiotic-induced ATP burst. We recognize the possibility that growth retardation may have contributed to the observed rescue effects. To address this concern, we used multiple orthogonal methods (CCCP, Nigericin, and BDQ), each with distinct mechanisms having a common effect of reducing the ATP surge, to minimize off-target effects and support our findings.

      Also, the authors report no growth phenotype for atpD mutant (Figure S8) but only carry out the growth curve to an OD of 2, which is approximately where the growth curve from ref 41 begins to diverge.

      Additionally, to further confirm that bacterial rescue was not due to growth retardation caused by these reagents, we utilized the atpD mutant. All experiments, including those involving the atpD mutant, were conducted when the OD600nm reached 0.8 (during the exponential phase). We specifically ensured that the growth of the atpD mutant was not compromised during this phase (Figure S8) and restricted our growth curve to the early stationary phase (OD600 between 1.5 and 2). While it is possible that the atpD mutant may exhibit slower growth compared to wild-type bacteria in stationary phase at an OD600nm of 4 (as shown in ref 41), this does not impact our observations.

      (b) Reference 41 also reports that the atpD mutant is more sensitive to some antibiotics  (Figure 6). This includes isoniazid, which references 34 and 35 have both reported caused an ATP burst.

      We acknowledge the reviewer’s query regarding the phenotype of the atpD mutant against isoniazid (Reference 41). However, the cited reference does not provide clarity on why the M. smegmatis atpD mutant exhibits increased sensitivity to isoniazid and other antibiotics, nor does it explain whether this sensitivity is due to reduced ATP levels or altered cell wall properties, such as enhanced drug uptake, as observed with Nile red and ethidium bromide.

      While references 34 and 35 reported an ATP burst following isoniazid treatment in slow-growing M. bovis BCG and M. tuberculosis, it remains to be tested whether isoniazid acts similarly in the fast-growing M. smegmatis, where it is bacteriostatic rather than being bactericidal as observed in M. bovis BCG and M. tuberculosis.  

      (2) The statistics require some attention. First, the wording for almost all of the figures is something like "data points represent the mean of at least three independent replicates," is that correct? CFUs are notoriously messy so it is surprising (impressive?) that the variability between replicates is so low. Second, t-tests are not appropriate for multiple comparisons.

      We thank the reviewer for raising this important query. It is correct that all our experiments included at least three independent replicates, and many of our results exhibit a high degree of variability, as indicated by the large error bars. We would like to clarify that we did not perform multiple comparisons on our results. For all analyses, an unpaired t-test was conducted between the control group and one experimental group at a time. Consequently, statistical data were generated for each pair of results, and the comparisons were displayed on the graph relative to the control data points, as mentioned in the Methods section under the heading “Statistical analysis”.
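
      As a hedged illustration of the statistical point under discussion (it is not part of the original analysis), the sketch below computes how the chance of at least one false positive grows when several pairwise tests are each run at the same significance level. It assumes the tests are independent, which is only an approximation when every comparison shares one control group; the function names and the alpha value of 0.05 are assumptions chosen for illustration.

```python
def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Probability of at least one false positive across n independent tests,
    each run at significance level alpha (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_comparisons


if __name__ == "__main__":
    # Hypothetical numbers of control-versus-treatment comparisons in one panel.
    for k in (1, 3, 5):
        rate = familywise_error_rate(0.05, k)
        print(f"{k} comparison(s): family-wise rate {rate:.3f}, "
              f"Bonferroni per-test alpha {bonferroni_alpha(0.05, k):.4f}")
```

      Whether a correction of this kind is warranted depends on how many control-versus-treatment comparisons are read together as one family in a given figure.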

      (3) Figures 9 and 10A-B seem tangential to the main point of the paper and, in the case of Figure 10A-B, preliminary.

      In this study, our aim was to comprehensively investigate the nature of antibiotic-induced stresses (i.e., mechanisms of action from T = 15 hrs) and leverage these insights to enhance our understanding of bacterial adaptation mechanisms, particularly antibiotic tolerance (from T = 25 hrs). While a significant portion of the manuscript focuses on the secondary consequences of antibiotic exposure, we also sought to assess the bacteria's ability to counteract these stresses, contributing to our understanding of how antibiotic tolerance phenotypes develop.

      The results presented in Figure 9 clearly demonstrate that bacteria attempt to reduce respiration by decreasing flux through the complete TCA cycle, thereby mitigating ROS and ATP production in response to antibiotics. These findings not only uncover potential metabolic pathways to downregulate respiration but also validate our observations regarding the role of increased respiration, ROS generation, and subsequent ATP production in antibiotic action.

      Importantly, bacterial responses to antibiotics were not limited to metabolic adaptations. They also included the upregulation of the intrinsic drug resistance determinant Eis (Figure 10A) and an increase in mutation frequency (Figure 10B), both of which indicate a greater likelihood of these bacteria developing antibiotic tolerance and resistance. Therefore, the data presented in Figures 9 and 10A-B are not peripheral to the central theme of the paper. Rather, they complement and strengthen it by providing a comprehensive understanding of the consequences of antibiotic exposure, which aligns with the primary objectives of our study.

      Do the various perturbations used here (especially streptomycin) affect expression and/or turnover of the genetically-encoded sensors Mrx1-roGFP2 or Peredox-mCherry?

      We thank the reviewer for raising this query. Since streptomycin treatment leads to mistranslation and eventually inhibits protein synthesis, it is possible that such treatment could impact the expression and/or turnover of the genetically encoded biosensors, Mrx1-roGFP2 (1) or Peredox-mCherry (2). However, we do not anticipate any effects on the readout, as both biosensors provide ratiometric measurements of redox potential and NADH levels, respectively, which eliminates errors due to variations in protein abundance. Nevertheless, in our experiments with both drugs, we employed multiple time- and dose-dependent responses, ensuring that all meaningful conclusions were drawn from the overall trends seen in the data rather than from an individual data point.
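
      The reasoning behind a ratiometric readout can be sketched in a few lines. This is a minimal illustration, assuming that both measured channels scale linearly with the amount of sensor protein; the function names and the per-molecule signal values are hypothetical and are not taken from the sensors' published characterization.

```python
def sensor_signals(abundance: float, per_molecule_a: float, per_molecule_b: float):
    """Return the two channel intensities for a given sensor abundance.

    Assumes each channel is proportional to the amount of sensor protein
    (a simplifying assumption for this sketch).
    """
    return abundance * per_molecule_a, abundance * per_molecule_b


def ratiometric_readout(intensity_a: float, intensity_b: float) -> float:
    """Ratio of the two channels; the shared abundance factor cancels out."""
    return intensity_a / intensity_b


if __name__ == "__main__":
    # Same hypothetical sensor state measured at full and at reduced expression.
    full = ratiometric_readout(*sensor_signals(1.0, 3.0, 2.0))
    reduced = ratiometric_readout(*sensor_signals(0.25, 3.0, 2.0))
    print(full, reduced)  # the two ratios agree although abundance differs 4-fold
```

      Under that linearity assumption, a drop in sensor expression lowers both channels by the same factor and leaves the ratio unchanged; very low expression would still reduce signal-to-noise, which the sketch does not model.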

      (1) Bhaskar A, Chawla M, Mehta M, Parikh P, Chandra P, Bhave D, et al. (2014) Reengineering Redox Sensitive GFP to Measure Mycothiol Redox Potential of Mycobacterium tuberculosis during Infection. PLoS Pathog 10(1): e1003902. https://doi.org/10.1371/journal.ppat.1003902

      (2) Bhat SA, Iqbal IK, Kumar A (2016) Imaging the NADH:NAD+ Homeostasis for Understanding the Metabolic Response of Mycobacterium to Physiologically Relevant Stresses. Front Cell Infect Microbiol 6: 145. https://doi.org/10.3389/fcimb.2016.00145

      (4) Do the antibiotics affect permeability? Especially relevant to CellROX experiments.

      Antibiotics can alter, and even increase, bacterial membrane permeability, a phenomenon observed in the self-promoted uptake of aminoglycosides. When aminoglycosides bind to ribosomes, they induce mistranslation, including of membrane proteins, leading to the formation of membrane pores, which in turn enhances antibiotic uptake and lethality (1-2). However, whether the antibiotics used in our study (norfloxacin and streptomycin) altered membrane permeability at the concentrations applied is not known.

      Experiments involving the CellROX dye are unlikely to be influenced by changes in membrane permeability, as the dye is freely permeable to the mycomembrane.

      References:

      (1) Davis BD Chen LL Tai PC (1986) Misread protein creates membrane channels: an essential step in the bactericidal action of aminoglycosides PNAS 83:6164–6168.

      (2) Ezraty B Vergnes A Banzhaf M Duverger Y Huguenot A Brochado AR Su SY Espinosa L Loiseau L Py B Typas A Barras F (2013) Fe-S cluster biosynthesis controls uptake of aminoglycosides in a ROS-less death pathway Science 340:1583–1587.

      (5) Figures 4E-H: does GSH affect bacterial growth/viability on its own, i.e., in the absence of a drug?

      We thank the reviewer for raising this query. Indeed, the 10 mM GSH used in our experiments to mitigate and rescue cells from antibiotic-induced ROS does impact bacterial growth on its own, likely because GSH imposes reductive stress on bacterial physiology, though it does not affect viability. For clarification, we have included the viability measurement data in the presence of 10 mM GSH alone in the revised version of the manuscript, as Supplementary Figure S4E.

      (6) p. 2 "...antibiotic resistance involves more complex mechanisms and manifests as genotypic resistance, antibiotic tolerance, and persistence." This reads as tolerance and persistence being a subset of resistance, which is not quite accurate. There is at least one other example of similar wording in the text.

      We thank the reviewer for highlighting this point. Our intention was to convey that resistance to antibiotics can manifest in two forms: permanent or genetic resistance, and transient resilience through antibiotic tolerance and persistence.

      (7) p. 3 "...and showing no visible differences in the growth rate...". It is hard to say this as all the values appear to be 0 - possible to zoom in on the CFU counts in this region? Same comment for p. 5 "...the unaffected growth rate in the early response phase...".

      We apologize for the lack of clarity regarding the resolution of the early time points in the growth curve. Unfortunately, it was not feasible for us to zoom in on the initial time points due to the significant difference in cell viability between T=0 and T=25 hours (i.e., spanning 8 generations). To clarify the growth phenotype at early time points, please refer to Author response image 1, where CFU counts are plotted on a logarithmic scale. The y-axis spans 6-8 orders of magnitude across different conditions, making it difficult to visualize early time points on a linear scale.
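
      The arithmetic behind this point can be sketched as follows. The example is an idealized calculation that assumes uninterrupted exponential growth, a doubling time of about three hours (the figure cited for M. smegmatis elsewhere in these reviews), and a starting density of roughly 600,000 CFU/ml (the value quoted elsewhere in this response); the function names are hypothetical, and real cultures with a lag or stress phase will deviate from it.

```python
import math


def generations(elapsed_hours: float, doubling_time_hours: float) -> float:
    """Number of doublings in the elapsed time, assuming steady exponential growth."""
    return elapsed_hours / doubling_time_hours


def fold_increase(n_generations: float) -> float:
    """Fold change in cell number after n doublings."""
    return 2.0 ** n_generations


def log10_cfu(cfu_per_ml: float) -> float:
    """Log-transformed count, as used for a logarithmic y-axis."""
    return math.log10(cfu_per_ml)


if __name__ == "__main__":
    start_cfu = 6e5                       # assumed starting density (CFU/ml)
    n = generations(25, 3)                # assumed 3 h doubling time over 25 h
    end_cfu = start_cfu * fold_increase(n)
    print(f"generations: {n:.1f}")
    print(f"start: {start_cfu:.2e} CFU/ml (log10 = {log10_cfu(start_cfu):.2f})")
    print(f"end:   {end_cfu:.2e} CFU/ml (log10 = {log10_cfu(end_cfu):.2f})")
```

      On a linear axis scaled to the final count, a starting value that is hundreds of times smaller sits visually at zero, whereas the log-transformed values remain clearly separated.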

      Author response image 1.

      (8) p. 5 "...data for each condition were subjected to rigorous quality control analysis (S2B)." I believe that this is the case, but how Figure S2B demonstrates this fact is not clear.

      Figures S2A and S2B present the quality assessment data for all six proteomics datasets. Figure S2A illustrates the consistency in the number of proteins identified across 10 samples (5 independent replicates for both control and drug treatment). The minimal variation in the number of identified proteins indicates reproducibility across the different runs. Similarly, Figure S2B displays the variability in Pearson correlation coefficient values of protein abundance (LFQ intensities) across the 10 samples. The closer and more consistent the Pearson correlation values, the greater the reproducibility of the quantitative data acquisition.
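
      For readers unfamiliar with this type of quality check, the sketch below shows how a Pearson correlation coefficient between two samples' protein abundances can be computed. It is a minimal illustration using made-up intensity values; the variable names are hypothetical, and whether intensities are log-transformed first is an assumption not confirmed here.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    # Made-up log-scale abundances of the same five proteins in two replicates.
    replicate_1 = [24.1, 26.3, 22.8, 29.0, 25.5]
    replicate_2 = [24.4, 26.0, 23.1, 28.7, 25.9]
    print(f"r = {pearson_r(replicate_1, replicate_2):.3f}")
```

      Applied to every pair of samples, values that cluster tightly near 1 are what one would read as reproducible quantification.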

      (9) p. 7 "To look for a shared mechanism of antibiotic action..." The wording implies an assumption. Perhaps "to test whether" would be more appropriate? Same comment for p. 12 "To further confirm whether enhanced respiration ...".

      We appreciate the reviewer’s suggestions for both sentences and have made the necessary changes in the revised version. Thank you for bringing this to our attention.

      (10) Figure S1A-B figure legend. How was this assay performed?

      The experiment for Figures S1A-B was conducted using a standard REMA assay, as described in the methods section. Cells were harvested at the 25th-hour time point, and drug MICs were compared between cells grown with and without 1/4x MBC99 of the drugs. This was done to determine whether the growth recovery observed during the recovery phase was due to the presence of drug-resistant bacteria.

      (11) p. 14 "...(CCCP), a protonophore, at non-toxic levels..." Figure S6C implies an effect on growth.

      As clarified earlier in response to query 1(a), the CCCP reagent was used at concentrations that effectively minimize the antibiotic-induced surge in ATP levels. However, at these concentrations, CCCP reduces cellular ATP production (Figure S6A), leading to bacterial growth delay (Figure S6C). By "non-toxic levels," we intended to convey that these concentrations of CCCP are non-lethal to the bacteria, as evidenced in Figure S6C.

      (12) Figure 8A y axis is this CFU/mL or OD/mL?

      The y-axis of Figure 8A shows CFU/ml, as it measures cell survival in response to increasing concentrations of bipyridyl.

      Reviewer #2 (Public review):

      Summary:

      The authors are trying to test the hypothesis that ATP bursts are the predominant driver of antibiotic lethality of Mycobacteria.

      Strengths:

      This reviewer has not identified any significant strengths of the paper in its current form.

      Weaknesses:

      A major weakness is that M. smegmatis has a doubling time of three hours and the authors are trying to conclude that their data would reflect the physiology of M. tuberculosis which has a doubling time of 24 hours. Moreover, the authors try to compare OD measurements with CFU counts and thus observe great variabilities.

      If the authors had evidence to support the conclusion that ATP burst is the predominant driver of antibiotic lethality in mycobacteria then this paper would be highly significant. However, with the way the paper is written, it is impossible to make this conclusion.

      We have identified a new mechanism of antibiotic action in Mycobacterium smegmatis. However, as discussed extensively in the manuscript's discussion section, whether and to what extent this mechanism applies to other organisms still needs to be tested.

      We have consistently drawn our inferences from CFU counts, which are reported for all of our experiments, because OD600nm is not a reliable measure.

      Reviewer #2 (Recommendations for the authors):

      Figure 1 needs to have an x-axis that has intervals that have 10E5 CFU to 4 x 10E8. But even 4 x 10E8 CFU/ml is a late log and not exponentially growing cells.

      Figure 1 illustrates the growth curve. We assume the reviewer meant the y-axis, which represents CFU/ml on a linear scale. As mentioned in response to reviewer #1’s query no. 7, it was not feasible to include the viability (CFU/ml) values at T=0 and a few subsequent time points. Naturally, the starting cell count was not zero; we began with approximately 600,000 CFU/ml, corresponding to an OD600nm of 0.0025/ml. For clarification, we have mentioned the initial OD as well as the CFU/ml at T=0 hr in the figure legend.

      Carefully look at Figure 1, what were you trying to show? Your x-axis goes from 0 to 10E8, of course you did not inoculate 0 cells, but if you had measured CFUs, you might not have gotten the great variability you reported in your graph.

      We assume that the reviewer is suggesting that "if we had measured OD600nm/ml instead of CFU/ml, we might not have observed the high variability we reported." While we agree with the reviewer's comment, our decision to use CFU/ml for growth measurement was to obtain more resolved and detectable data points, as an OD600nm of 0.0025/ml cannot be reliably measured with a spectrophotometer. Additionally, at around T=15 hours, where we observed an extended lag phase (referred to as the stress phase), the OD600nm was approximately 0.05, which is barely detectable. Therefore, the significant differences between the control group and the ¼ x MBC99 drug-treated group might not have been observed if we had relied on OD-based measurements. Despite the presence of high error bars and variability in the data points, we were still able to demonstrate clear differences in bacterial growth between treated and untreated samples at sub-lethal drug doses. This ultimately allowed us to capture the nature of antibiotic-induced stresses.

      There is no doubt that sublethal concentrations of antibiotics will have an effect on the bacterial cells. But it is not clear how you are concluding that ATP burst is the dominant driver of lethality. M. smegmatis can be very different from Mtb.

      Using a series of time- and dose-dependent experiments with plasmid and kit-based approaches, we demonstrated that both antibiotics generate and rely on ROS and ATP bursts to induce lethality in M. smegmatis. Careful monitoring of oxidative stress in cells, following specific quenching of the antibiotic-induced ATP burst (Figure 7, S9A-B), revealed that the ATP burst is the dominant driver of antibiotic lethality. In all tested experiments, surviving bacteria exhibited elevated levels of oxidative stress but were able to maintain their viability, suggesting that oxidative stress alone is not the dominant factor in antibiotic-induced lethality. Furthermore, quenching of ROS by glutathione also suppressed the antibiotic-induced surge in ATP levels, thus supporting the notion that ROS alone is not the dominant driver of antibiotic action as previously understood.

      All experiments reported were conducted using fast-growing M. smegmatis, and we have acknowledged the need for similar experiments in other bacterial systems, including M. tuberculosis, to assess whether our findings are applicable to other systems.

      Another point, the use of a mutant in the ATP synthase is an interesting idea, but would it be better to use something where you knock out the ATP synthase activity with siRNA or a temperature-sensitive allele?

      We appreciate the reviewer’s encouraging comment. Knocking out ATP synthase would completely halt oxidative phosphorylation and shut down aerobic respiration, leading to severe metabolic and growth defects. Such stressful and non-growing conditions are not suitable for testing the efficacy of antibiotics, as it is widely accepted that antibiotics are more effective against metabolically active bacteria.

      Lastly, the conclusion is that norfloxacin and streptomycin have common mechanisms of action, but the authors do not explain how a DNA gyrase inhibitor shows the same mechanisms of action as a ribosome inhibitor.

      The connection between antibiotic target corruption (DNA gyrase or ribosome) and the activation of respiration is indeed unclear, intriguing, and represents one of the most exciting questions in the field of antibiotic mechanisms of action. In the discussion section, we have speculated on potential pathways for this connection, including the possibility that the inhibition of cell division by both drugs may create a perception of resource scarcity (energy and biosynthetic precursors), which could subsequently trigger increased metabolism, respiration, ROS production, and ATP synthesis. However, the precise mechanisms underlying this connection require further investigation and are beyond the scope of the present study.

    1. eLife Assessment

      In this important study the authors develop an elegant lung metastasis mouse model that closely mimics the events in human patients. They provide convincing evidence for the effectiveness of IL-15/12-conditioned NK cells in this design, which was also critical for the authors being able to conclusively reveal the T cell-dependency of NK-cell-mediated long-term control of experimental metastasis. Of note, an investigator-initiated clinical trial demonstrated that similar NK cell infusions in cancer patients after resections were safe and showed signs of efficacy, which is of promising clinical application value.

    2. Reviewer #1 (Public review):

      Summary:

      This is a very nice paper in which the authors addressed the potential for NK cell cellular therapy to treat and potentially eliminate previously established metastases after surgical resections, which are a major cause of death in human cancer patients. To do so they developed a model using the EO771 breast cancer cell line, in which they establish and then resect tumors and the draining lymph node, after which the majority of mice eventually succumb to metastatic disease. They found that when the initiating tumors were resected when still relatively small, adoptive transfers of IL-15/12-conditioned NK cells substantially enhanced the survival of tumor-bearing animals. They then delved into the cellular mechanisms involved. Interestingly and somewhat unexpectedly, the therapeutic effect of the transferred NK cells was dependent on the host's CD8+ T cells. Accordingly, the NK cell therapy contributed to the formation of tumor-specific CD8+ T cells, which protected the recipient animals against tumor re-challenge and were effective in protecting mice from tumor formation when transferred to naive mice. Mechanistically, they used Ifng knockout NK cells to provide evidence that IFNgamma produced by the transferred NK cells was crucial for the accumulation and activation of DCs in the metastatic lung, including expression of CD86, CD40 and MHC genes. In turn, IFNgamma production by NK cells was essential for the induced accumulation of activated CD8 effector T cells and stem cell-like CD8 T cells in the metastatic lung. The authors then expanded their findings from the mouse model to a small clinical trial. They found that inoculations of IL-15/12-conditioned autologous NK cells in patients with various malignancies after resection was safe and showed signs of efficacy.

      Strengths:

      - Monitoring of long-term metastatic disease and survival after resection, as done in this paper, is a physiological model that resembles clinical scenarios more closely than the animal models usually used, a great strength of the approach.
      - Previous literature focused on the notion that NK cells clear metastatic lesions directly, within a short period. The authors' use of a more relevant model and time frame revealed the previously unexplored T cell-dependent mechanism of action of infused NK cells for long-term control of metastatic diseases.
      - Also important, the paper provides solid evidence for the contribution of IFNgamma produced by NK cells to the activation of dendritic cells and T cells. This is an interesting finding that provokes additional questions concerning the action of interferon gamma in this context.
      - The results from the clinical trial in cancer patients, based on the same type of IL-15/12-conditioned NK cell infusions, were encouraging with respect to safety and showed signals of efficacy, which supports the translatability of the authors' findings.

      Future studies in this model could shed even more light on the mechanisms. The authors do not address whether the IL-12 in their cocktail is essential for the effects they see. Relatedly, it was of interest that despite the effectiveness of the transferred IL-15/IL-12 cultured NK cells, the cells failed to persist very long after transfer. Published studies have reported that so-called memory-like NK cells, which are pre-activated with a cocktail of IL-12, IL-18 and IL-15, persist much longer in lympho-depleted mice and patients than IL-2 cultured NK cells. It would be illuminating to compare these two types of NK cell products in the authors' model system, with and without lymphodepletion, to identify the critical parameters. If greater persistence occurred with the memory-like NK cell product, it is possible that the NK cells might provide greater benefit, including by directly targeting the tumor.

    3. Reviewer #2 (Public review):

      Summary:

      The authors show convincing data that increasing NK cell function/frequency can reduce development and progression of metastatic disease after primary tumor resection.

      Strengths:

      The inclusion of a first-in-human trial highlighting some partial responses of metastatic patients treated with in vitro expanded NK cells is tantalising. It is difficult to perform trials in preventing further metastasis since the timelines are very protracted but more data like these highlighting a role for NK cells in improving local cDC1/T cells anti-tumor immunity will encourage deeper thinking around therapeutic approaches to target endogenous NK cells to achieve the same.

      Weaknesses:

      As always, more patient data would help increase confidence around the human relevance of the approach.

      Comments on revisions:

      The authors have addressed all my queries.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Author Response

      Reviewer #1 (Public Review): 

      Weaknesses: 

      - Having demonstrated that NK cell IFNgamma is important for recruiting and activating DCs and T cells in their model, one is left to wonder whether it is important for the therapeutic effect, which was not tested. 

      We conducted a preliminary study to compare the pro-survival effect of WT NK and Ifng-/- NK cell therapies. We found that, in the 95-500 mg day-21 tumor group, the overall survival (OS) of mice receiving Ifng-/- NK cell therapy significantly decreased (p = 0.045) compared to mice receiving WT NK cell therapy up to 60 days after tumor inoculation, but there was no difference in OS beyond 65 days after tumor inoculation. Therefore, we have added the following sentences at the end of the second paragraph in our Discussion (Page 32):

      “However, although Ifng-/- NK cells induced less cDC activation compared to WT NK cells, the levels of CD86 on cDCs of mice that received Ifng-/- NK cells were higher than those of mice not subjected to NK cell transfer (Figure 4B). This outcome indicates the presence of IFN-g-independent or/and compensatory mechanism(s) for cDC activation by the transferred NK cells, which is in line with our preliminary result that Ifng-/- NK cell therapy does not significantly diminish the pro-survival effect in comparison to WT NK cell therapy beyond 60 days after tumor cell inoculation (data not shown).”

      - It was somewhat difficult to gauge the clinical trial results because the trial was early stage and therefore not controlled. Evaluation of the results therefore relies on historical comparisons. To evaluate how encouraging the results are, it would be valuable for the authors to provide some context on the prognoses and likely disease progression of these patients at the time of treatment. 

      We had already indicated in our Results that all six patients had an ECOG performance status of 0 (Page 25 and Table). We have now added in the Results that they had “a predicted survival of >3 months” (Page 25).

      Reviewer #1 (Recommendations For The Authors):

      Minor points: 

      (1) It would be helpful if the authors provided a rationale for why they derived their NK cell product from bone marrow cells instead of the more common source, spleen cells. 

      We have now added the clarification “We used BM cells instead of splenocytes for NK cell culture because removal of T cells from BM cells before culturing is not necessary” (Page 35) to the section “Ex vivo expansion of murine and human NK cells” in our Materials and Methods.

      (2) It would have been helpful to provide summary results from replicates of the cytokine production data shown in Figure 1F. 

      We have now added a graphical panel on the relative ΔMFI of two independent experiments to Figure 1F and revised the figure legend accordingly (Pages 7-8).

      (3) The role of conventional CD4+ T cells is a little unclear. The authors state in the discussion that they contribute to the antitumor response, which is consistent with their finding that depleting both CD4 T cells and CD8 T cells has a greater effect than depleting CD8 T cells. Depleting CD4 T cells alone trended towards improving the response, however. Probably Tregs are the culprit in the latter effect but a sentence or two would be helpful if the claim for a protective role for CD4 T cells is to remain.  

      We have now re-analyzed the data of Figure 3D by separating mice into two groups according to day 21 tumor weight, i.e., 95-600 mg and >600 mg (Pages 13-14). We have revised our explanation of the Figure 3D data in the Results (Pages 11-12) as follows:

      “Accordingly, we examined the role of T cells in NK cell therapy by depleting T cell subsets with anti-CD4 or/and anti-CD8 antibodies two days before primary tumor resection (Figure 3D Schema and Figure 3-figure supplement 1). In the 95-600 mg tumor group, depletion of CD8+ cells alone or both CD4+ and CD8+ cells diminished the effect of NK cell therapy, whereas depletion of CD4+ cells alone did not affect OS (Figure 3D). This result indicates that CD8+ T cells are essential for the effect of NK cell therapy. In contrast, the >600 mg tumor group displayed a limited NK-cell treatment effect as expected, but did exhibit improved OS upon depleting CD4+ cells alone (Figure 3D). As the proportion of lung Foxp3+CD4+ T cells in CD45+ cells positively correlated with day 21 tumor weight (data not shown), depletion of Foxp3+CD4+ T cells by anti-CD4 antibody likely has a stronger effect in augmenting the immune response for the >600 mg tumor group than the 95-600 mg tumor group. Moreover, both tumor groups showed diminished OS upon depletion of both CD4+ and CD8+ cells than was the case for depletion of CD8+ cells alone, indicating a CD8+ T cell-independent anti-tumor effect of CD4+ T cells (Figure 3D).”

      (4) The schema in Figure 3E states that mice were inoculated with either EO771 tumor cells or B16F10 tumor cells, but it appears that the data only show EO771 tumor challenges. This should be corrected. 

      Corrected according to the reviewer’s comment.

    1. eLife Assessment

      This important study presents work on the molecular mechanism driving asymmetric cell division and fate decisions during embryonic development of echinoids. The evidence supporting the claims of the authors is convincing. The work will be of interest to developmental biologists and cell biologists working in the field of self-renewal.

    2. Reviewer #1 (Public review):

      Summary:

      Previous work has shown that the evolutionarily-conserved division-orienting protein LGN/ Pins/ GPR1/2 (vertebrates/flies/nematodes) participates in division orientation across a variety of cell types, perhaps most importantly those that undergo asymmetric divisions (ACDs). Micromere formation in echinoids relies on asymmetric cell division at the 16-cell stage, and these authors previously demonstrated a role for the LGN/Pins homolog AGS (Activator of G-protein signaling) in that ACD process. Here they extend that work by investigating and exploiting the question of why echinoids but not other echinoderms form micromeres. Using an impressive combination of phylogenetics and molecular experiments, they determine that much of the difference in ACD and micromere formation in echinoids can be attributed to differences in the AGS C-terminus, in particular a GoLoco domain (GL1) that is missing in most other echinoderms. This work helps explain how AGS works and thereby enhances our understanding of a conserved player in division orientation.

    3. Reviewer #2 (Public review):

      This study from Dr. Emura and colleagues addresses the relevance of AGS3 mutations in the execution of asymmetric cell divisions promoting the formation of the micromere during sea urchin development. To this aim, the authors use quantitative imaging approaches to evaluate the localisation of AGS3 mutants truncated at the N-terminal region or at the C-terminal region, and correlate these distributions with the formation of micromeres and correct development of embryos to the pluteus stage. The authors also analyse the capacity of these mutated proteins to rescue developmental defects observed upon AGS3 depletion by morpholino antisense nucleotides (MO). Collectively these experiments revealed that the C-terminus of AGS3, coding for four GoLoco motifs binding to cortical Galphai proteins, is the molecular determinant for cortical localisation of AGS3 at the micromeres and correct pluteus development. Further genetic dissections and expression of chimeric AGS3 mutants carrying shuffled copies of the GoLoco motifs or four copies of the same motifs revealed that the position of GoLoco1 is essential for AGS3 functioning. To understand whether the AGS3-GoLoco1 evolved specifically to promote asymmetric cell divisions, the authors analyse chimeric AGS3 variants in which they replaced the sea urchin GoLoco region with orthologs from other echinoderms that do not form micromeres, or from Drosophila Pins or human LGN. These analyses corroborate the notion that the GoLoco1 position is crucial for asymmetric AGS3 functions. In the last part of the manuscript, the authors explore whether SpAGS3 interacts with the molecular machinery described to promote asymmetric cell division in eukaryotes, including Insc, NuMA, Par3 and Galphai, and show that all these proteins colocalize at the nascent micromere, together with the fate determinant Vasa.
Collectively this evidence highlighted how evolutionarily selected AGS3 modifications are essential to sustain asymmetric divisions and specific developmental programs associated with them.

      The manuscript addresses an interesting question and uses elegant genetic approaches associated with imaging analyses to elucidate the molecular mechanisms whereby AGS3 and spindle orientation proteins promote asymmetric divisions and specific developmental programs. This considered, it might be worth clarifying a few aspects of the reported findings.

      (1) In some experimental settings, the presence of AGS3 mutants exacerbates the effect of AGS3 depletion by MO (Fig. 4F). Can the authors speculate on what the molecular explanation might be?

      (2) Imaging analyses of Figure 4B-C suggest that the mutant AGS1111 does not localise at the vegetal cortex while AGS2222 does (Fig. 4C). However, these mutants induce similar developmental defects (Fig. 4F). What could be the reason?

      (3) Figure 7 shows the crosstalk between AGS3 and other asymmetry players including NuMA. Vertebrate and Drosophila NuMA are ubiquitously present in tissues and localise to the spindle poles in mitosis. However, in Figure 7A and 7E NuMA seems to be expressed only in a subset of sea urchin embryonic cells. Is this the case?

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Previous work has shown that the evolutionarily-conserved division-orienting protein LGN/Pins (vertebrates/flies) participates in division orientation across a variety of cell types, perhaps most importantly those that undergo asymmetric divisions. Micromere formation in echinoids relies on asymmetric cell division at the 16-cell stage, and these authors previously demonstrated a role for the LGN/Pins homolog AGS in that ACD process. Here they extend that work by investigating and exploiting the question of why echinoids but not other echinoderms form micromeres. Starting with a phylogenetics approach, they determine that much of the difference in ACD and micromere formation in echinoids can be attributed to differences in the AGS C-terminus, in particular a GoLoco domain (GL1) that is missing in most other echinoderms.

      Thank you for the summary.

      Strengths: 

      There is a lot to like about this paper. It represents a superlative match of the problem with the model system and the findings it reports are a valuable addition to the literature. It is also an impressively thorough study; the authors should be commended for using a combination of experimental approaches (and consequently generating a mountain of data). 

      Thank you.

      Weaknesses: 

      There is an intriguing finding described in Figure 1. AGS in sea cucumbers looks identical to AGS in the pencil urchin, at least at the C terminus (including the GL1 domain). Nevertheless, there are no micromeres in sea cucumbers. Therefore another mechanism besides GL motif organization has arisen to support micromere formation. It is a consequential finding and an important consideration in interpreting the data, but I could not find any mention of it in the text. That is a missed opportunity and should be remedied, ideally not only through discussion but also experimentation. Specifically: does sea cucumber AGS (SbAGS) ever localize to the vegetal cortex in sea cucumbers? Can it do so in echinoids? Will that support micromere formation? 

      Thank you for pointing this out. 

      To respond to the Reviewer’s request, we synthesized sea cucumber (Sb) AGS based on the sequence available in the database and tested it in sea urchin (Sp) embryos; the results are included in Fig. S3. We performed this experiment to confirm that SbAGS localizes less at the vegetal cortex than SpAGS as a proof of principle. However, we hesitate to conduct further studies using the synthetic sequence in this study. Sea cucumbers are an emerging yet understudied model. This species is not readily available or established as a model system for embryology. Even for the two species (A. japonicus in Japan and P. parvimensis in the USA) that were previously used for embryonic studies, their gametes are typically available only for 1-2 months in a year. Since some echinoderm researchers are aiming to establish sea cucumbers as a model system in the near future (see 2024 review: PMID: 38368336), we hope to be able to have better access to their embryos in the future. Yet, it may require a few more years to reach that condition.

      In this revised manuscript, we explained the above details and further added the discussion described below. All of the experimental models used in this study are wild animals obtained from the ocean, raising the standard for reproducibility. However, handling wild animals could come with challenges. We hope that the reviewer understands the unique benefits and challenges of this study.

      Discussion:

      Previous studies (PMIDs: 17726110; 21855794) suggest that GL1 is not involved in intramolecular interaction with TPR domains. This allows GL1 to interact independently with Gαi for cortical recruitment yet without influencing other GLs for AGS activation. To ensure GL1's independence, GL1 is typically located distantly from other GLs in Pins (flies), LGN (humans), and AGS (sea urchins). Based on this prior knowledge, we speculate three scenarios for sea cucumber (Sb) AGS not being able to localize or function during asymmetric cell division (ACD): 1) GL1 and GL2 are located too close to each other, compromising GL1's independence for recruitment. 2) A lack of GL4 loosens the autoinhibition state. 3) The GL1 sequence of SbAGS is quite different from that of echinoids’ AGS (Figure S2), compromising its recruiting efficacy. 

      For 1), we tested this possibility by making the SpAGS-GL1GL2 mutant that has GL1 and GL2 next to each other (Fig. 4G). This mutant indeed compromised its cortical localization and function in ACD. For 2), we showed that the lack of GL4 partially compromised ACD in SpAGS (Fig. 3F), suggesting that GL4 supports ACD. For 3), the results in Figure 4 indicate that the position but not the sequence of GL1 is critical for ACD. Based on these observations, we speculate that a combination of 1) and 2) compromised SbAGS's ACD function. However, it is still possible that a significant difference in the GL1 sequence diminished its function as GL entirely. Future studies should address these remaining questions directly in the sea cucumber embryos once they are established as a model system in the near future (PMID: 38368336).

      The authors point out that AGS-PmGL demonstrates enrichment at the vegetal cortex (arrow in 5G, quantifications in 5H), unlike PmAGS. AGS-PmGL does not however support ACD. They interpret this result to indicate "that other elements of SpAGS outside of its C-terminus can drive its vegetal cortical localization but not function." This is a critical finding and deserves more attention. Put succinctly: Vegetal cortical localization of AGS is insufficient to promote ACD, even in echinoids. Why should this be?  

      Thank you for the suggestion. We revised our wording to be more succinct. Of note, as we noted in the text, AGS-PmGL has only two GL domains, which will likely not provide the full force to control ACD and result in insufficient ACD function.

      The authors did perform experiments to address this problem, hypothesizing that the difference might be explained by the linker region, which includes a conserved phosphorylation site that mediates binding to Dlg. They write "To test if this serine is essential for SpAGS localization, we mutated it to alanine (AGS-S389A in Fig. S3A). Compared to the Full AGS control, the mutant AGS-S389A showed reduced vegetal cortical localization (Fig. S3B-C) and function (Fig. S3D-E). Furthermore, we replaced the linker region of PmAGS with that of SpAGS (PmAGSSpLinker in Fig. S4A-B). However, this mutant did not show any cortical localization nor proper function in ACD (Fig. S4C-F). Therefore, the SpAGS C-terminus is the primary element that drives ACD, while the linker region serves as the secondary element to help cortical localization of AGS." 

      The experiments performed only make sense if the AGS-PmGL chimeric protein used in Figure 5 starts the PmGL sequence only after the Sp linker, or at least after the Sp phosphorylation site. I can't tell from the paper (Figure S3 indicates that it does, whereas S5 suggests otherwise), but it's a critical piece of information for the argument. 

      Thank you for the pointer, and we apologize for the confusion. AGS-PmGL contains the SpAGS linker domain. To clarify this point, we added the amino acid position at the junction of each chimeric construct diagram in Figs. 5 and S4. To clarify, Figure S5 is about the GL domain mutations (not about the Linker).  

      Another piece of missing information is whether the PmAGS can be phosphorylated at its own conserved phosphorylation site. The authors don't test this, which they could at least try using a phosphosite prediction algorithm, but they do show that the candidate phosphorylation site has a slightly different sequence in Pm than in Et and Sp (Fig. S4A). With impressive rigor, the authors go on to mutate the PmAGS phosphorylation site to make it identical to Sp. Nothing happens. Vegetal cortical localization does not increase over AGS-PmGL alone. Micromere formation is unrescued. 

      There is therefore a logic problem in the text, or at least in the way the text is written. The paragraph begins "Additionally, AGS-PmGL unexpectedly showed cortical localization (Figure 5G), while PmAGS showed no cortical localization (Figure 5B)." We want to understand why this is true, but the explanation provided in the remainder of the paragraph doesn't match the question: according to quite a bit of their own data, the phosphorylation site in the linker does not explain the difference. It might explain why AGS-PmGL fails to promote micromere formation, but only if the AGS-PmGL chimeric protein uses the Pm linker domain (see above).

      Thank you for the insightful suggestion. As suggested, we performed the phosphosite predictions using GPS 6.0 (PMID: 37158278) and enclosed the results in Fig. S4A (replacing the old Fig. S3A). The software predicts that SpAGS and EtAGS have an AuroraA phosphorylation site (RRRSMEN in Supplemental Figure S4A) in their linker domain, while PmAGS does not. SpAGS and EtAGS also have an additional 5-7 predicted phosphorylation sites, while PmAGS has only three sites with low scores. Therefore, the linker domain is not conserved in PmAGS.

      The PmAGS+SpLinker mutant does restore the predicted AuroraA phosphorylation site on the software, yet it does not restore the cortical localization or ACD function in the embryo. Therefore, other sites in the Linker region might also be necessary for cortical localization and ACD function of AGS. In this study, we did not perform further manipulations in the Linker domain. As the reviewer rightfully pointed out, even if we identify the Linker regions essential for AGS localization and function, it will be difficult to interpret the result unless we know what proteins interact with the Linker domain of AGS. Therefore, this is beyond the scope of the current manuscript. We discussed these remaining matters in the discussion section. 

      Another concern that is potentially related is the measurement of cortical signal. For example, in the control panel of Figure 5C, there is certainly a substantial amount of "non-cortical" signal that I believe is nuclear. I did not see a discussion of this signal or its implications. My impression of the pictures generally is that the nuclear signal and cortical signal are inversely correlated, which makes sense if they are derived from the same pool of total protein at different points of the cell cycle. If that's the case (and it might not be) I would expect some quantifications to be impacted. For example, the authors show in Figure S3B that AGS-S389A mutant does not localize to the cortex. However, this mutant shows a radically different localization pattern to the accompanying control picture (AGS), namely strong enrichment in what I assume to be the nucleus. Is the S389 mutant preventing AGS from making it to the cortex? Or are these pictures instead temporally distinct, meaning that AGS hasn't yet made it out of the nucleus? Notably, the work of Johnston et al. (Cell 2009), cited in the text, does not show or claim that the linker domain impacts Pins localization. Their model is rather that Pins is anchored at the cortex by Gαi, not Dlg, and that is the same model described in this manuscript.

      In agreement with that model and the results of Johnston et al., a later study (Neville et al. EMBO Reports 2023) failed to find a role for Dlg or the conserved phosphorylation site in Pins localization. 

      In the sea urchin embryo, the dye or GFP often appears in the nucleus randomly on top of the cytoplasm (for example, see Fig. S2b of PMID: 35444184). Further, embryos tend to incorporate exogenous genomic fragments more efficiently during early embryogenesis (PMID: 3165895). It is proposed that early embryos may have a loosened or incomplete nuclear envelope compared to adult cells as they divide rapidly (every 40 minutes). Therefore, any excess protein with no specific localization signal may randomly appear in the nucleus as it serves as an available space in the cell. As the Reviewer rightfully pointed out, we consider that the nuclear AGS signal is due to the lack of a specific destination since this signal pattern is not consistent across embryos. In contrast, the proteins that have nuclear localization (e.g., transcription factors) usually show a consistent nuclear signal across cells and embryos with less cytoplasmic signal. To avoid confusion, we replaced the S389A image in Fig. S3B (which is now Fig. S4C) as well as any other images that may create similar confusion.

      Reviewer #2 (Public Review): 

      This study from Dr. Emura and colleagues addresses the relevance of AGS3 mutations in the execution of asymmetric cell divisions promoting the formation of the micromere during sea urchin development. To this aim, the authors use quantitative imaging approaches to evaluate the localisation of AGS3 mutants truncated at the N-terminal region or at the C-terminal region, and correlate these distributions with the formation of micromere and correct development of embryos to the pluteus stage. The authors also analyse the capacity of these mutated proteins to rescue developmental defects observed upon AGS3 depletion by morpholino antisense nucleotides (MO). Collectively these experiments revealed that the C-terminus of AGS3, coding for four GoLoco motifs binding to cortical Galphai proteins, is the molecular determinant for cortical localisation of AGS3 at the micromeres and correct pluteus development. Further genetic dissections and expression of chimeric AGS3 mutants carrying shuffled copies of the GoLoco motifs or four copies of the same motifs revealed that the position of GoLoco1 is essential for AGS3 functioning. To understand whether the AGS3-GoLoco1 evolved specifically to promote asymmetric cell divisions, the authors analyse chimeric AGS3 variants in which they replaced the sea urchin GoLoco region with orthologs from other echinoids that do not form micromeres, or from Drosophila Pins or human LGN. These analyses corroborate the notion that the GoLoco1 position is crucial for asymmetric AGS3 functions. In the last part of the manuscript, the authors explore whether SpAGS3 interacts with the molecular machinery described to promote asymmetric cell division in eukaryotes, including Insc, NuMA, Par3, and Galphai, and show that all these proteins colocalize at the nascent micromere, together with the fate determinant Vasa.
Collectively this evidence highlighted how evolutionarily selected AGS3 modifications are essential to sustain asymmetric divisions and specific developmental programs associated with them. 

      Thank you for the useful summary.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      The quantifications of "vegetal cortical localization" are somewhat incomplete. As measured, "vegetal cortical localization" does not demonstrate particular enrichment at the vegetal cortex, only that some signal appears there. In other words, we can't tell for sure that there is any more signal at the vegetal cortex than anywhere else along the cortex, and in fact that's plainly true and even described for the AGS1111 and AGS2222 constructs. One solution would be to measure signal strength around the cell perimeter and see where it is strongest.

      As suggested by the Reviewer, we added new measurements, focusing on and comparing the signals on the animal versus vegetal cortices (Figs. 2C, 3D, 4C, 5C & H, 9D & F, S3D, S4D & I).

      A related issue is that the strength of cortical enrichment is indicated in this paper by the ratio of cortical to "non-cortical" signal, but "non-cortical" is not defined. Does it include the nuclear signal? 

      As described above, we replaced all measurements using the above animal vs. vegetal cortices to avoid confusion. The nuclear signal is thus not measured in these analyses.
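      The animal-versus-vegetal comparison described above can be illustrated with a minimal sketch. It assumes a one-dimensional intensity profile already sampled at evenly spaced points around the closed cell perimeter, with the animal pole taken to lie opposite the vegetal pole; the function name, the `window` parameter, and the sampling scheme are illustrative assumptions, not the authors' actual measurement procedure.

```python
from statistics import mean


def vegetal_to_animal_ratio(profile, vegetal_index, window):
    """Ratio of mean cortical intensity near the vegetal pole to that near the animal pole.

    profile: intensities sampled at evenly spaced points around the closed perimeter
             (hypothetical input; how the profile is extracted is not specified here).
    vegetal_index: index of the sample at the vegetal pole.
    window: number of samples included on each side of each pole.
    """
    n = len(profile)
    if n == 0 or 2 * (2 * window + 1) > n:
        raise ValueError("profile is empty or the two windows would overlap")

    # Assumption: the animal pole sits half a perimeter away from the vegetal pole.
    animal_index = (vegetal_index + n // 2) % n

    def window_mean(center):
        # Modular indexing wraps around the closed perimeter.
        return mean(profile[(center + k) % n] for k in range(-window, window + 1))

    animal = window_mean(animal_index)
    if animal == 0:
        raise ValueError("animal cortical signal is zero; ratio is undefined")
    return window_mean(vegetal_index) / animal
```

      A ratio near 1 would indicate uniform cortical signal, while a ratio well above 1 would indicate vegetal enrichment; nuclear signal does not enter this calculation because only perimeter samples are used.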

      I'm enthusiastic about the results in Figure 7, but I can't really see them very well. Could you please consider changing the color scheme? For single-color figures, it would be helpful to view them as black on white rather than (for example) blue on black. That change is easily achieved with Fiji. 

      We revised the Figure as suggested.

      Page 3 Results section: "At the time of ACD, Insc recruits Pins/LGN to the cortex through Gαi": I understand this sentence to mean that Gαi is an intermediary protein that Insc uses to recruit Pins/LGN. I think the point should be made more clear. As shown in Figure 1, Insc binds to Pins/LGN directly and interacts with cortical polarity proteins directly. Recruitment therefore doesn't appear to require Gαi, but stable association with the membrane (a subsequent step) probably does. That model is shown and described in Figure 6A.

      Thank you for the pointer. We clarified our explanations as suggested.

      Reviewer #2 (Recommendations For The Authors): 

      The manuscript addresses an interesting question, and uses elegant genetic approaches associated with imaging analyses to elucidate the molecular mechanisms whereby AGS3 and spindle orientation proteins promote asymmetric divisions and specific developmental programs. This considered, it might be worth clarifying a few aspects of the reported findings. 

      (1) In some experimental settings, the presence of AGS3 mutants exacerbates the AGS3 deletion by MO (Figure 4F). Can the author speculate on what can be the molecular explanation? 

      Thank you for pointing this out. We speculate that AGS1111 and AGS2222 are unable to keep the auto-inhibited forms since they lack GL3 and GL4 as modeled in Figure 6. AGS-MO reduces the endogenous AGS, which compromises the vegetal polarity. In this embryo, constitutively active AGS likely further randomizes the polarity, as evidenced by AGS-OE results in Fig. S7, resulting in an even worse outcome. We elaborated on this part in the text.

      (2) Imaging analyses of Figure 4B-C suggest that the mutant AGS1111 does not localise at the vegetal cortex while AGS2222 does (Fig. 4C). However these mutants induce similar developmental defects (Figure 4F). What could be the reason? 

      We apologize for the confusion in Fig. 4C. The majority of embryos from both AGS1111 and 2222 groups failed to form micromeres and showed AGS localization across the cortex. Among the dozens we examined, 0 embryos from 1111 and 8 embryos from 2222 developed micromeres. Those 8 embryos still showed vegetal cortical localization, so the proportion appears high in Fig. 4B, yet it reflects the minority in the group. In contrast, development was scored for all embryos (including those that failed to form micromeres), so the graph demonstrates the majority of embryos. To avoid this confusion, we replaced the old Fig. 4C with a new graph that analyzes the cortical signal levels at the vegetal versus animal cortices.

      (3) Figure 7 shows the crosstalk between AGS3 and other asymmetry players including NuMA. Vertebrate and Drosophila NuMA are ubiquitously present in tissues and localise to the spindle poles in mitosis. However, in Figures 7A and 7E NuMA seems expressed only in a subset of sea urchin embryonic cells. Is this the case? 

      As the Reviewer rightfully pointed out, sea urchin NuMA is also present in all cells and localizes to the spindle (please see Fig. 2 of our previous paper PMID: 31439829). AGS is also slightly localized on the spindles of all cells. However, the PLA signal of AGS and NuMA mostly showed up in the vegetal cortex in this study, suggesting that major crosstalk may occur in the vegetal cortex. This does not rule out the possibility that minor interactions may also occur on the spindle or elsewhere in the cell, which was not quantifiable in this study. We clarified this point in the text.

    1. eLife Assessment

      This manuscript offers valuable insights by identifying two distinct liver cancer subtypes through multi-omics integration and developing a robust prognostic model, validated across various datasets, including single-cell RNA sequencing. The evidence is solid, with comprehensive validation in both internal and independent cohorts; however, the reliance on computational methods highlights the necessity for further experimental validation to fully confirm the mechanistic insights.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to classify hepatocellular carcinoma (HCC) patients into distinct subtypes using a comprehensive multi-omics approach. They employed an innovative consensus clustering method that integrates multiple omics data types, including mRNA, lncRNA, miRNA, DNA methylation, and somatic mutations. The study further sought to validate these subtypes by developing prognostic models using machine learning algorithms and extending the findings through single-cell RNA sequencing (scRNA-seq) to explore the cellular mechanisms driving subtype-specific prognostic differences.

      Strengths:

      (1) Comprehensive Data Integration: The study's integration of various omics data provides a well-rounded view of the molecular characteristics underlying HCC. This multi-omics approach is a significant strength, as it allows for more accurate and detailed classification of cancer subtypes.

      (2) Innovative Methodology: The use of a consensus clustering approach that combines results from 10 different clustering algorithms is a notable methodological advancement. This approach reduces the bias that can result from relying on a single clustering method, enhancing the robustness of the findings.

      (3) Machine Learning-Based Prognostic Modeling: The authors rigorously apply a wide array of machine learning algorithms to develop and validate prognostic models, testing 101 different algorithm combinations. This comprehensive approach underscores the study's commitment to identifying the most predictive models, which is a considerable strength.

      (4) Validation Across Multiple Cohorts: The external validation of findings in independent cohorts is a critical strength, as it increases the generalizability and reliability of the results. This step is essential for demonstrating the clinical relevance of the proposed subtypes and prognostic models.
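      The consensus idea mentioned in strength (2) can be sketched generically. The example below is a co-association (evidence accumulation) scheme, one common way to combine several clusterings; it is an illustrative assumption and not necessarily the procedure the manuscript uses. Each input labeling stands for the output of one clustering algorithm, and the names `co_association`, `consensus_clusters`, and `threshold` are hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of labelings in which each pair of samples shares a cluster."""
    n = len(labelings[0])
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("all labelings must cover the same samples")
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(labelings) for pair, c in counts.items()}


def consensus_clusters(labelings, threshold=0.5):
    """Group samples that co-cluster in more than `threshold` of the labelings.

    Groups are the connected components of the thresholded co-association graph,
    found with a small union-find structure.
    """
    n = len(labelings[0])
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), fraction in co_association(labelings).items():
        if fraction > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

      Because cluster labels are compared only within each labeling, the scheme is insensitive to how individual algorithms number their clusters, which is what lets results from different methods be pooled.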

      Weaknesses:

      (1) Inconsistent Storyline: Despite the extensive data mining and rigorous methodologies, the manuscript suffers from a lack of a coherent and consistent narrative. The transition between different sections, particularly from multi-omics data integration to single-cell validation, feels disjointed. A clearer articulation of how each analysis ties into the overall research question would improve the manuscript.

      (2) Questionable Relevance of Immune Cell Activity Analysis: The evaluation of immune cell activities within the cancer cell model raises concerns about its meaningfulness. The methods used to assess immune function in the tumor microenvironment may not be fully appropriate, potentially limiting the insights gained from this part of the study.

      (3) Incomplete Single-Cell RNA-Seq Validation: The validation of the findings using single-cell RNA-seq data appears insufficient to fully support the study's claims. While the authors make an effort to extend their findings to the single-cell level, the analysis lacks depth. A more comprehensive validation is necessary to substantiate the robustness of the identified subtypes.

      (4) Figures and Visualizations: Several figures in the manuscript are missing necessary information, which affects the clarity of the results. For instance, the pathways in Figure 3A could be clustered to enhance interpretability, the blue bar in Figure 4A is unexplained, and Figure 4B is not discussed in the text. Additionally, the figure legend in Figure 7C lacks detail, and many figure descriptions merely repeat the captions without providing deeper insights.

      (5) Appraisal of the Study's Aims and Results: The authors have set out to achieve an ambitious goal of classifying HCC patients into distinct prognostic subtypes and validating these findings through both bulk and single-cell analyses. While the methodologies employed are innovative and the data integration comprehensive, the study falls short of fully achieving its aims due to inconsistencies in the narrative and incomplete validation. The results partially support the conclusions, but the lack of coherence and depth in certain areas limits the overall impact of the study.

      (6) Impact on the Field: If the identified weaknesses are addressed, this study has the potential to significantly impact the field of HCC research. The multi-omics approach combined with machine learning is a powerful framework that could set a new standard for cancer subtype classification. However, the current state of the manuscript leaves some uncertainty regarding the practical applicability of the findings, particularly in clinical settings.

      (7) Additional Context: For readers and researchers, this study offers a valuable look into the potential of integrating multi-omics data with machine learning to improve cancer classification and prognostication. However, readers should be aware of the noted weaknesses, particularly the need for more consistent narrative development and comprehensive validation of the methods. Addressing these issues could greatly enhance the study's utility and relevance to the community.

    3. Reviewer #2 (Public review):

      Summary:

      Overall, this is a well-executed and insightful study. With some refinement to the presentation and a deeper exploration of the implications, the manuscript will make a significant contribution to the field of cancer genomics and personalized medicine.

      Strengths:

      The manuscript integrates multi-omics data with machine learning to address the significant heterogeneity of hepatocellular carcinoma (HCC). The use of multiple clustering algorithms and a consensus method strengthens the robustness of the findings. The study successfully develops a prognostic model with excellent predictive accuracy, validated across independent datasets. This adds considerable value to the field, particularly in providing individualized treatment strategies. The identification of two distinct liver cancer subtypes with different biological and metabolic characteristics is well-supported by the data, offering a promising direction for personalized medicine.

      Weaknesses:

      (1) Consider streamlining the presentation of methods, especially regarding the clustering algorithms and machine learning models. Readers may find it difficult to follow the exact process unless more clearly outlined.

      (2) Some figures, such as the signaling pathways and heatmaps, are critical to understanding the study's findings. Ensure that all figures are high quality, easy to interpret, and adequately labeled. You may also want to highlight the key findings within the figure captions more explicitly.

      (3) While the manuscript does compare its prognostic model to those previously published, the novelty of the findings could be emphasized more clearly. Discussing the potential limitations of the study (e.g., the reliance on computational models and small sample sizes for scRNA-seq) could strengthen the manuscript.

      (4) The manuscript mentions that the data was split into training and validation datasets in a 1:1 ratio. How was the performance verified? Is there an independent test set?

      (5) The role of the MIF signaling pathway in subtype differentiation is intriguing, but further mechanistic insights into how this pathway drives the differences between CS1 and CS2 could be discussed in more detail. If experimental evidence for this pathway exists in the literature, it should be mentioned.

      (6) Some sentences are quite long and complex, which can affect readability. Breaking them down into shorter, clearer sentences would improve the flow.
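      The question in point (4) about an independent test set can be made concrete with a small sketch. It assumes the cohort is identified by a list of sample IDs and shows one way to reserve a held-out test set before making the 1:1 training/validation split; the function name, the `test_fraction` default, and the seeding are illustrative assumptions rather than a description of what the authors did.

```python
import random


def three_way_split(sample_ids, test_fraction=0.2, seed=0):
    """Hold out a test set first, then split the remainder 1:1 into training and validation.

    Returns (train, validation, test) as three disjoint lists of sample IDs.
    """
    if not 0 <= test_fraction < 1:
        raise ValueError("test_fraction must be in [0, 1)")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(sample_ids)
    rng.shuffle(ids)

    n_test = int(len(ids) * test_fraction)
    test, rest = ids[:n_test], ids[n_test:]

    half = len(rest) // 2
    return rest[:half], rest[half:], test
```

      Model selection would then use only the training and validation parts, with the test part evaluated once at the end so that the reported performance is not influenced by the selection process.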

    4. Author response:

      Reviewer #1 (Recommendations for the authors):

      (1) Storyline and Narrative Flow:

      Consider revising the manuscript to create a more coherent and consistent narrative. Clarify how each section of the study, particularly the transition from multi-omics data integration to single-cell RNA-seq validation, contributes to the overall research question. This will help readers better understand the logical flow of the study.

      In the upcoming revisions, we will optimize the logical connections between sections of the manuscript to clarify the role each part plays in the overall research question, making it easier for readers to follow.

      (2) Immune Cell Activity Analysis:

      Reevaluate the methods used to assess immune cell activities within the context of the tumor microenvironment. Consider providing additional justification for the relevance of using the cancer cell model for this analysis. If necessary, explore alternative methods or models that might offer more meaningful insights into immune-tumor interactions.

      We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

      (3) Single-Cell RNA-Seq Validation:

      Expand the validation of your findings using single-cell RNA-seq data. This could include more in-depth analyses that explore the heterogeneity within the subtypes and confirm the robustness of your classification method at the single-cell level. This would strengthen the support for your claims about the relevance of the identified subtypes.

      In the current study, we have applied the obtained multi-omics profiling features to single-cell sequencing data to classify malignant cells. We analyzed the metabolic and cell communication differences between different subtypes of malignant cells and explored potential reasons for these differences. Next, we plan to conduct further analysis of the differences between malignant cell subtypes to identify additional clues and mechanisms underlying these variations.

      (4) Methodological Justification:

      Provide a more detailed rationale for the selection of machine learning algorithms and integration strategies used in the study. Explain why the chosen methods are particularly well-suited for this research, and discuss any potential limitations they might have.

      In the revised manuscript, we will include descriptions of the principles of these analytical methods, as well as examples of their application in other studies, to discuss the rationale and limitations of applying these methods in this research.

      (5) Figures and Visualizations:

      Improve the clarity of your figures by addressing the following:

      a) Figure 3A: Cluster the pathways to make the comparisons clearer and more meaningful.

      b) Figure 4A: Clearly explain the significance of the blue bar.

      c) Figure 4B: Ensure this figure is discussed in the main text to justify its inclusion.

      d) Figure 7C: Enhance the figure legend to provide more informative details.

      Additionally, ensure that figure descriptions go beyond the captions and provide detailed explanations that help the reader understand the significance of each figure.

      We fully agree with the reviewer’s suggestions regarding these figures, and we will make the necessary revisions in the revised manuscript.

      (6) Supplementary Materials:

      Consider including more detailed supplementary materials that provide additional validation data, extended methodological descriptions, and any other information that would support the robustness of your findings.

      When we submit the revised manuscript, we will include supplementary materials, such as figures or tables, that make the manuscript more complete.

      (7) Recent Literature:

      a) Incorporate more recent studies in your discussion, especially those related to HCC subtypes and the application of machine learning in oncology. This will provide a more current context for your work and help position your findings within the broader field.

      We appreciate the reviewer's suggestion. We will incorporate more recent studies into the discussion section and optimize its content.

      (8) Data and Code Availability:

      Ensure that all data, code, and materials used in your study are made available in line with eLife's policies. Provide clear links to repositories where readers can access the data and code used in your analyses.

      We have indicated the sources of the data and tools used in the analysis process within the text, and these data and tools can be accessed through the websites or literature we have cited.

      Reviewer #2 (Recommendations for the authors):

      (1) While the computational findings are robust, further experimental validation of the two subtypes, particularly the role of the MIF signaling pathway, would strengthen the biological relevance of the findings. In vitro or in vivo validation could confirm the proposed mechanisms and their influence on patient prognosis.

      We fully recognize the importance of using tumor models to analyze and validate immune activity results, and we are considering experimental research in this area in future projects.

      (2) Consider testing the model on additional independent cohorts beyond the TCGA and ICGC datasets to further demonstrate its generalizability and applicability across different patient populations.

      We are considering looking for independent external datasets in the GEO database or other databases to validate our model.

      (3) Review the manuscript for long or complex sentences, which can be broken down into shorter, more readable parts.

      In the revised manuscript, we will address any grammatical issues present in the manuscript and modify long and complex sentences that may hinder reader comprehension.

    1. eLife Assessment

      The paper illustrates a valuable approach to generating TCR transgenic mice specific for known epitopes. Solid evidence validates the described pipeline for identification of TCRs from single-cell datasets for the generation of TCR transgenic mice, while obviating the need for generation of T-cell lines and hybridomas.

    2. Reviewer #1 (Public review):

      Summary:

      Debeuf et al. introduce a new, fast method for the selection of suitable T cell clones to generate TCR transgenic mice, a method claimed to outperform traditional hybridoma-based approaches. Clone selection is based on the assessment of the expansion and phenotype of cells specific for a known epitope following immune stimulation. The analysis is facilitated by a new software tool for TCR repertoire and function analysis termed DALI. This work also introduces a potentially invaluable TCR transgenic mouse line specific for SARS-CoV-2.

      Strengths:

      The newly introduced method proved successful in the quick generation of a TCR transgenic mouse line. Clone selection is based on more comprehensive phenotypical information than traditional methods, providing the opportunity for a more rational T-cell clone selection.

      The study provides a software tool for TCR repertoire analysis and its linkage with function.

      The findings entail general practical implications in the preclinical study of a potentially very broad range of infectious diseases or vaccination.

      A novel SARS-CoV-2 spike-specific TCR transgenic mouse line was generated.

      Weaknesses:

      The authors present a novel method to develop TCR transgenic mice and overcome the limitations of the more traditional method based on hybridomas.

      The authors indicate that they did not intend to directly compare their new method with the traditional hybridoma-based approach. However, such comparison becomes inevitable when the classical method is presented as suboptimal and an alternative approach is introduced to address its limitations. Nevertheless, the explanations provided in their rebuttal have helped clarify their position. The intention behind supplementary figure 1 is to illustrate that a clone that appears suitable using traditional assays may fail to produce a successful TCR transgenic line. This is a valid point that I think should be emphasized more clearly in the manuscript, as it highlights the limitations of the traditional method.

      However, the main question that remains is whether the proposed new method will reliably resolve this issue. As previously noted, only one mouse line was generated (successfully) from a single candidate, and the method presented to generate their new TCR transgenic line starts from a more advanced point (a well characterized epitope is already known, and tetramers are available to preselect specific clones). Although this approach likely increases the chances of success, it also limits applicability.

      The authors suggest that tetramers are not absolutely necessary to select a clone of interest. Testing this hypothesis would have added value to this manuscript, demonstrating the ability to rapidly generate new TCR transgenic lines in response to emerging pathogens, as outlined in the introduction. They propose that, in such cases, mice could be immunised and expanded clones retested for reactivity. However, it is unclear how this strategy differs from the classic method in increasing the chances of selecting an optimal clone.

      Regarding the practical value and cost-effectiveness of extensive expression profiling for T cell clone selection, it remains unclear how well a clone chosen for specific traits will retain these features when developed into a TCR transgenic line, or what traits are ideal for different applications. T cell fate is plastic, and various parameters could influence marker expression.

      Issues remain concerning the statistical analysis. Data are said to have been analyzed using both parametric and non-parametric tests. The described approach of performing a normality test followed by either parametric or non-parametric tests is not a correct method for statistical data analysis.

    3. Reviewer #2 (Public review):

      Summary:

      The authors seek to use single-cell sequencing approaches to identify TCRs specific for the SARS CoV2 spike protein, select a candidate TCR for cloning and use it to construct a TCR transgenic mouse. The argument is that this process is less cumbersome than the classical approach, which involves the identification of antigen-reactive T cells in vitro and the construction of T cell hybridomas prior to TCR cloning. TCRs identified by single-cell sequencing that are already paired to transcriptomic data would more rapidly identify TCRs that are likely to contribute to a functional response. The authors successfully identify TCRs that have expanded in response to SARS CoV2 spike protein immunization, bind to MHC tetramers and express genes associated with functional response. They then select a TCR for cloning and construction of a transgenic mouse in order to test the response of resulting T cells in vivo following immunization with spike protein or coronavirus infection.

      Strengths:

      (1) The study provides proof of principle for the identification and characterization of TCRs based on single-cell sequencing data.

      (2) The authors employ a recently developed software tool (DALI) that assists in linking transcriptomic data to individual clones.

      (3) The authors successfully generate a TCR transgenic animal derived from the most promising T cell clone (CORSET8) using the TCR sequencing approach.

      (4) The authors provide initial evidence that CORSET8 T cells undergo activation and proliferation in vivo in response to immunization or infection.

      (5) Procedures are well-described and readily reproducible.

      Weaknesses:

      (1) The purpose of presenting a failed attempt to generate TCR transgenic mice using a traditional TCR hybridoma method is unclear. The reasons for the failure are uncertain, and the inclusion of this data does not really provide information on the likely success rate of the hybridoma vs single cell approach for TCR identification, as only a single example is provided for either.

      (2) There is little information provided regarding the functional differentiation of the CORSET8 T cells following challenge in vivo, including expression of molecules associated with effector function, cytokine production, killing activity and formation of memory. The study would be strengthened by some evidence that CORSET8 T cells are successfully recapitulating the functional features of the endogenous immune response (beyond simply proliferating and expressing CD44). This information is important to evaluate whether the presented sequencing-based identification and selection of TCRs is likely to result in T-cell responses that replicate the criteria for selecting the TCR in the first place.

      (3) While I find the argument reasonable that the approach presented here has a lot of likely advantages over traditional approaches for generating TCR transgenic animals, the use of TCR sequencing data to identify TCRs for study in a variety of areas, including cancer immunotherapy and autoimmunity, is in broad use. While much of this work opts for alternative methods of TCR expression in primary T cells (i.e. CRISPR or retroviral approaches), the process of generating a TCR transgenic mouse from a cloned TCR is not in itself novel. It would be helpful if the authors could provide a more extensive discussion explaining the novelty of their approach for TCR identification in comparison to other more modern approaches, rather than only hybridoma generation.

      Comments on revisions:

      The authors have provided additional clarification on the comparisons between the presented method for TCR transgenic generation and the hybridoma method that is more commonly used and added additional verification of the functional response in vivo of T cells expressing the selected TCR. Overall, these additions enhance the evidence that the proposed methods are likely to identify TCRs with a strong immune activation profile and are a reasonable response to the first round of review.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Debeuf et al. introduce a new, fast method for the selection of suitable T cell clones to generate TCR transgenic mice, a method claimed to outperform traditional hybridoma-based approaches. Clone selection is based on the assessment of the expansion and phenotype of cells specific for a known epitope following immune stimulation. The analysis is facilitated by a new software tool for TCR repertoire and function analysis termed DALI. This work also introduces a potentially invaluable TCR transgenic mouse line specific for SARS-CoV-2.

      Strengths:

      The newly introduced method proved successful in the quick generation of a TCR transgenic mouse line. Clone selection is based on more comprehensive phenotypical information than traditional methods, providing the opportunity for a more rational T cell clone selection.

      The study provides a software tool for TCR repertoire analysis and its linkage with function.

      The findings entail general practical implications in the preclinical study of a potentially very broad range of infectious diseases or vaccination.

      A novel SARS-CoV-2 spike-specific TCR transgenic mouse line was generated.

      Weaknesses:

      The authors attempt to compare their novel method with a more conventional approach to developing TCR transgenic mice. In this reviewer's opinion, this comparison appears imperfect in several ways:

      (1) Work presenting the "traditional" method was inadequate to justify the selection of a suitable clone. It is therefore not surprising that it yielded negative results. More evidence would have been necessary to select clone 47 for further development of the TCR transgenic line, especially considering the significant time and investment required to create such a line.

      Based on Supplementary Figure 1A only, we understand the concern of the reviewer. However, the data presented in Supplementary Figure 1A were collected during the first rough screening of clones, where only the production of IL-2 and IFN-γ was measured as a readout for activation. Thereafter, a large selection of responsive clones was further grown and co-cultured with a dose-titration of the antigenic peptide pool. In this second co-culture, flow cytometry readouts such as CD69 expression were also included (as shown in Supplementary Figure 1B). Finally, a narrower selection of responder clones was co-cultured with the different individual peptides to unravel the specificity of the TCR of the clone. In conclusion, the clone was tested at least three times in three distinct set-ups with multiple different readouts.

      However, a good evaluation of a clone in an in vitro setting does not necessarily translate into optimal functioning of the cells in a biological context. For instance, some clones survive better in an in vitro setting than others or already have a more activated profile before stimulation.

      (2) The comparison is somewhat unfair, because the methods start at different points: while the traditional method was attempted using a pool of peptides whose immunogenicity does not appear to have been established, the new method starts by utilising tetramers to select T cells specific for a well-established epitope.

      Given the costs and time involved, only a single clone could be tested for either method, intrinsically making a proper comparison unfeasible. Even for their new method, the authors' ability to demonstrate that the selected clone is ideal is limited unless they made different clones with varying profiles to show that a particular profile was superior to others.

      In my view, there was no absolute need to compare this method with existing ones, as the proposed method holds intrinsic value.

      We acknowledge the importance of the well-established hybridoma technology and in no way intended to compare these methods head-to-head, nor do we want to question the validity of the classical methods. The reason why we also wanted to show the failed CORSET8 mouse was to highlight the parts of the TCR generating process which could be rationalized. We again want to emphasize that we do not want to compare methods in any way and recognise that we started from two different bases in terms of clone selection (peptide pool stimulation versus tetramer staining). While the tetramer staining that was employed in the generation of CORSET8 mice allowed us to enrich the samples for specific responder clones, this enrichment step is not an absolute requirement for the implementation of the presented method or for the successful generation of a TCR Tg mouse model. An alternative approach could be to use the described method to select for activated and expanded clones upon immunisation and test their reactivity in subsequent steps using peptide stimulation before selecting a receptor. In conclusion, we merely wish to present a novel roadmap for others to use for the generation of their TCR Tg mouse to aid in the selection of the most preferable clone for their purposes.

      (3) While having more data to decide on clone selection is certainly beneficial, given the additional cost, it remains unclear whether knowing the expression profiles of different proteins in Figure 2 aids in selecting a candidate. Is a cell expressing more CD69 preferable to a cell expressing less of this marker? Would either have been effective? Are there any transcriptional differences between clonotype 1 and 2 (red colour in Figure 2G) that justify selecting clone 1, or was the decision to select the latter merely based on their different frequency? If all major clones (i.e. by clonotype count) present similar expression profiles, would it have been necessary to know much more about their expression profiles? Would TCR sequencing and an enumeration of clones have sufficed, and been a more cost-effective approach?

      The method we present in the paper serves as a proof-of-concept, to be adapted to the researcher’s own needs. We agree with the reviewer that for our intentions with the CORSET8 mice, TCRseq in combination with an enumeration of the clones could also have sufficed and would lower the cost of sequencing. However, we wish to present a roadmap for others to use for the generation of their TCR Tg mouse. Importantly, the cellular phenotype and activation state can be taken into consideration, which might be essential for some projects.

      Nonetheless, we do see clear interclonal differences regarding the expression of “activation” genes, where clone 1 is clearly one of the well-activated and interferon-producing clones (as shown in Author response image 1). As such, researchers could expand these types of analysis to probe for specific phenotypes or characteristics.

      Author response image 1.
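
      The clone enumeration mentioned in the response to point (3) can be illustrated with a minimal sketch. It is not the authors' pipeline or the DALI tool: the cell barcodes, the clonotype labels and the function name `rank_clonotypes` are hypothetical, and a clonotype is assumed here to be a precomputed label per cell.

```python
from collections import Counter

# Hypothetical (cell barcode, clonotype label) pairs; illustrative values only.
cells = [
    ("cell_01", "clonotype1"),
    ("cell_02", "clonotype1"),
    ("cell_03", "clonotype2"),
    ("cell_04", "clonotype1"),
    ("cell_05", "clonotype2"),
    ("cell_06", "clonotype3"),
]


def rank_clonotypes(cells):
    """Count cells per clonotype and return (clonotype, count, frequency),
    ordered from the most to the least expanded clonotype."""
    counts = Counter(clonotype for _, clonotype in cells)
    total = sum(counts.values())
    return [(clonotype, n, n / total) for clonotype, n in counts.most_common()]


ranked = rank_clonotypes(cells)
for clonotype, n, freq in ranked:
    print(f"{clonotype}\t{n}\t{freq:.2f}")
```

      Ranking by count alone captures expansion but not phenotype, which is the extra layer of information the authors argue the transcriptomic data provide.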

      (4) Lastly, it appears that several of the experiments presented were conducted only once. This information should have been explicitly stated in the figure legends.

      To control for interexperimental variation, every experiment represented in the manuscript has been performed at least two times. We have added the additional information regarding the experimental repetitions and groups in the figure legends.

      Reviewer #2 (Public Review):

      Summary:

      The authors seek to use single-cell sequencing approaches to identify TCRs specific for the SARS CoV2 spike protein, select a candidate TCR for cloning, and use it to construct a TCR transgenic mouse. The argument is that this process is less cumbersome than the classical approach, which involves the identification of antigen-reactive T cells in vitro and the construction of T cell hybridomas prior to TCR cloning. TCRs identified by single-cell sequencing that are already paired to transcriptomic data would more rapidly identify TCRs that are likely to contribute to a functional response. The authors successfully identify TCRs that have expanded in response to SARS CoV2 spike protein immunization, bind to MHC tetramers, and express genes associated with functional response. They then select a TCR for cloning and construction of a transgenic mouse in order to test the response of resulting T cells in vivo following immunization with spike protein or coronavirus infection.

      Strengths:

      (1) The study provides proof of principle for the identification and characterization of TCRs based on single-cell sequencing data.

      (2) The authors employ a recently developed software tool (DALI) that assists in linking transcriptomic data to individual clones.

      (3) The authors successfully generate a TCR transgenic animal derived from the most promising T cell clone (CORSET8) using the TCR sequencing approach.

      (4) The authors provide initial evidence that CORSET8 T cells undergo activation and proliferation in vivo in response to immunization or infection.

      (5) Procedures are well-described and readily reproducible.

      Weaknesses:

      (1) The purpose of presenting a failed attempt to generate TCR transgenic mice using a traditional TCR hybridoma method is unclear. The reasons for the failure are uncertain, and the inclusion of this data does not really provide information on the likely success rate of the hybridoma vs single cell approach for TCR identification, as only a single example is provided for either.

      We refer to comments 2 and 3 of reviewer 1 for an answer to this point.

      (2) There is little information provided regarding the functional differentiation of the CORSET8 T cells following challenge in vivo, including expression of molecules associated with effector function, cytokine production, killing activity, and formation of memory. The study would be strengthened by some evidence that CORSET8 T cells are successfully recapitulating the functional features of the endogenous immune response (beyond simply proliferating and expressing CD44). This information is important to evaluate whether the presented sequencing-based identification and selection of TCRs is likely to result in T-cell responses that replicate the criteria for selecting the TCR in the first place.

      We agree with the reviewer that the data in the initial manuscript included only a limited in vivo functional validation of the CORSET8 T cells. Therefore, we extended these in vivo readouts and measured IFN-γ production, CD69 and T-bet expression (as measures of activation) and Ki-67 expression (as an alternative readout to CTV for proliferation). In the single-cell data, we saw that these markers were more pronounced in the selected clone compared to other clones. We could confirm these findings in vivo, and found a stronger induction of IFN-γ, CD69, T-bet and Ki-67 in CORSET8 T cells compared to endogenous CD45.2 cells and even Spike-Tetramer+ CD45.2 endogenous cells. We added these data in Figure 4.

      (3) While I find the argument reasonable that the approach presented here has a lot of likely advantages over traditional approaches for generating TCR transgenic animals, the use of TCR sequencing data to identify TCRs for study in a variety of areas, including cancer immunotherapy and autoimmunity, is in broad use. While much of this work opts for alternative methods of TCR expression in primary T cells (i.e. CRISPR or retroviral approaches), the process of generating a TCR transgenic mouse from a cloned TCR is not in itself novel. It would be helpful if the authors could provide a more extensive discussion explaining the novelty of their approach for TCR identification in comparison to other more modern approaches, rather than only hybridoma generation.

      By integrating the recent technological advances in single cell sequencing into the generation of TCR Tg mice, possibilities arise to rationalize clone selection regarding clonal size, lineage/phenotype and functional characteristics. Often, the selection process based on hybridoma selection yields multiple epitope-specific clones that upregulate CD69 or IL-2, and only minimal functional and phenotypic parameters are checked before prioritizing one clone to proceed with. In our experience, clones selected in this way sometimes yield TCR transgenic T cells that are unable to compete with endogenous polyclonal T cell clones in vivo. Taking all these caveats into account, the novelty we present here is that the researcher is fully able to select clones based on several layers of information without the need for extensive or repeated screening. Moreover, the selection of the TCR Tg clone can be done via the interactive and easily interpretable DALI tool. Owing to the browser-based interactive GUI, immunologists with limited coding experience can effectively analyse their complex datasets.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Regarding Supplementary Figure 1A, was the experiment conducted more than once? Clone 47 seems minimally superior to the other clones. Incorporating a positive control, such as the response of the OT-I hybridoma to SIINFEKL, could have provided a benchmark to gauge the strength of the observed responses.

      Also, what was the concentration of the peptide used to restimulate the T cells in vitro? High peptide concentrations can lead to non-specific responses. Ideally, a titration should have been performed, perhaps in a subsequent experiment that only tested those clones that responded well initially. Given the resources required to create and maintain a transgenic mouse line, proceeding with the chosen clone based on the data presented seems to carry considerable risk.

      The experiment has been performed three times. The data presented in Supplementary Figure 1A were collected during the first rough screening of clones, where only the production of IL-2 and IFN-γ was measured as a readout for activation. Thereafter, a large selection of responsive clones was further grown and co-cultured with a dose-titration of the antigenic peptide pool. In this second co-culture, flow cytometry readouts such as CD69 expression were also included (as shown in Supplementary Figure 1B). Finally, a narrower selection of responder clones was co-cultured with the different individual peptides to unravel the specificity of the TCR of the clone. In conclusion, the clone was tested at least three times in three distinct set-ups with multiple different readouts.

      In Supplementary Figure 1C, no response to stimulation was detected. Ideally, this figure should have included a positive control, such as PMA/Ionomycin or aCD3/CD28 stimulation.

      We agree with the reviewer that this experiment should have included a positive control to validate the non-specific responsiveness of the clone and the technical feasibility of the experiment. Unfortunately, the initial CORSET8 line is frozen and is thus not easily available to repeat the experiment.

      Can the authors clarify their gating strategy in the legend of Supplementary Figure 1D?

      Plotted cells are non-debris > single cells > viable cells > CD45+. We have added the information to the legend of Supplementary Figure 1D.
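
      The gating hierarchy stated above can be read as a chain of filters, each applied only to the events that passed the previous gate. The sketch below is illustrative and is not the authors' analysis: the per-event boolean flags and the function name `apply_gates` are hypothetical, and real gates are drawn on scatter and fluorescence plots rather than read from precomputed flags.

```python
# Hypothetical per-event flags standing in for gate membership.
events = [
    {"debris": True,  "singlet": True,  "viable": True,  "cd45_pos": True},
    {"debris": False, "singlet": False, "viable": True,  "cd45_pos": True},
    {"debris": False, "singlet": True,  "viable": False, "cd45_pos": True},
    {"debris": False, "singlet": True,  "viable": True,  "cd45_pos": False},
    {"debris": False, "singlet": True,  "viable": True,  "cd45_pos": True},
]

# Gates in the order given in the response: non-debris > single cells > viable cells > CD45+.
GATES = [
    ("non-debris", lambda e: not e["debris"]),
    ("single cells", lambda e: e["singlet"]),
    ("viable cells", lambda e: e["viable"]),
    ("CD45+", lambda e: e["cd45_pos"]),
]


def apply_gates(events, gates):
    """Apply gates sequentially; return surviving events and the count after each gate."""
    remaining = list(events)
    counts = []
    for name, keep in gates:
        remaining = [e for e in remaining if keep(e)]
        counts.append((name, len(remaining)))
    return remaining, counts


gated, counts = apply_gates(events, GATES)
for name, n in counts:
    print(f"{name}: {n} events")
```

      Reporting the event count after each gate, as done here, makes it easy to see where cells are lost along the hierarchy.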

      In Figure 2, the figure legend should provide more detail on which cells were sorted for the single-cell RNA sequencing analysis. The materials and methods section explains that cells were stained for CD44. Were activated cells then sorted (either tetramer-positive or -negative), plus naïve CD8 T cells from a naïve mouse?

      Supplementary Figure 2 contains the detailed gating strategy during the sort for the single cell experiment. We have added additional red gates to the plots to clarify which samples were sent for sequencing. This has been adapted in the figure legends of both Figure 2 and Supplementary Figure 2. 

      In Figure 3, Rag1 sufficient transgenic mice display similar numbers of CD4 and CD8 T cells as WT mice in the spleen. Typically, transgenic mice present skewed frequencies of T cells towards the type generated (CD8 in this case), which the authors only found in the thymus of CORSET8 mice. Could this be discussed?

      The comment of the reviewer is valid, as there is indeed a skewing towards CD8 T cells in the thymi of the CORSET8 mice. We looked back into the data of the experiments and noticed that poor resolution of some markers might have resulted in improper results. We have repeated the experiment and added another T cell marker (TCRbeta) next to the already included CD3e marker. By including both markers, we were able to show that the skewing towards the CD8 T cell phenotype is also present in the spleen.

      How many repetitions were performed for the experiments in Figures 3D and 3E? How many mice were analyzed for Figure 3E? Please provide this information in the figure legend. Also, include a proper quantification and statistical analysis of the data shown.

      New quantification graphs with statistical analysis have been added to Figure 3E. The accompanying figure legend has been adapted. The co-culture displayed in Figure 3D is a representative experiment of two repetitions.

      Figure 4C includes 3-4 mice per group. This experiment should have been replicated, and this information should be indicated in the figure legend.

      We apologise for omitting this information from the figure legend. The experiment presented in Figure 4A-C has been repeated twice, yielding results following the same trend. We were unable to pool the data as two different proliferation dyes were used in the separate experiments (CFSE and CTV). Furthermore, in the in vivo BSL3 experiments represented in Figure 4E-H, we always included the Spike/CpG group as a positive control. We have added the additional information regarding the experimental repetitions and groups in the figure legend.

    1. eLife Assessment

      This useful study examines how deletion of a major DNA repair gene in bacteria may facilitate the rise of mutations that confer resistance against a range of different antibiotics. Although the phenotypic evidence is intriguing, the interpretation of the phenotypic data presented and the proposed mechanism by which these mutations are generated are incomplete, relying on untested assumptions and methodology that merits optimization. For instance, the authors cannot fully rule out the possibility that the resistance mutations are the result of selection. Nevertheless, this work could be of interest to microbiologists studying antibiotic resistance, genome integrity, and evolution, but the significance remains uncertain.

    2. Reviewer #1 (Public review):

      Summary:

      Jin et al. investigated how the bacterial DNA damage (SOS) response and its regulator protein RecA affect the development of drug resistance under short-term exposure to beta-lactam antibiotics. Canonically, the SOS response is triggered by DNA damage, which results in the induction of error-prone DNA repair mechanisms. These error-prone repair pathways can increase mutagenesis in the cell, leading to the evolution of drug resistance. Thus, inhibiting the SOS regulator RecA has been proposed as a means to delay the rise of resistance.

      In this paper, the authors deleted the recA gene from E. coli and exposed this ∆recA strain to selective levels of the beta-lactam antibiotic, ampicillin. After an 8 h treatment, they washed the antibiotic away and allowed the surviving cells to recover in regular media. They then measured the minimum inhibitory concentration (MIC) of ampicillin against these treated strains. They note that after just 8 h of treatment with ampicillin, the ∆recA strain had developed higher MICs towards ampicillin, while by contrast, wild-type cells exhibited unchanged MICs. This MIC increase was also observed in subsequent generations of bacteria, suggesting that the phenotype is driven by a genetic change.

      The authors then used whole genome sequencing (WGS) to identify mutations that accounted for the resistance phenotype. Within resistant populations, they discovered key mutations in the promoter region of the beta-lactamase gene, ampC; in the penicillin-binding protein PBP3, which is the target of ampicillin; and in the AcrB subunit of the AcrAB-TolC efflux machinery. Importantly, mutations in the efflux machinery can impact resistance towards other antibiotics, not just beta-lactams. To test this, they repeated the MIC experiments with other classes of antibiotics, including kanamycin, chloramphenicol, and rifampicin. Interestingly, they observed that the ∆recA strains pre-treated with ampicillin showed higher MICs towards all other antibiotics tested. This suggests that the mutations conferring resistance to ampicillin are also increasing resistance to other antibiotics.

      The authors then performed an impressive series of genetic, microscopy, and transcriptomic experiments to show that this increase in resistance is not driven by the SOS response, but by independent DNA repair and stress response pathways. Specifically, they show that deletion of recA reduces the bacterium's ability to process reactive oxygen species (ROS) and repair its DNA. These factors drive accumulation of mutations that can confer resistance towards different classes of antibiotics. The conclusions are reasonably well-supported by the data, but some aspects of the data and the model need to be clarified and extended.

      Strengths:

      A major strength of the paper is the detailed bacterial genetics and transcriptomics that the authors performed to elucidate the molecular pathways responsible for this increased resistance. They systematically deleted or inactivated genes involved in the SOS response in E. coli. They then subjected these mutants to the same MIC assays as described previously. Surprisingly, none of the other SOS gene deletions resulted in an increase in drug resistance, suggesting that the SOS response is not involved in this phenotype. This led the authors to focus on the localization of DNA PolI, which also participates in DNA damage repair. Using microscopy, they discovered that in the RecA deletion background, PolI co-localizes with the bacterial chromosome at much lower rates than wild-type. This led the authors to conclude that deletion of RecA hinders PolI and DNA repair. Although the authors do not provide a mechanism, this observation is nonetheless valuable for the field and can stimulate further investigations in the future.

      In order to understand how RecA deletion affects cellular physiology, the authors performed RNA-seq on ampicillin-treated strains. Crucially, they discovered that in the RecA deletion strain, genes associated with antioxidative activity (cysJ, cysI, cysH, sodA, sufD) and base excision repair (mutH, mutY, mutM), which repairs oxidized forms of guanine, were all downregulated. The authors conclude that down-regulation of these genes might result in elevated levels of reactive oxygen species in the cells, which in turn, might drive the rise of resistance. Experimentally, they further demonstrated that treating the ∆recA strain with an antioxidant GSH prevents the rise of MICs. These observations will be useful for more detailed mechanistic follow-ups in the future.

      Weaknesses:

      Throughout the paper, the authors use language suggesting that ampicillin treatment of the ∆recA strain induces higher levels of mutagenesis inside the cells, leading to the rapid rise of resistance mutations. However, as the authors note, the mutants enriched by ampicillin selection can play a role in efflux and can thus change a bacterium's sensitivity to a wide range of antibiotics, in what is known as cross-resistance. The current data are not clear on whether the elevated "mutagenesis" is driven by ampicillin selection or by a bona fide increase in mutation rate.

      Furthermore, on a technical level, the authors employed WGS to identify resistance mutations in the ampicillin-treated wild-type and ∆recA strains. However, the WGS methodology described in the paper is inconsistent. Notably, wild-type WGS samples were picked from non-selective plates, while ΔrecA WGS isolates were picked from selective plates with 50 μg/mL ampicillin. Such an approach biases the frequency and identity of the mutations seen in the WGS and cannot be used to support the idea that ampicillin treatment induces higher levels of mutagenesis.

      Finally, it is important to establish what the basal mutation rates of both the WT and ∆recA strains are. Currently, only the ampicillin-treated populations were reported. It is possible that the ∆recA strain has inherently higher mutagenesis than WT, with a larger subpopulation of resistant clones. Thus, ampicillin treatment might not in fact induce higher mutagenesis in ∆recA.

      Comments on revisions:

      Thank you for responding to the concerns raised previously. The manuscript overall has improved.

    3. Reviewer #2 (Public review):

      Summary:

      This study aims to demonstrate that E. coli can acquire rapid antibiotic resistance mutations in the absence of a DNA damage response. The authors employed a modified Adaptive Laboratory Evolution (ALE) workflow to investigate this, initiating the process by diluting an overnight culture 50-fold into an ampicillin selection medium. They present evidence that a recA- strain develops ampicillin resistance mutations more rapidly than the wild-type, as indicated by the Minimum Inhibitory Concentration (MIC) and mutation frequency. Whole-genome sequencing of recA- colonies resistant to ampicillin showed predominant inactivation of genes involved in the multi-drug efflux pump system, contrasting with wild-type mutations that seem to activate the chromosomal ampC cryptic promoter. Further analysis of mutants, including a lexA3 mutant incapable of inducing the SOS response, led the authors to conclude that the rapid evolution of antibiotic resistance occurs via an SOS-independent mechanism in the absence of recA. RNA sequencing suggests that antioxidative response genes drive the rapid evolution of antibiotic resistance in the recA- strain. They assert that rapid evolution is facilitated by compromised DNA repair, transcriptional repression of antioxidative stress genes, and excessive ROS accumulation.

      Strengths:

      The experiments are well-executed and the data appear reliable. It is evident that the inactivation of recA promotes faster evolutionary responses, although the exact mechanisms driving this acceleration remain elusive and deserve further investigation.

      Weaknesses:

      Some conclusions are overstated. For instance, the conclusion regarding the LexA3 allele, indicating that rapid evolution occurs in an SOS-independent manner (line 217), contradicts the introductory statement that attributes evolution to compromised DNA repair. The claim made in the discussion of Figure 3 that the hindrance of DNA repair in recA- is crucial for rapid evolution is at best suggestive, not demonstrative. Additionally, the interpretation of the PolI data implies its role, yet it remains speculative. In Figure 2A table, mutations in amp promoters are leading to amino acid changes! The authors' assertion that ampicillin significantly influences persistence pathways in the wild-type strain, affecting quorum sensing, flagellar assembly, biofilm formation, and bacterial chemotaxis, lacks empirical validation. Figure 1G suggests that recA cells treated with ampicillin exhibit a strong mutator phenotype; however, it remains unclear if this can be linked to the mutations identified in Figure 2's sequencing analysis.

    4. Reviewer #3 (Public review):

      Summary:

      In the present work, Zhang et al investigate the involvement of the bacterial DNA damage repair SOS response in the evolution of beta-lactam drug resistance in Escherichia coli. Using a combination of microbiological, bacterial genetics, laboratory evolution, next-generation sequencing, and live-cell imaging approaches, the authors propose that short-term (transient) drug resistance evolution can take place in RecA-deficient cells in an SOS response-independent manner. They propose the evolvability of drug resistance is alternatively driven by the oxidative stress imposed by accumulation of reactive oxygen species and compromised DNA repair. Overall, this is a nice study that addresses a growing and fundamental global health challenge (antimicrobial resistance).

      Strengths:

      The authors introduce new concepts to antimicrobial resistance evolution mechanisms. They show short-term exposure to beta-lactams can induce durably fixed antimicrobial resistance mutations. They propose this is due to compromised DNA repair and oxidative stress. Antibiotic resistance evolution under transient stress is poorly studied, so the authors' work is a nice mechanistic contribution to this field.

      Weaknesses:

      The authors do not show any direct evidence of altered mutation rate or accumulated DNA damage in their model.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Review #1:

      Summary:

      Jin et al. investigated how the bacterial DNA damage (SOS) response and its regulator protein RecA affect the development of drug resistance under short-term exposure to beta-lactam antibiotics. Canonically, the SOS response is triggered by DNA damage, which results in the induction of error-prone DNA repair mechanisms. These error-prone repair pathways can increase mutagenesis in the cell, leading to the evolution of drug resistance. Thus, inhibiting the SOS regulator RecA has been proposed as a means to delay the rise of resistance. 

      In this paper, the authors deleted the RecA protein from E. coli and exposed this ∆recA strain to selective levels of the beta-lactam antibiotic, ampicillin. After an 8-hour treatment, they washed the antibiotic away and allowed the surviving cells to recover in regular media. They then measured the minimum inhibitory concentration (MIC) of ampicillin against these treated strains. They note that after just an 8-hour treatment with ampicillin, the ∆recA strain had developed higher MICs towards ampicillin, while by contrast, wild-type cells exhibited unchanged MICs. This MIC increase was also observed in subsequent generations of bacteria, suggesting that the phenotype is driven by a genetic change.

      The authors then used whole genome sequencing (WGS) to identify mutations that accounted for the resistance phenotype. Within resistant populations, they discovered key mutations in the promoter region of the beta-lactamase gene, ampC; in the penicillin-binding protein PBP3 which is the target of ampicillin; and in the AcrB subunit of the AcrAB-TolC efflux machinery. Importantly, mutations in the efflux machinery can impact the resistance towards other antibiotics, not just beta-lactams. To test this, they repeated the MIC experiments with other classes of antibiotics, including kanamycin, chloramphenicol, and rifampicin. Interestingly, they observed that the ∆recA strains pre-treated with ampicillin showed higher MICs towards all other antibiotics tested. This suggests that the mutations conferring resistance to ampicillin are also increasing resistance to other antibiotics.

      The authors then performed an impressive series of genetic, microscopy, and transcriptomic experiments to show that this increase in resistance is not driven by the SOS response, but by independent DNA repair and stress response pathways. Specifically, they show that deletion of the recA reduces the bacterium's ability to process reactive oxygen species (ROS) and repair its DNA. These factors drive the accumulation of mutations that can confer resistance to different classes of antibiotics. The conclusions are reasonably well-supported by the data, but some aspects of the data and the model need to be clarified and extended.

      We sincerely appreciate your overall summary of the manuscript and your positive evaluation of our work.

      Strengths:

      A major strength of the paper is the detailed bacterial genetics and transcriptomics that the authors performed to elucidate the molecular pathways responsible for this increased resistance. They systematically deleted or inactivated genes involved in the SOS response in E. coli. They then subjected these mutants to the same MIC assays as described previously. Surprisingly, none of the other SOS gene deletions resulted in an increase in drug resistance, suggesting that the SOS response is not involved in this phenotype. This led the authors to focus on the localization of DNA PolI, which also participates in DNA damage repair. Using microscopy, they discovered that in the RecA deletion background, PolI co-localizes with the bacterial chromosome at much lower rates than wild-type. This led the authors to conclude that deletion of RecA hinders PolI and DNA repair. Although the authors do not provide a mechanism, this observation is nonetheless valuable for the field and can stimulate further investigations in the future.

      In order to understand how RecA deletion affects cellular physiology, the authors performed RNA-seq on ampicillin-treated strains. Crucially, they discovered that in the RecA deletion strain, genes associated with antioxidative activity (cysJ, cysI, cysH, sodA, sufD) and base excision repair (mutH, mutY, mutM), which repairs oxidized forms of guanine, were all downregulated. The authors conclude that down-regulation of these genes might result in elevated levels of reactive oxygen species in the cells, which in turn, might drive the rise of resistance. Experimentally, they further demonstrated that treating the ∆recA strain with an antioxidant GSH prevents the rise of MICs. These observations will be useful for more detailed mechanistic follow-ups in the future.

      We are grateful to you for your positive assessment of the strengths of our manuscript and your recognition of its potential future applications.

      Weaknesses:

      Throughout the paper, the authors use language suggesting that ampicillin treatment of the ∆recA strain induces higher levels of mutagenesis inside the cells, leading to the rapid rise of resistance mutations. However, as the authors note, the mutants enriched by ampicillin selection can play a role in efflux and can thus change a bacterium's sensitivity to a wide range of antibiotics, in what is known as cross-resistance. The current data are not clear on whether the elevated "mutagenesis" is driven by ampicillin selection or by a bona fide increase in mutation rate.

      We greatly appreciate your raising this issue, as it is an important premise that must be clearly stated throughout the entire manuscript. To verify that the observed increase in mutation rate is a bona fide increase and not due to experimental error, we used a non-selective antibiotic, rifampicin, to evaluate the mutation frequency after drug induction, as this is a gold-standard method documented in other studies [Heterogeneity in efflux pump expression predisposes antibiotic-resistant cells to mutation, Science, 362, 6415, 686-690, 2018.]. In the absence of ampicillin treatment, the natural mutation rates detected using rifampicin were consistent between the wild-type and the ΔrecA strain. However, after ampicillin treatment, the mutation rate detected using rifampicin was significantly elevated only in the ΔrecA strain (Fig. 1G). We also employed other antibiotics, such as ciprofloxacin and chloramphenicol, in our experiments to treat the cells (data not shown). However, we observed that beta-lactam antibiotics specifically induced the emergence of resistance or altered the MIC in our bacterial populations. If resistance had pre-existed before antibiotic exposure or a bona fide increase in mutation rate, we would expect other antibiotics to exhibit a similar selective effect, particularly given the potential for cross-resistance to multiple antibiotics.
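
      For readers unfamiliar with the rifampicin assay referred to above, the calculation can be sketched as follows. This is a minimal illustration and not the authors' analysis code: it assumes mutation frequency is taken as rifampicin-resistant CFU divided by total viable CFU, and the function names and plate counts are hypothetical.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a plate count to CFU/mL of the undiluted culture."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def mutation_frequency(resistant_cfu_per_ml, total_cfu_per_ml):
    """Fraction of viable cells that grow on rifampicin plates."""
    if total_cfu_per_ml <= 0:
        raise ValueError("total viable count must be positive")
    return resistant_cfu_per_ml / total_cfu_per_ml


# Hypothetical counts, for illustration only (not data from the manuscript).
total = cfu_per_ml(colonies=120, dilution_factor=1e6, plated_volume_ml=0.1)
resistant = cfu_per_ml(colonies=30, dilution_factor=1, plated_volume_ml=0.1)
freq = mutation_frequency(resistant, total)
print(f"mutation frequency = {freq:.2e}")
```

      Because this estimate counts every rifampicin-resistant colony, it does not by itself separate rpoB mutations from cross-resistant clones, which is the concern Reviewer #1 raises.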

      Furthermore, on a technical level, the authors employed WGS to identify resistance mutations in the ampicillin-treated wild-type and ∆recA strains. However, the WGS methodology described in the paper is inconsistent. Notably, wild-type WGS samples were picked from non-selective plates, while ΔrecA WGS isolates were picked from selective plates with 50 μg/mL ampicillin. Such an approach biases the frequency and identity of the mutations seen in the WGS and cannot be used to support the idea that ampicillin treatment induces higher levels of mutagenesis.

      We appreciate your concern regarding potential inconsistencies in the WGS methodology. However, we would like to clarify that the primary aim of the WGS experiment was to identify the types of mutations present in the wild-type and ΔrecA strains after ampicillin treatment, rather than to quantify or compare mutation frequencies. This purpose was explicitly stated in the manuscript.

      Furthermore, the choice of selective and non-selective conditions was made to ensure the successful isolation of mutants in both strains. Specifically, if selective conditions (50 μg/mL ampicillin) were applied to the wild-type strain, it would have been nearly impossible to recover colonies for WGS analysis, as wild-type cells are highly susceptible to ampicillin at this concentration (Top, Author response image 1). Conversely, under non-selective conditions, ΔrecA mutants carrying resistance mutations may not have been effectively isolated, which would have limited our ability to identify resistance mutations in these strains (Bottom, Author response image 1). Thus, the use of different selection pressures was essential for achieving the objective of mutation identification in this study.

      Author response image 1.

      After 8 hours of antibiotic treatment, the wild type or the ΔrecA cells were plated on agar plates either without ampicillin or with 50 μg/mL ampicillin and incubated for 24-48 hours. Top: Under selective conditions, no wild type colonies were recovered, indicating high susceptibility to the antibiotic, preventing further analysis. Bottom: In non-selective conditions, both ΔrecA resistant mutants and non-resistant cells grew, making it difficult to distinguish and isolate the mutants carrying resistance mutations.

      Finally, it is important to establish what the basal mutation rates of both the WT and ∆recA strains are. Currently, only the ampicillin-treated populations were reported. It is possible that the ∆recA strain has inherently higher mutagenesis than WT, with a larger subpopulation of resistant clones. Thus, ampicillin treatment might not in fact induce higher mutagenesis in ∆recA.

      Thanks for this suggestion. The basal mutation frequencies of the wild-type and the ∆recA strain have been measured using rifampicin (Fig. 1G), and there is no significant difference between them.

      Reviewer #2:

      Summary:

      This study aims to demonstrate that E. coli can acquire rapid antibiotic resistance mutations in the absence of a DNA damage response. To investigate this, the authors employed a sophisticated experimental framework based on a modified Adaptive Laboratory Evolution (ALE) workflow. This workflow involves numerous steps culminating in the measurement of antibiotic resistance. The study presents evidence that a recA- strain develops ampicillin resistance mutations more quickly than the wild-type, as shown by measuring the Minimum Inhibitory Concentration (MIC) and mutation frequency. Whole-genome sequencing of 15 recA- colonies resistant to ampicillin revealed predominantly inactivation of genes involved in the multi-drug efflux pump system, whereas, in the wild-type, mutations appear to enhance the activity of the chromosomal ampC cryptic promoter. By analyzing mutants involved in the SOS response, including a lexA3 mutant incapable of inducing the SOS response, the authors conclude that the rapid evolution of antibiotic resistance occurs in an SOS-independent manner when recA is absent.

      Furthermore, RNA sequencing (RNA-seq) of the four experimental conditions suggests that genes related to antioxidative responses drive the swift evolution of antibiotic resistance in the recA- strain.

      We greatly appreciate your overall summary of the manuscript and your positive evaluation of our work.

      Weaknesses:

      However, a potential limitation of this study is the experimental design used to determine the 'rapid' evolution of antibiotic resistance. It may introduce a significant bottleneck in selecting ampicillin-resistant mutants early on. A recA mutant could be more susceptible to ampicillin than the wild-type, and only resistant mutants might survive after 8 hours, potentially leading to their enrichment in subsequent steps. To address this concern, it would be critical to perform a survival analysis at various time points (0h, 2h, 4h, 6h, and 8h) during ampicillin treatment for both recA and wild-type strains, ensuring there is no difference in viability.

      We appreciate your suggestion. We measured the survival fraction at 0, 2, 4, 6, and 8 hours after ampicillin treatment. The results show no significant difference in antibiotic sensitivity between the wild-type and ΔrecA strains (Fig. S2). We therefore added a description in the main text: “Meanwhile, after 8 hours of treatment with 50 μg/mL ampicillin, the survival rates of both wild type and ΔrecA strain were consistent (Fig. S2)”.
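
      The survival analysis described above can be sketched as below. This is only an illustration under stated assumptions: survival fraction is taken as viable CFU/mL at each time point divided by CFU/mL at 0 h, and the function name and all counts are hypothetical rather than taken from Fig. S2.

```python
def survival_fractions(cfu_by_hour):
    """Return {hour: CFU at that hour / CFU at 0 h}, sorted by hour."""
    if 0 not in cfu_by_hour:
        raise ValueError("a 0 h baseline count is required")
    baseline = cfu_by_hour[0]
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return {hour: cfu / baseline for hour, cfu in sorted(cfu_by_hour.items())}


# Hypothetical CFU/mL values at 0, 2, 4, 6 and 8 h (illustration only).
wild_type = {0: 1.0e8, 2: 2.0e7, 4: 5.0e6, 6: 1.0e6, 8: 2.0e5}
delta_recA = {0: 1.0e8, 2: 1.8e7, 4: 4.5e6, 6: 9.0e5, 8: 2.1e5}

for name, counts in (("wild type", wild_type), ("delta recA", delta_recA)):
    for hour, fraction in survival_fractions(counts).items():
        print(f"{name}\t{hour} h\t{fraction:.2e}")
```

      Comparing the two curves at each time point is what addresses the reviewer's bottleneck concern. A statistical comparison across replicates would still be needed and is not shown here.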

      The observation that promoter mutations are absent in ΔrecA strains could be explained by previous research indicating that amplification of the AmpC genes is a mechanism for E. coli resistance to ampicillin, which does not occur in a recA-deficient background (PMID# 19474201).

      We are very grateful to you for providing this reference. We did examine the amplification of the ampC gene in both wild-type and recA-deficient strains, but we found no significant changes in its copy number after ampicillin treatment (Author response image 2). Therefore, the results and discussion regarding gene copy number were not included in this manuscript.

      Author response image 2.

      Copy number variations of genes in the chromosome before and after exposure to ampicillin at 50 µg/mL for 8 hours in the wild type and ΔrecA strain.

      The section describing Figure 3 is poorly articulated, and the conclusions drawn are apparent. The inability of a recA strain to induce the SOS response is well-documented (lines 210 and 278). The data suggest that merely blocking SOS induction is insufficient to cause 'rapid' evolution in their experimental conditions. To investigate whether SOS response can be induced independently of lexA cleavage by recA, alternative experiments, such as those using a sulA-GFP fusion, might be more informative.

      Thanks for your suggestion. We agree that detecting the expression level of SulA can provide valuable information to reveal the impact of the SOS system on rapid drug resistance. In addition to fluorescence visualization and quantification of SulA expression, regulating the transcription level of the sulA gene can achieve the same objective. Therefore, in our transcriptome sequencing analysis, we focused on evaluating the transcription level of sulA (Fig. 4E).

      In Figure 4E, the lack of increased SulA gene expression in the wild-type strain treated with ampicillin is unexpected, given that SulA is an SOS-regulated gene. The fact that polA (Pol I) is going down should be taken into account in the interpretation of Figures 2D and 2E.

      Thank you for your observation regarding the lack of increased SulA gene expression in the wild-type strain treated with ampicillin in Figure 4E. We agree that SulA is typically an SOS-regulated gene, and its expression is expected to increase in response to DNA damage induced by antibiotics like ampicillin. However, in our experimental conditions, the observed lack of increased SulA expression could be due to different factors. One possibility is that the concentration of ampicillin used, or the duration of treatment, was not sufficient to induce a strong SOS response in the wild-type strain under the specific conditions tested. Additionally, differences in experimental setups such as timing, sampling, or cellular stress responses could account for the lack of a pronounced upregulation of SulA.

      You also note that the fact that polA (Pol I) is going down should be taken into account in the interpretation of Figures 3D and 3E, and we agree with you.

      The connection between compromised DNA repair, the accumulation of Reactive Oxygen Species (ROS) based on RNA-seq data, and accelerated evolution is merely speculative at this point and not experimentally established.

      We greatly appreciate your comments. First, the correlation between DNA mutations and the accumulation of reactive oxygen species (ROS) has been experimentally confirmed. As shown in Fig. 4I, after the addition of the antioxidant GSH, DNA resistance mutations were not detected in the ΔrecA strain treated with ampicillin for 8 hours, compared to those without the addition of GSH, proving that the rapid accumulation of ROS induces the enhancement of DNA resistance mutations. Second, the enhancement of DNA resistance mutations in relation to bacterial resistance has been widely validated and is generally accepted. Finally, we appreciate your suggestion to strengthen the evidence supporting ROS enhancement. To address this, we have added an experiment to measure ROS levels. Through flow cytometry, we found that ROS levels significantly increased in both the wild-type and ΔrecA strains after 8 hours of ampicillin treatment. However, ROS levels in the ΔrecA strain showed a significant further increase compared to the wild-type strain (Fig. 4G). Additionally, with the addition of 50 mM glutathione, no significant change in ROS levels was observed in either the wild-type or ΔrecA strain before and after ampicillin treatment (Fig. 4H). This result further confirms our finding in Fig. 4I, where adding GSH inhibited the development of antibiotic resistance.

      Reviewer #3:

      Summary:

      In the present work, Zhang et al investigate the involvement of the bacterial DNA damage repair SOS response in the evolution of beta-lactam drug resistance in Escherichia coli. Using a combination of microbiological, bacterial genetics, laboratory evolution, next-generation sequencing, and live-cell imaging approaches, the authors propose that short-term drug resistance evolution can take place in RecA-deficient cells in an SOS response-independent manner. They propose the evolvability of drug resistance is alternatively driven by the oxidative stress imposed by the accumulation of reactive oxygen species and inhibition of DNA repair. Overall, this is a nice study that addresses a growing and fundamental global health challenge (antimicrobial resistance). However, although the authors perform several multi-disciplinary experiments, there are several caveats to the authors' proposal that ultimately do not fully support their interpretation that the observed antimicrobial resistance evolution phenotype is due to compromised DNA repair.

      We greatly appreciate your overall summary of the manuscript and positive evaluation of our work.

      Strengths:

      The authors introduce new concepts to antimicrobial resistance evolution mechanisms. They show short-term exposure to beta-lactams can induce durably fixed antimicrobial resistance mutations. They propose this is due to compromised DNA repair and oxidative stress. This is primarily supported by their observations that resistance evolution phenotypes only exist for recA deletion mutants and not other genes in the SOS response.

      Thanks for your positive comments.

      Weaknesses:

      The authors do not show any direct evidence (1) that these phenotypes exist in strains harboring deletions in other DNA repair genes outside of the SOS response, (2) that DNA damage is increased, (3) that reactive oxygen species accumulate, (4) that accelerated resistance evolution can be reversed by anything other than recA complementation. The authors do not directly test alternative hypotheses. The conclusions drawn are therefore premature.

      We sincerely thank you for your insightful comments. First, in this study, our primary focus is on the role of recA deficiency in bacterial antibiotic resistance evolution. Therefore, we conducted an in-depth investigation on E. coli strains lacking RecA and found that its absence promotes resistance evolution through mechanisms involving increased ROS accumulation and downregulation of DNA repair pathways. While we acknowledge the importance of other DNA repair genes outside of the SOS response, exploring them is beyond the scope of this paper. However, in a separate unpublished study, we have identified the involvement of another DNA recombination protein, whose role in resistance evolution is not yet fully elucidated, in promoting resistance development. This finding is part of another independent investigation.

      Regarding DNA damage and repair, our paper emphasizes that resistance-related mutations in DNA are central to the development of antibiotic resistance. These mutations are a manifestation of DNA damage. To demonstrate this, we measured mutation frequency and performed whole-genome sequencing, both of which confirmed an increase in DNA mutations.

      We appreciate the reviewer's suggestion to provide additional evidence for ROS accumulation, and we have now supplemented our manuscript with relevant experiments. Through flow cytometry, we found that ROS levels significantly increased in both the wild type and ΔrecA strains after 8 hours of ampicillin treatment. However, ROS levels in the ΔrecA strain showed a significant further increase compared to the wild-type strain (Fig. 4G). Additionally, with the addition of 50 mM glutathione, no significant change in ROS levels was observed in either the wild-type or ΔrecA strain before and after ampicillin treatment (Fig. 4H). This result further confirms our finding in Fig. 4I, where adding GSH inhibited the development of antibiotic resistance.

      Finally, in response to your question about reversing accelerated resistance evolution, we would like to highlight that, in addition to recA complementation, we successfully suppressed rapid resistance evolution by supplementing with an antioxidant, GSH (Fig. 4I). This further supports our hypothesis that increased ROS levels play a key role in driving accelerated resistance evolution in the absence of RecA.

      Recommendations for the authors:

      Reviewer #1:

      The authors' model asserts that deletion of recA impairs DNA repair in E. coli, leading to an accumulation of ROS in the cell, and ultimately driving the rapid rise of resistance mutations. However, the experimental evidence does not adequately address whether the resistance mutations are true, de novo mutations that arose due to beta-lactam treatment, or mutations that confer cross-resistance enriched by ampicillin selection.

      a. Major: In Figure 1F & G, the authors show that the ∆recA strain, following ampicillin treatment, has higher resistance and mutation frequency towards rifampicin than WT. However, it is not clear whether the elevated resistance and mutagenesis are driven by mutations enriched by the ampicillin treatment (e.g. mutations in acrB, as seen in Figure 2) or by "new" mutations in the rpoB gene. As the authors note, the mutants enriched by ampicillin selection can play a role in efflux and can thus change a bacterium's sensitivity to a wide range of antibiotics, including rifampicin, in what is known as cross-resistance. Therefore, the mutation frequency calculation, which relies on quantifying rifampicin-resistant clones, might be confounded by bacteria with mutations that confer cross-resistance. A better approach to calculate mutation frequency would be to employ an assay that does not require antibiotic selection, such as a lac-reversion assay. This would mitigate the confounding effects of cross-resistance of drug-resistant mutations.

      We appreciate your thoughtful comments regarding the potential for cross-resistance to confound the mutation frequency calculation based on rifampicin-resistant clones. Indeed, as noted, ampicillin selection can enrich for mutants with enhanced efflux activity, which may confer cross-resistance to a range of antibiotics, including rifampicin.

      However, we believe that the current approach of calculating mutation frequency using rifampicin-resistant mutants is still valid in our specific context. Rifampicin targets the RNA polymerase β subunit, and resistance typically arises from specific mutations in the rpoB gene. These mutations are well-characterized and distinct from those typically associated with efflux-related cross-resistance. Thus, the likelihood of cross-resistance affecting our mutation frequency calculation is minimized in this scenario.

      Additionally, while the lac-reversion assay could be an alternative, it focuses on specific metabolic pathway mutations (such as those affecting lacZ) and would not necessarily capture the same types of mutations relevant to rifampicin resistance or antibiotic-induced mutagenesis. Given our experimental objective of understanding how ampicillin induces mutations that confer antibiotic resistance, the current approach of using rifampicin selection provides a direct and relevant measurement of mutation frequency under antibiotic stress.

      b. Major: It is important to establish what the basal mutation frequencies/rates of both the WT and ∆recA strains are. Currently, only the ampicillin-treated populations were reported. It is possible that the ∆recA strain has an inherently higher mutagenesis than WT. Thus, ampicillin treatment might not in fact induce higher mutagenesis in ∆recA.

      Thanks for your suggestion. The basal mutation frequencies of the wild-type and ∆recA strains have been measured using rifampicin (Fig. 1G), and there is no significant difference between them.

      c. Major: In the text, the authors write, "To verify whether drug resistance associated DNA mutations have led to the rapid development of antibiotic resistance in recA mutant strain, we randomly selected 15 colonies on non-selected LB agar plates from the wild type surviving isolates, and antibiotic screening plates containing 50 μg/mL ampicillin from the ΔrecA resistant isolates, respectively." Why were the WT clones picked from non-selective plates and the recA mutant from selective ones for WGS? It appears that such a procedure would bias the recA mutant clones to show more mutations (caused by selection on the ampicillin plate). The authors need to address this discrepancy.

      We appreciate your concern regarding potential inconsistencies in the WGS methodology. However, we would like to clarify that the primary aim of the WGS experiment was to identify the types of mutations present in the wild-type and ΔrecA strains after ampicillin treatment, rather than to quantify or compare mutation frequencies. This purpose was explicitly stated in the manuscript.

      Furthermore, the choice of selective and non-selective conditions was made to ensure the successful isolation of mutants in both strains. Specifically, if selective conditions (50 μg/mL ampicillin) were applied to the wild type strain, it would have been nearly impossible to recover colonies for WGS analysis, as wild-type cells are highly susceptible to ampicillin at this concentration (Top, Author response image 1). Conversely, under non-selective conditions, ΔrecA mutants carrying resistance mutations may not have been effectively isolated, which would have limited our ability to identify resistance mutations in these strains (Bottom, Author response image 1). Thus, the use of different selection pressures was essential for achieving the objective of mutation identification in this study.

      d. Major: In some instances, the authors do not use accurate language to describe their data. In Figure 2A, the authors randomly selected 15 ∆recA clones from a selective plate with 50 µg/mL of ampicillin. These clones were then subjected to WGS, which subsequently identified resistant mutations. Based on the described methods, these mutations are a result of selection: in other words, resistant mutations were preexisting in the bacterial population, and the addition of ampicillin selection killed off the sensitive cells, enabling the proliferation of the resistant clones. However, in the Figure 2 legend and associated text, the authors suggest that these mutations were "induced" by beta-lactam exposure, which is misleading. The data do not support that.

      We appreciate your detailed feedback on the language used to describe our data. We understand the concern regarding the use of the term "induced" in relation to beta-lactam exposure. To clarify, we employed not only beta-lactam antibiotics but also other antibiotics, such as ciprofloxacin and chloramphenicol, in our experiments (data not shown). However, we observed that beta-lactam antibiotics specifically induced the emergence of resistance or altered the MIC in our bacterial populations. If resistance had pre-existed before antibiotic exposure, we would expect other antibiotics to exhibit a similar selective effect, particularly given the potential for cross-resistance to multiple antibiotics.

      Furthermore, we used two different ∆recA strains, and the results were consistent between the strains (Fig. S3). Given that spontaneous mutations can occur with significant variability in populations, if resistance mutations pre-existed before antibiotic exposure, the selective outcomes should have varied between the two strains.

      Most importantly, we found that the addition of the antioxidant GSH prevented the evolution of antibiotic resistance following ampicillin treatment in the ΔrecA strain. If we assume that resistant bacteria preexist in the ∆recA strain, then the addition of GSH should not affect the evolution of resistance. Therefore, we believe that the resistance mutations we detected were not simply the result of selection from preexisting mutations but were indeed induced by beta-lactam exposure.

      e. Major: For Figure 4J, using WGS the authors show that the addition of GSH to WT and ∆recA cells inhibited the rise of resistance mutations; no resistance mutations were reported. However, in the "Whole genome sequencing" section under "Materials and Methods", they state that "Resistant clones were isolated by selection using LB agar plates with the supplementation of ampicillin at 50 μg/mL". These clones were then genome-extracted and sequenced. Given the methodology, it is surprising that the WGS did not reveal any resistance mutations in the GSH-treated cells. How were these cells able to grow on 50 μg/mL ampicillin plates for isolation in the first place? The authors need to address this.

      We sincerely apologize for the confusion caused by the incorrect expression in the "Materials and Methods" section. Indeed, when bacteria were treated with the combination of antibiotics and GSH, resistance was significantly suppressed, and no resistant clones could be isolated from selective plates (i.e., LB agar supplemented with 50 μg/mL ampicillin).

      To address this, we instead plated the bacteria treated with antibiotics and GSH onto non-selective plates (without ampicillin) and randomly selected 15 colonies for WGS. None of them showed resistance mutations. We will revise the text in the "Materials and Methods" section to accurately reflect this procedure and provide clarity.

      f. Minor: For Figure 1G, it is misleading to have both "mutation frequency" and "mutant rate" in the y-axis; the two are defined and calculated differently. Based on the Materials and Methods, "mutation frequency" would be the appropriate term. Also, for the ∆recA strain, it is a bit unusual to see mutation frequencies that are tightly clustered. Usually, mutation frequencies follow the Luria-Delbruck distribution. Can the authors explain why the ∆recA data looks so different compared to, say, the WT mutation frequencies?

      Thank you for your insightful feedback. We agree that having both "mutation frequency" and "mutant rate" on the y-axis is misleading, as these terms are defined and calculated differently. To avoid confusion, we will revise Figure 1G to use only "mutation frequency" as the correct term, in line with the methods described in the Materials and Methods section.

      Regarding the ∆recA strain's mutation frequencies, we acknowledge that the data appear more tightly clustered compared to the expected Luria-Delbruck distribution seen in the wild-type strain. In fact, the y-axis of Figure 1G is logarithmic, which causes the data to appear more clustered.

      We further added the basal mutation frequencies of the wild-type and ∆recA strains before exposure to ampicillin. These were measured using rifampicin (Fig. 1G), and there is no significant difference between them.
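
      To make the distinction between the two terms in point f concrete, the following is a minimal sketch of the conventional definitions, not the authors' protocol (the Materials and Methods are not reproduced here). It assumes that mutation frequency is the ratio of resistant to total viable colony counts, and that a mutation rate would be estimated from a fluctuation assay, here via the classical Luria-Delbruck p0 method. The function names and all example counts are hypothetical.

```python
import math


def mutation_frequency(resistant_cfu: float, total_cfu: float) -> float:
    """Fraction of viable cells that form colonies on the selective plate."""
    if total_cfu <= 0:
        raise ValueError("total_cfu must be positive")
    return resistant_cfu / total_cfu


def mutation_rate_p0(cultures_without_mutants: int,
                     total_cultures: int,
                     cells_per_culture: float) -> float:
    """Mutations per cell per division, estimated with the p0 method.

    p0 is the fraction of parallel cultures that contain no mutants;
    the expected number of mutational events per culture is m = -ln(p0).
    """
    p0 = cultures_without_mutants / total_cultures
    if not 0 < p0 < 1:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    m = -math.log(p0)
    return m / cells_per_culture


# Hypothetical numbers, for illustration only.
freq = mutation_frequency(resistant_cfu=50, total_cfu=1e9)
rate = mutation_rate_p0(cultures_without_mutants=11, total_cultures=30,
                        cells_per_culture=1e9)
print(f"mutation frequency: {freq:.1e}")
print(f"mutation rate (p0): {rate:.1e}")
```

      A frequency comes from a single plating, whereas a rate requires many parallel cultures, which is why the two quantities should not share one axis label.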

      g. Minor: It needs to be made clear in the Main Text what the selective antibiotic agar plate used was, rifampicin or ampicillin. I am assuming it was rifampicin, as ampicillin plates would yield resistance frequencies close to 100%, given the prior treatment of the culture with ampicillin.

      Thanks for your comments. Depending on the objective, we used different selective plates. For example, when testing the mutation frequency of antibiotic resistance, we used a selective plate containing rifampicin in order to utilize a non-inducing antibiotic, which is the standard method for calculating resistance mutation frequency. In the WGS experiment, to obtain mutations specific to ampicillin resistance, we selected a selective plate containing ampicillin.

      Reviewer #2:

      The Y-axis label (log10 mutant rate) in Figure 1G is misleading or incorrect.

      Thanks for your comments, and we apologize for this misleading information. Figure 1G has been revised accordingly.

      In line 393 of the discussion, the authors claim that excessive ROS accumulation drives the evolution of ampicillin resistance, which has not been conclusively demonstrated. Additional experiments are needed to support this statement.

      We greatly appreciate your comments. First, the correlation between DNA mutations and the accumulation of reactive oxygen species (ROS) has been experimentally confirmed. As shown in Fig. 4I, after the addition of the antioxidant GSH, DNA resistance mutations were not detected in the ΔrecA strain treated with ampicillin for 8 hours, compared to those without the addition of GSH, proving that the rapid accumulation of ROS induces the enhancement of DNA resistance mutations. Second, the enhancement of DNA resistance mutations in relation to bacterial resistance has been widely validated and is generally accepted. Finally, we appreciate your suggestion to strengthen the evidence supporting ROS enhancement. To address this, we have added an experiment to measure ROS levels. Through flow cytometry, we found that ROS levels significantly increased in both the wild-type and ΔrecA strains after 8 hours of ampicillin treatment. However, ROS levels in the ΔrecA strain showed a significant further increase compared to the wild-type strain (Fig. 4G). Additionally, with the addition of 50 mM glutathione, no significant change in ROS levels was observed in either the wild-type or ΔrecA strain before and after ampicillin treatment (Fig. 4H). This result further confirms our finding in Fig. 4I, where adding GSH inhibited the development of antibiotic resistance.

      The abstract is overly complex and difficult to read (e.g. "Contrary to previous findings, it is shown that this accelerated resistance development process is dependent on the hindrance of DNA repair, which is completely orthogonal to the SOS response").

      Thank you for the valuable feedback regarding the complexity of the abstract. We agree that certain sections could be simplified for clarity. In response, we have revised the abstract to make it more concise and easier to understand. For example, the sentence “Contrary to previous findings, it is shown that this accelerated resistance development process is dependent on the hindrance of DNA repair, which is completely orthogonal to the SOS response” has been rewritten as: "Unlike earlier studies, we found that the rapid development of resistance relies on the hindrance of DNA repair, a mechanism that operates independently of the SOS response."

      Reviewer #3:

      As indicated above, direct evidence is needed to show (1) that these phenotypes exist in strains harboring deletions in other DNA repair genes outside of the SOS response, (2) that DNA damage is increased, (3) that reactive oxygen species accumulate, (4) that accelerated resistance evolution can be reversed by anything other than recA complementation. There are also other resistance evolution mechanisms untested here, including transcription-coupled repair (TCR) mechanisms involving Mfd. These need to be shown in order to draw the conclusions proposed.

      We sincerely thank you for your insightful comments. First, in this study, our primary focus is on the role of recA deficiency in bacterial antibiotic resistance evolution. Therefore, we conducted an in-depth investigation on E. coli strains lacking RecA and found that its absence promotes resistance evolution through mechanisms involving increased ROS accumulation and downregulation of DNA repair pathways. While we acknowledge the importance of other DNA repair genes outside of the SOS response and other resistance evolution mechanisms including the TCR mechanism, exploring them is beyond the scope of this paper. However, in a separate unpublished study, we have identified the involvement of another DNA recombination protein, whose role in resistance evolution is not yet fully elucidated, in promoting resistance development. This finding is part of another independent investigation.

      Regarding DNA damage and repair, our paper emphasizes that resistance-related mutations in DNA are central to the development of antibiotic resistance. These mutations are a manifestation of DNA damage. To demonstrate this, we measured mutation frequency and performed whole-genome sequencing, both of which confirmed an increase in DNA mutations.

      We appreciate the reviewer's suggestion to provide additional evidence for ROS accumulation, and we have now supplemented our manuscript with relevant experiments. Through flow cytometry, we found that ROS levels significantly increased in both the wild type and ΔrecA strains after 8 hours of ampicillin treatment. However, ROS levels in the ΔrecA strain showed a significant further increase compared to the wild-type strain (Fig. 4G). Additionally, with the addition of 50 mM glutathione, no significant change in ROS levels was observed in either the wild-type or ΔrecA strain before and after ampicillin treatment (Fig. 4H). This result further confirms our finding in Fig. 4I, where adding GSH inhibited the development of antibiotic resistance.

      Finally, in response to your question about reversing accelerated resistance evolution, we would like to highlight that, in addition to recA complementation, we successfully suppressed rapid resistance evolution by supplementing with an antioxidant, GSH (Fig. 4I). This further supports our hypothesis that increased ROS levels play a key role in driving accelerated resistance evolution in the absence of RecA.

    1. eLife Assessment

      The authors analyze the relationship between human mobility and genomic data of SARS-CoV-2 using mobile phone mobility data and sequence data and present a solid proof of concept. This useful work was conducted on a fine spatial scale and provides suggestions on how mobility-derived surveillance could be conducted, although these results are mixed. The primary significance of this work is the strong use of large datasets that were highly granular. The authors provide a rigorous study, but with less clear predictive power of mobility to inform transmission patterns.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript Spott et al. combine SARS-CoV-2 genomic data alongside granular mobility data to retrospectively evaluate the spread of SARS-CoV-2 alpha lineages throughout Germany and specifically Thuringia. They further prospectively identified districts with strong mobility links to the first district in which BQ.1.1 was observed to direct additional surveillance efforts to these districts. The additional surveillance effort resulted in the earlier identification of BQ.1.1 in districts with strong links to the district in which BQ.1.1 was first observed.

      Strengths:

      There are two important strengths of this work. The first is the scale and detail of the data that have been generated and analyzed as part of this study. Specifically, the authors use 6,500 SARS-CoV-2 sequences and district-level mobility data within Thuringia. I applaud the authors for making a subset of their analyses public, e.g. on the associated Microreact page.

      Further, the main focus of the article is on the potential utility of mobility-directed surveillance sequencing. While I may certainly be mistaken, I have not seen this proposed elsewhere, at least in the context of SARS-CoV-2. The authors were further able to test this concept in a real-world setting during the emergence of BQ.1.1 and compare it to the "gold standard" of random sampling. This is a unique real-world evaluation of a novel surveillance sequencing strategy and there is considerable value in publishing this analysis. Given the increased focus on optimizing sampling strategies for genomic surveillance, this work provides a novel strategy and will hopefully motivate additional modeling and real-world implementations.

      Weaknesses:

      The article is quite strong and I find the analyses to generally be rigorous. Limitations of the analysis, particularly due to the fact that BQ.1.1 remained a low-prevalence variant, are adequately addressed. The results do not provide quantitative, definitive proof that mobility-guided sampling is an optimal strategy, but the authors do not claim that they do, nor do I think they need to in order to make an important contribution to the field.

    3. Reviewer #2 (Public review):

      In the manuscript, the authors combine SARS-CoV-2 sequence data from a state in Germany and mobility data to help in understanding the movement of virus and the potential to help decide where to focus sequencing. The global expansion in sequencing capability is a key outcome of the public health response. However, there remains uncertainty how to maximise the insights the sequence data can give. Improved ability to predict the movement of emergent variants would be a useful public health outcome.

      However, I remain unconvinced that changing surveillance strategies is necessarily sensible, as it remains unclear what the ultimate benefit of variant hunting is. Decisions to adapt surveillance strategies should not be taken lightly, as there are substantial benefits to maintaining a system that is stable and as representative as possible over time. It's unclear what public health action would result from detecting a few more sequences of a variant. Once a variant has been identified (arguably anywhere in the world/region), we already have the necessary information to motivate the development of updated vaccines/monoclonals.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Thank you for your assessment and constructive critique, which helped us to improve the manuscript and its clarity. Upon carefully reading through the comments, we noticed that, based on the Reviewer's questions, some of our answers were already available but “hidden” as supplementary data. Thus, we changed the following two figures and text accordingly to showcase our results to the reader better:

      A) To highlight how mobile service data can indicate the spread of highly prevalent variants, we added a high-prevalence subcluster to Figure 2 (previously shown in Supplementary Figures S4 and S5) and, in exchange, moved one low-prevalence subcluster from Figure 2 back into the supplement. The figure now shows one low-prevalence and one high-prevalence subcluster instead of two low-prevalence subclusters.

      B) Based on Reviewer 1’s question about where samples were taken with regard to the mobility data from the community of the first identification (negative controls), we now highlight all the mobility data that was available to us in Figure 3 (as triangles) instead of just a few top mobility hits, for both mobility-guided and random surveillance (the latter serving as a negative control for the former). This way, we think, it is clearer how random sampling was also performed in some regions where mobility was coming from the community of origin (as asked by Reviewer 1). The detailed trips and sampling are now part of the supplement for data transparency reasons. We also noticed a typo in the GPS coordinates that misaligned one of the arrows; this is corrected in the improved Figure 3.

      We have also included the R-Scripts used to generate all the figures in the manuscript in an OSF repository (we updated the “Data sharing statement”). We also updated Figure 1 slightly and extended the supplemental material. The remaining comments to reviewers are addressed point-by-point below.

      Reviewer 1 (Public Review):

      In "Exploring the Spatial Distribution of Persistent SARS-CoV-2 Mutations - Leveraging mobility data for targeted sampling" Spott et al. combine SARS-CoV-2 genomic data alongside granular mobility data to retrospectively evaluate the spread of SARS-CoV-2 alpha lineages throughout Germany and specifically Thuringia. They further prospectively identified districts with strong mobility links to the first district in which BQ.1.1 was observed to direct additional surveillance efforts to these districts. The additional surveillance effort resulted in the earlier identification of BQ.1.1 in districts with strong links to the district in which BQ.1.1 was first observed.

      Thank you for taking the time to review our work.

      (1) It seems the mobility-guided increased surveillance included only districts with significant mobility links to the origin district and did not include any "control" districts (those without strong mobility links). As such, you can only conclude that increasing sampling depth increased the rate of detection for BQ.1.1., not necessarily that doing so in a mobility-guided fashion provided an additional benefit. I absolutely understand the challenges of doing this in a real-world setting and think that the work remains valuable even with this limitation, but I would like the lack of control districts to be more explicitly discussed.

      Thank you for the critical assessment of our work. We agree that a control is essential for interpreting the results. In our case, randomized surveillance (“the gold standard”) served as a control with a total sampling depth seven times higher than the mobility-guided sampling. To better reflect the sampling with regard to the available mobility data, we revisited Figure 3 and added all the mobility information from the origin that was available to us. We also added this information to the random surveillance to provide a clearer picture to the reader. This now clearly shows how randomized surveillance covered communities with varying degrees of incoming mobility from the community of first occurrence, thereby underlining its role as a negative control. We updated the manuscript to reflect these changes and included the October 2020 and June 2021 mobility datasets in Supplementary Table S6. We agree that greater sampling depth increases detection; that is precisely the point of guided sampling, which increases sampling specifically in areas where mobility points towards a possible spread. Regarding the negative control: random surveillance (not mobility-guided) in October covered 40 samples in the northwest region of Thuringia (mobility-guided sampling covered 19 samples). Thus, random surveillance also contained 31 out of 132 samples with a mobility link towards the first occurrence of BQ.1.1, but with varying amounts of mobility (low to high).

      We added this information to the main text:

      Line 270 to 293:

      Following its first Thuringian identification, we utilized the latest available dataset of the past two years of mobile service data (October 2020 and June 2021) to investigate the residential movements for the community of first detection. Considering the highest incoming mobility from both datasets, we identified 18 communities with high (> 10,000), 34 with medium (2,001-10,000), and 82 with low (30-2,000) number of incoming one-way trips from the originating community (purple triangles in Figure 3a). As a result, we specifically requested all the available samples from the eight communities with the highest incoming mobility. Still, we were restricted to the submission of third parties over whom we had no influence. This led to the inclusion of the following eight communities with the most residential movement from the originating community: four in central and three in NW of Thuringia, one in NW-neighboring state Saxony-Anhalt. The samples requested from central Thuringia were also due to their geographic arrangement as a “belt” in central Thuringia, linking three major cities (see Supplementary Figure S1). Subsequently, we collected 19 additional samples (isolated between the 17th and 25th of October 2022; see “Guided Sampling” for October 2022, Figure 3a) besides the randomized sampling strategy. Thus, the sampling depth was increased in communities with high incoming mobility from the first origin.

      As part of the general Thuringian surveillance, we collected 132 samples for October (covering dates between the 5th and 31st) and 69 samples in November (covering dates between the 1st and 25th; see Figure 3b and c). Randomized sampling was not influenced or adjusted based on the mobility-guided sample collection. Thus, it also contains samples from communities with a mobility link towards the first occurrence of BQ.1.1, as they were part of the regular random collection (see gray triangles in Figure 3b). A complete overview of all samples is provided in Supplementary Table S5. The mobility datasets from October 2020 and June 2021 for all sampled communities are provided in Supplementary Table S6.
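
      The mobility tiers quoted above can be expressed as a small classification rule. The sketch below is illustrative only: the thresholds (high > 10,000; medium 2,001-10,000; low 30-2,000 incoming one-way trips) are taken from the quoted text, while the function name, the community labels, and the decision to return None below 30 trips are assumptions made for this example.

```python
from typing import Optional


def mobility_tier(incoming_trips: int) -> Optional[str]:
    """Classify a community by incoming one-way trips from the origin community."""
    if incoming_trips > 10_000:
        return "high"
    if incoming_trips >= 2_001:
        return "medium"
    if incoming_trips >= 30:
        return "low"
    # Assumption: fewer than 30 trips falls outside the tiers named in the text.
    return None


# Hypothetical community labels; 37,499 is the trip count reported for the
# community in which the additional BQ.1.1 sample was found.
trips_by_community = {"community_A": 37_499, "community_B": 4_200, "community_C": 45}
tiers = {name: mobility_tier(trips) for name, trips in trips_by_community.items()}
print(tiers)
```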

      Line 305 to 313:

      Among the 19 samples specifically collected based on mobile service data, we identified one additional sample of the specific Omicron sublineage BQ.1.1 in a community with high incoming mobility (n = 14, number of trips = 37,499) with a distance of approximately 16 km between both towns. Our randomly sampled routine surveillance strategy did not detect another sample during the same period. This was despite a seven times higher overall sample rate, which included 31 samples from communities with an identified incoming mobility from the community of the first occurrence (October 2022, Figure 3b). Only in the one-month follow-up were four other samples identified across Thuringia through routine surveillance (November 2022, Figure 3c).
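
      The "seven times higher" figure can be checked against the sample counts given in the response. The short calculation below uses only numbers stated in the text (19 guided samples, 132 random samples in October 2022, 31 of which came from communities with a mobility link); the variable names are chosen for this example.

```python
guided_samples = 19             # mobility-guided samples, October 2022
random_samples = 132            # routine random surveillance, October 2022
random_with_mobility_link = 31  # random samples from communities linked to the origin

# Ratio of random to guided sampling depth.
depth_ratio = random_samples / guided_samples

# Share of the random samples that came from mobility-linked communities.
linked_share = random_with_mobility_link / random_samples

print(f"random / guided sampling depth: {depth_ratio:.1f}x")
print(f"random samples with a mobility link: {linked_share:.0%}")
```

      With a single additional detection, these ratios describe sampling effort only; they do not by themselves quantify a difference in detection probability.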

      Line 325 to 333:

      In summary, increasing the sampling depth in the suspected regions successfully identified the specified lineage using only a fraction of the samples from the randomized sampling. Conversely, randomized surveillance, the “gold standard” acting as our negative control, did not identify additional samples with similar sampling depths in regions with no or low incoming mobility or even in high mobility regions with less sampling depth. Implementing such an approach effectively under pandemic conditions poses difficult challenges due to the fluctuating sampling sizes. Although the finding of the sample may have been coincidental, our proof of concept demonstrated how we can leverage the potential of mobile service data for targeted surveillance sampling.

      (2) Line 313: While this work has reliably shown that the spread of Alpha was slower in Thuringia, I don't think there have been sufficient analyses to conclude that this is due to the lack of transportation hubs. My understanding is that only mobility within Thuringia has been evaluated here and not between Thuringia and other parts of Germany.

      Thank you for pointing this out. We noticed that the original sentence lacked the necessary clarity. The statement in line 313 was based on the observation that Alpha first occurred in federal states with major transport hubs, such as international airports and ports, which Thuringia lacks, as demonstrated in the Microreact dataset. For clarification, we adjusted the sentence as follows:

      Line 340 and following:

      A plausible explanation for the delayed spread of the Alpha lineage in Thuringia is the lack of major transport hubs, as Alpha first occurred in federal states with such hubs. Previous studies have already highlighted the impact of major transportation hubs in the spread of SARS-CoV-2.

      (3) Line 333 (and elsewhere): I'm not convinced, based on the results presented in Figure 2, that the authors have reliably identified a sampling bias here. This is only true if you assume (as in line 235) that the variant was in these districts, but that hasn't actually been demonstrated here. While I recognize that for high-prevalence variants, there is a strong correlation between inflow and variant prevalence, low-prevalence variants by definition spread less and may genuinely be missing from some districts. To support this conclusion that they identified a bias, I'd like to see some type of statistical model that is based e.g. on the number of sequences, prevalence of a given variant in other districts, etc. Alternatively, the language can be softened ("putative sampling bias").

      Thank you for addressing this legitimate point of criticism in our interpretation. Due to the retrospective nature of the analysis and the fact that we found no additional samples of the clusters after the specified timeframes, we were limited to the samples in our dataset. Therefore, it is impossible to demonstrate whether a variant was present in the relevant districts afterward. We agree that, given their low prevalence, these variants may genuinely not have spread to some districts. For clarification, we added the following statements and changed the wording accordingly:

      Additional statement in line 248:

      However, due to their low prevalence, it is also possible that these subclusters have not spread to the indicated districts.

      Adjusted wording in line 361:

      We exemplified this approach with the Alpha lineage, where mobile service data indicated a putative sampling bias and partially predicted the spread of our Thuringian subclusters.

      Recommendations:

      (1) I applaud the use of the microreact page to make the data public, however, I don't see any reference to a GitHub or Zenodo repository with the analysis code. The NextStrain code is certainly appreciated but there is presumably additional code used to identify the clusters, generate figures, etc. I generally prefer this code be made public and it is recommended by eLife.

      Thank you for your appreciation. We have now included the R-scripts in the manuscript’s OSF repository. These were used to create the figures in the manuscript and supplement utilizing the supplementary tables 1-6, which are also stored in the repository. To clearly communicate which data is provided, we changed lines 513 and 514 of the “Data sharing statement” as follows:

      Line 513 and following:

      Supplementary tables and the R-scripts used to generate all figures are also provided in the repository under https://osf.io/n5qj6/. These include the mobile service data used in this study, which is available in processed and anonymized form.

      The subcluster identification was performed manually. By adding each sample's mutation profile to the Microreact metadata file, we visually screened the phylogenetic time tree for all non-Alpha specific mutations present in at least 20 Thuringian genomes. We then applied the criteria described in the Methods section to identify the nine Alpha subclusters. For clarification, we changed line 436:

      Line 436:

      We then manually screened for mutations present in at least 20 genomes with a small phylogenetic distance and a time occurrence of at least two months.
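
      The authors state that this screening was done manually. Purely as an illustration of the stated criteria, the sketch below shows how the count and duration filters could be automated. It is not the authors' method: the input format, the function name, and the reading of "two months" as 60 days are assumptions, and the phylogenetic-distance criterion is deliberately left out because it requires the tree.

```python
from collections import defaultdict
from datetime import date, timedelta


def candidate_mutations(samples, min_genomes=20, min_days=60):
    """Return mutations seen in >= min_genomes genomes over >= min_days.

    samples: iterable of (sample_id, collection_date, set_of_mutations).
    """
    dates_by_mutation = defaultdict(list)
    for _sample_id, collected, mutations in samples:
        for mutation in mutations:
            dates_by_mutation[mutation].append(collected)

    selected = []
    for mutation, dates in dates_by_mutation.items():
        span_days = (max(dates) - min(dates)).days
        if len(dates) >= min_genomes and span_days >= min_days:
            selected.append(mutation)
    return sorted(selected)


# Synthetic example data; the mutation names are borrowed from the text,
# the sample identifiers and dates are invented.
start = date(2021, 1, 1)
demo = [(f"s{i}", start + timedelta(days=4 * i), {"ORF1b:A520V"}) for i in range(20)]
demo += [(f"t{i}", start + timedelta(days=i), {"S:N185D"}) for i in range(5)]
print(candidate_mutations(demo))
```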

      Reviewer 2 (Public Review):

      In the manuscript, the authors combine SARS-CoV-2 sequence data from a state in Germany and mobility data to help in understanding the movement of the virus and the potential to help decide where to focus sequencing. The global expansion in sequencing capability is a key outcome of the public health response. However, there remains uncertainty about how to maximise the insights the sequence data can give. Improved ability to predict the movement of emergent variants would be a useful public health outcome. Knowing where to focus sequencing to maximise insights is also key. The presented case study from one state in Germany is therefore a useful addition to the literature. Nevertheless, I have a few comments.

      Thank you for taking the time to review our work.

      (1) One of the key goals of the paper is to explore whether mobile phone data can help predict the spread of lineages. However, it appears unclear whether this was actually addressed in the analyses. To do this, the authors could hold out data from a period of time, and see whether they can predict where the variants end up being found.

      Based on your feedback, we noticed that the results of the other seven clusters presented in the supplement were not appropriately highlighted, causing them to be overlooked. We indeed demonstrated that predicting viral spread based on mobility data is possible, as shown for the high-prevalence subcluster 7 (Cluster “ORF1b:A520V”, 811 samples). This was briefly mentioned in lines 240-242, but the cluster was only shown in Supplementary Figures S4 and S5. Instead, we focused more on the putative sampling bias that the mobility for low-prevalence subclusters could indicate as an interesting use case of mobility data. This addresses a concrete problem of every surveillance: successfully identifying low-prevalence targets. However, based on your feedback, we revisited Figure 2, adding the plots of the high-prevalence subcluster: “ORF1b:A520V” from Supplementary Figures S4 and S5 while moving the low-prevalence subcluster “S:N185D” from Figure 2 into the Supplementary Figures S4 and S5. Additionally, we changed line 229 to highlight this result properly.

      line 229 and following:

      The mobile service data-based prediction of a subcluster’s spread aligned well with the subsequent regional coverage of fast-spreading, highly prevalent subclusters, such as subcluster 7, which covered 811 samples (see Figure 2). In contrast, the predicted spread for the low-prevalence subclusters did not correspond well with the actual occurrence.

      (2) The abstract presents the mobility-guided sampling as a success, however, the results provide a much more mixed result. Ultimately, it's unclear what having this strategy really achieved. In a quickly moving pandemic, it is unclear what hunting for extra sequences of a specific, already identified, variant really does. I'm not sure what public health action would result, especially given the variant has already been identified.

      Thank you for your critical assessment of the presented results and their interpretation.

      Here, we aimed to provide an alternative to the standard randomized surveillance strategy. Through mobility-guided sampling, we sought to increase identification chances while necessitating fewer samples and decreasing costs, ultimately enhancing surveillance efficiency. The Omicron-lineage BQ.1.1 was the perfect example to prove this concept under actual pandemic conditions. Yet, the strategy is not limited to low-prevalence sublineages but can be applied to virtually any surveillance case. However, from your question, we recognize that this conclusion was unclear from the text. Therefore, we adapted the conclusion to better communicate the real implications of our proof of concept. Additionally, we altered line 42 in the abstract for clarification.

      However, we did not assess the benefits of surveillance itself, as the German Robert Koch Institute (RKI) had already outlined its importance for tracking different viral variants. This tracking served several purposes, such as monitoring vaccine escape and mutational progress, and assessing available antibodies for treatment.

      Line 42:

      The latter concept was successfully implemented as a proof-of-concept for a mobility-guided sampling strategy in response to the surveillance of Omicron sublineage BQ.1.1.

      Line 364 to 374:

      Another approach is actively guiding the sampling process through mobile service data, which we demonstrated with our proof of principle focusing on the Omicron-lineage BQ.1.1 as a real-life example. This approach could allow for a flexible allocation of surveillance resources, enabling adaptation to specific circumstances and increasing sampling depth in regions where a variant is anticipated. By incorporating guided sampling, far fewer resources may be needed for unguided or random sampling, thereby reducing overall surveillance costs.

      Additionally, while this approach is particularly useful for identifying low-prevalence variants, it is not limited to such variants. Still, it can provide a guided, more cost-efficient, low-sampling alternative to general randomized surveillance that can also be applied to other viruses or lineages.

      (3) Relatedly, it is unclear to me whether simply relying on spatial distance would not be an alternative simpler approach than mobile phone data. From Figure 2, it seems clear that a simple proximity matrix would work well at reconstructing viral flow. The authors could compare the correlation of spatial, spatial proximity, and CDR data.

      Thank you for pointing this out. While proximity data might appear to be an obvious choice, it has significant limitations compared to mobility data, especially in the context of our study. Proximity data assumes that spatial distance alone can accurately represent movement patterns, which would only be true in a normally distributed traffic network. Geographic features such as mountains, cities, and highways affect traffic flows, leading to variability over distance and time, which is beyond the scope of spatial proximity but efficiently captured by mobility data. In Figure 2, we presented a simplified view of the mobility data. Hence, proximity and mobility data appear to provide the same insights. However, as shown in the updated Figure 3, a detailed overview of the available mobility data reveals obvious and non-obvious spatial connections that proximity data cannot capture. Incorporating such a level of detail in Figure 2 would have cluttered the figure and reduced its clarity (e.g., adding triangles for each Thuringian community).

      While a comparison between proximity data and mobility data would indeed be informative, it is beyond the scope of our current study, as our primary focus was to examine the usability of mobility data in explaining our subclusters' spread in the first place. However, we agree it would be a valuable direction for future research. We summarized our thoughts from above in the following additional sentence:

      Line 374:

      Pre-generated mobility networks automatically tailored to each state's unique infrastructure and population dynamics could provide better-targeted sampling guidance rather than simple geographical proximity.

      Recommendations:

      (1) Line 128: What do these percentages mean - the proportion of States with at least one Alpha variant? Please clarify.

      We clarified the values at their first appearance in the text:

      Line 127:

      By March, Alpha had spread to nearly all states and districts (districts are similar to counties or provinces) in Germany (Median: 76·47 % Alpha samples among a federal state's total sequenced samples compared to 36·03 % in February, excluding Thuringia) and Thuringia (Median: 85·29 %, up from 50·00 % in February).

      (2) Line 134: It's a little strange to compare the dynamics of a state with that of the whole country. Did it lag as compared to all other States?

      Line 134: “In summary, the spread of the Alpha lineage in Thuringia lagged roughly two weeks behind the general spread in the rest of Germany but showed similar proportions.”

      Thank you for the feedback. The statement refers to the comparison of Alpha-lineage proportions across federal states, excluding Thuringia, in lines 118 to 130. To simplify, we collectively referred to these federal states as “Germany” in the text. However, we recognize that this formulation is misleading, so we adjusted line 135 for clarification:

      Line 135:

      In summary, the spread of the Alpha lineage in Thuringia lagged roughly two weeks behind the general spread of other German federal states but showed similar proportions.

    1. eLife Assessment

      In their important manuscript, Costa et al. establish an in vitro model for dorsal root ganglion (DRG) axonal asymmetry, revealing that central and peripheral axon branches have distinct patterns of microtubule populations that are linked to their differential regenerative capacities. The authors employ creative tissue culture methods to demonstrate how these branches develop uniquely in vitro, offering a potential explanation for long-observed regeneration disparities. The evidence provides a solid contribution to our understanding of the neuronal cytoskeleton and axonal regeneration, but the paper would benefit from additional methodology details and controls.

    2. Reviewer #1 (Public review):

      Summary:

      This paper describes a new in vitro model for DRG neurons that recapitulates several key differences between the peripheral and central branches of DRG axons in vivo. These differences include morphology (with one branch being thinner than the other), and regenerative capacity (with the peripheral branch displaying higher regenerative capacity). The authors analyze the abundance of various microtubule-associated protein (MAPs) in each branch, as well as the microtubule dynamics in each branch, and find significant differences between branches. Importantly, they found that a well-known conditioning paradigm (prior lesion of the peripheral branch improves the regenerative capacity of the central branch) is not only reproduced in this system but also leads to loss of the asymmetry of MAPs between branches. Zooming in on one MAP that shows differential abundance between the axons, they find that the severing enzyme Spastin is required for the asymmetry in microtubule dynamics and in regenerative capacity following a conditioning lesion.

      Strengths:

      The establishment of an experimental system that recapitulates DRG axon asymmetry in vitro is an important step that is likely to be useful for other studies. In addition, identifying key molecular signatures that differ between central and peripheral branches, and determining how they are lost following a conditioning lesion, adds to our understanding of why peripheral axons have a better regenerative capacity. Last, the authors' use of an in vivo model system to support some of their in vitro findings is a strength of this work.

      Weaknesses:

      The main weakness of the manuscript is that to a large degree, one of its main conclusions (MAP symmetry underlies differences in regenerative capacity) relies mainly on a correlation, without firmly establishing a causal link. However, this weakness is relatively minor because (1) it is partially addressed with the Spastin KO and (2) there isn't a trivial way to show a causal relationship in this case.

    3. Reviewer #2 (Public review):

      Summary:

      The authors set out to develop a tissue culture method in which to study the different regenerative abilities of the central and peripheral branches of sensory axons. Neurons developed a small and a large branch, which have different regenerative abilities, different transport rates, and different microtubule properties. The study provides convincing evidence that the two axonal branches differ in a way that corresponds to the in vivo situation. The different regenerative abilities of the two branches are an important observation because until now it has not been clear whether this difference is intrinsic to the neuron and axons or due to differences in the environment surrounding the axons. The authors have then looked for molecular explanations of the differences between the branches. They find different transport rates and different microtubule dynamics. The different microtubule dynamics are explained by differing levels of spastin, an enzyme that severs microtubules, encouraging dynamics.

      Strengths:

      The differences between the two branches are clearly shown, together with differences in transport, microtubule dynamics, and regeneration. The in vitro model is novel and could be widely used. The methods used are robust and generally accepted.

      Weaknesses:

      In order for the method to be used it needs to be better described. For instance what proportion of neurons develop just two axonal branches, one of which is different? How selective are the researchers in finding appropriate neurons?

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Costa and colleagues investigate how asymmetry in dorsal root ganglion (DRG) neurons is established. The authors developed an in vitro system that mimics the pseudo-unipolar morphology and asymmetry of DRG neurons during the regeneration of the peripheral and central branch axons. They suggest that central-like DRG axons exhibit a higher density of growing microtubules. By reducing the polymerization of microtubules in these central-like axons, they were able to eliminate the asymmetry in DRG neurons.

      Strengths:

      The authors point out a distinct microtubule-associated protein signature that differentiates between DRG neurons' central and peripheral axonal branches. Experimental results demonstrate that genetic deletion of spastin eliminated the differences in microtubule dynamics and axon regeneration between the central and peripheral branches.

      Weaknesses:

      While some of the data are compelling, experimental evidence only partially supports the main claims.

      In its current form, the study is primarily descriptive and lacks convincing mechanistic insights. It misses important controls and further validation using 3D in vitro models.

      Given the heterogeneity of dorsal root ganglion (DRG) neurons, it is unclear whether the in vitro model described in this study can be applied to all major classes of DRG neurons. Also unclear is the inconsistency with embryonic DRG cultures with embryonic (E)16 from rats and E13 from mice (spastin knockout and wild-type controls). Furthermore, the authors stated (line 393) that only a small subset of cultured DRG neurons exhibited a pseudo-unipolar morphology. The authors should include the percentage of the neurons that exhibit a pseudo-unipolar morphology.

      The significance of studying microtubule polymerization to DRG asymmetry in vitro is questionable, especially considering the model's validity. The authors might consider eliminating the in vitro data and instead focus on characterizing DRG asymmetry in vivo both before and after a conditioning lesion. If the authors choose to retain the in vitro data, classifying the central and peripheral-like branches in cultured DRG neurons will require further in-depth characterization. Additional validation should be performed in adult DRG neuron cultures not aged in vitro.

      The comparison of asymmetry associated with a regenerative response between in vitro and in vivo paradigms has significant limitations due to the nature of the in vitro culture system. When cultured in isolation, DRG neurons fail to form functional connections with appropriate postsynaptic target neurons (the central branch) or to differentiate the peripheral domains associated with the innervation of target organs. Rather than growing neurons on a flat, hard surface like glass, more physiologically relevant substrates and/or culturing conditions should be considered. This approach could help eliminate potential artifacts caused by plating adult DRG neurons on a flat surface. Additionally, the authors should consider replicating their findings in a 3D culture model or using dorsal root ganglia explants, where both centrally and peripherally projecting axons are present.

      Panels 5H-J require additional processing with astrocyte markers to accurately define the lesion borders. Furthermore, including a lower magnification would facilitate a direct comparison of the lesion site. The use of cholera toxin subunit B (CTB) to trace dorsal column sensory axons is prone to misinterpretation, as the tracer accumulates at the axon's tip. This limitation makes it extremely challenging to distinguish between regenerating and degenerating axons.

    5. Author response:

      Reviewer #1 (Public review)

      Weaknesses:

      The main weakness of the manuscript is that to a large degree, one of its main conclusions (MAP symmetry underlies differences in regenerative capacity) relies mainly on a correlation, without firmly establishing a causal link. However, this weakness is relatively minor because (1) it is partially addressed with the Spastin KO and (2) there isn't a trivial way to show a causal relationship in this case.

      We thank Reviewer #1 for their positive assessment of our manuscript. To further strengthen the claim that MAP asymmetry underlies differences in regenerative capacity, we could investigate the effect of depleting other MAPs that lose asymmetry after conditioning lesion (CRMP5 and katanin). One would expect that, as with spastin, this would disrupt the physiological asymmetry of DRG axons and impair axon regeneration. We will further discuss this issue in the revised version of the manuscript.

      Reviewer #2 (Public review):

      Weaknesses:

      In order for the method to be used it needs to be better described. For instance what proportion of neurons develop just two axonal branches, one of which is different? How selective are the researchers in finding appropriate neurons?

      We thank Reviewer #2 for their positive assessment of our manuscript. As suggested, we will include further methodological details on the in vitro system in the revised version of the manuscript. We have evaluated the percentage of DRG neurons exhibiting different morphologies in our cultures: multipolar (4%), bipolar (35%), bell-shaped (17%), and pseudo-unipolar neurons (43%). This will be included in the revised manuscript. All the pseudo-unipolar neurons analysed had distinct axonal branches in terms of diameter and microtubule dynamics. For imaging purposes, we selected pseudo-unipolar neurons with axons unobstructed by other cells or neurites within a distance of at least 20–30 μm from the bifurcation point, to ensure optimal imaging. In the case of laser axotomy experiments, this distance was increased to 100–200 μm to ensure clear analysis of regeneration. These selection criteria will be detailed in the Methods of the revised manuscript.

      Reviewer #3 (Public review):

      Weaknesses:

      While some of the data are compelling, experimental evidence only partially supports the main claims. In its current form, the study is primarily descriptive and lacks convincing mechanistic insights. It misses important controls and further validation using 3D in vitro models.

      We recognize the importance of further exploring the contribution of other MAPs to microtubule asymmetry and regenerative capacity of DRG axons. In future work, we plan to investigate this issue by using knockout mice for katanin and CRMP5. To understand the mechanisms underlying the differential localization of MAPs in DRG axons, we performed in-situ hybridization to assess the availability of axonal mRNA but no differences were found between central and peripheral DRG axons (Figure 4 – figure supplement 2). To address whether differences in protein transport exist, we attempted to transduce DRG neurons with GFP-tagged spastin both in vitro and in vivo. However, these experiments were inconclusive as very low levels of spastin-GFP were detected. We are actively optimizing these approaches and will address this challenge in future studies. This will be further discussed in the revised manuscript.

      Given the heterogeneity of dorsal root ganglion (DRG) neurons, it is unclear whether the in vitro model described in this study can be applied to all major classes of DRG neurons.

      We acknowledge the diversity of DRG neurons and agree that assessing the presence of different DRG subtypes in our culture system will enrich its future use. Despite this heterogeneity, we focused on DRG neuron features that are common to all subtypes, i.e., pseudo-unipolarization and higher regenerative capacity of peripheral branches. This will be further discussed in the revised version of the manuscript.

      Also unclear is the inconsistency with embryonic DRG cultures with embryonic (E)16 from rats and E13 from mice (spastin knockout and wild-type controls).

      Given our previous experience in establishing DRG neuron cultures from Wistar rats and C57BL/6 mice, these developmental stages are equivalent, yielding cultures of DRG neurons with similar percentages of different morphologies. Of note, in our colonies, gestation length is ~19 days in C57BL/6 mice (background of the spastin knockout line) and ~22 days in Wistar Han rats. This will be further clarified in the Methods.

      Furthermore, the authors stated (line 393) that only a small subset of cultured DRG neurons exhibited a pseudo-unipolar morphology. The authors should include the percentage of the neurons that exhibit a pseudo-unipolar morphology.

      We have previously evaluated the percentage of DRG neurons exhibiting different morphologies in our cultures: multipolar (4%), bipolar (35%), bell-shaped (17%), and pseudo-unipolar neurons (43%). This will be included in the revised manuscript. In line 393, we referred specifically to an experimental setup in which DRG neurons were transduced and 30 transduced neurons were randomly selected for longitudinal imaging. Of these, the number of viable pseudo-unipolar DRG neurons was limited both by the random nature of viral transduction and by light-induced toxicity, as imaging was performed continuously at hourly intervals over seven consecutive days. This will be clarified in the revised manuscript.

      The significance of studying microtubule polymerization to DRG asymmetry in vitro is questionable, especially considering the model's validity. The authors might consider eliminating the in vitro data and instead focus on characterizing DRG asymmetry in vivo both before and after a conditioning lesion. If the authors choose to retain the in vitro data, classifying the central and peripheral-like branches in cultured DRG neurons will require further in-depth characterization. Additional validation should be performed in adult DRG neuron cultures not aged in vitro.

      The in vitro system presented here reliably reproduces several key features of DRG neurons observed in vivo, including asymmetry in axon diameter, regenerative capacity, axonal transport, and microtubule dynamics. Of note, most studies in the field were developed using multipolar DRG neurons that do not recapitulate in vivo morphology and asymmetries. Thus, the current in vitro system serves as a versatile tool for advancing our understanding of DRG biology and associated diseases. This system is particularly suited to studying axon regeneration, and enables research on mechanisms occurring at the stem axon bifurcation, which are challenging to examine in vivo due to the length of the stem axon and the difficulty of locating the DRG T-junction. Optimizing similar cultures using adult DRG neurons comes with challenges, such as lower cell viability and a decreased percentage of pseudo-unipolarization. This is the case with multiple other neuron types for which the vast majority of cultures are obtained from embryonic tissue. These embryonic cultures (as is the case with cortical and hippocampal neurons) are widely used to understand neuronal polarization, axon growth and/or regeneration. This will be further addressed in the revised manuscript.

      The comparison of asymmetry associated with a regenerative response between in vitro and in vivo paradigms has significant limitations due to the nature of the in vitro culture system. When cultured in isolation, DRG neurons fail to form functional connections with appropriate postsynaptic target neurons (the central branch) or to differentiate the peripheral domains associated with the innervation of target organs. Rather than growing neurons on a flat, hard surface like glass, more physiologically relevant substrates and/or culturing conditions should be considered. This approach could help eliminate potential artifacts caused by plating adult DRG neurons on a flat surface. Additionally, the authors should consider replicating their findings in a 3D culture model or using dorsal root ganglia explants, where both centrally and peripherally projecting axons are present.

      We agree that a more sophisticated system, such as a compartmentalized culture, holds great potential for future research. In this respect, we are currently engaged in developing such models. A compartmentalized system would enable the separation of three compartments: central nervous system neurons, DRG neurons, and peripheral targets. While previous efforts to create compartmentalized DRG cultures have been reported, these systems have not demonstrated the development of pseudo-unipolar morphology. Incorporating non-neuronal DRG cells into the DRG neuron compartment may successfully support the development of a pseudo-unipolar morphology.

      We also recognize the importance of dimensionality in fostering pseudo-unipolar morphology. Of note, our model provides a 3D-like environment, as DRG glial cells are continuously replicating over the 21 days in culture. In relation to DRG explants, we attempted their use but encountered limitations with confocal microscopy, as the axial resolution was insufficient to adequately resolve processes at the DRG T-junction or within individual branches. While tissue clearing could improve resolution, it would be incompatible with live imaging, which is essential for our experiments.

      The above issues will be further discussed in the revised manuscript.

      Panels 5H-J require additional processing with astrocyte markers to accurately define the lesion borders. Furthermore, including a lower magnification would facilitate a direct comparison of the lesion site.

      In our study, we relied on the alignment of nuclei to delineate the lesion site because, in our accumulated experience, this provides an accurate definition of the lesion border. Outside the lesion, the nuclei are well-aligned, while at the lesion site, they become randomly distributed. Additionally, CTB staining further supports the identification of the rostral border of the lesion, as most injured central DRG axons stop their growth at the injury site. This will be further detailed in the Methods.

      The use of cholera toxin subunit B (CTB) to trace dorsal column sensory axons is prone to misinterpretation, as the tracer accumulates at the axon's tip. This limitation makes it extremely challenging to distinguish between regenerating and degenerating axons.

      While alternative methods to trace or label regenerating axons exist, CTB is a well-established and widely used tracer for central sensory projections, as shown in multiple studies. Regarding the concern of possible CTB labeling in degenerating axons, we believe this is unlikely to be the case in our study because, in spinal cord injury controls, CTB-positive axons are nearly absent. Also, as regeneration was investigated six weeks after injury, axon degeneration has most likely already occurred, as shown previously (PMID: 15821747 and PMID: 25937174).

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Rühling et al analyzes the mode of entry of S. aureus into mammalian cells in culture. The authors propose a novel mechanism of rapid entry that involves the release of calcium from lysosomes via NAADP-stimulated activation of TPC1, which in turn causes lysosomal exocytosis; exocytic release of lysosomal acid sphingomyelinase (ASM) is then envisaged to convert exofacial sphingomyelin to ceramide. These events not only induce the rapid entry of the bacteria into the host cells but are also described to alter the fate of the intracellular S. aureus, facilitating escape from the endocytic vacuole to the cytosol.

      Strengths:

      The proposed mechanism is novel and could have important biological consequences.

      Weaknesses:

      Unfortunately, the evidence provided is unconvincing and insufficient to document the multiple, complex steps suggested. In fact, there appear to be numerous internal inconsistencies that detract from the validity of the conclusions, which were reached mostly based on the use of pharmacological agents of imperfect specificity.

      We thank the reviewer for the detailed evaluation of our manuscript. We will address the criticism below.

      We agree with the reviewer that many of the experiments presented in our study rely on the use of inhibitors. However, we want to emphasize that the main conclusion (the invasion pathway affects the intracellular fate/phagosomal escape) was demonstrated without the use of inhibitors or genetic ablation in two key experiments (Figure 4G/H). These experiments were in line with the results we obtained with inhibitors (amitriptyline [Supp. Figure 4E], ARC39 and PCK310 [Figure 4C], and Vacuolin-1 [Supp. Figure 4F]). Importantly, the hypothesis was also supported by another key experiment, in which we showed that the intracellular fate of bacteria is affected by removal of SM from the plasma membrane before invasion, but not by removal of SM from phagosomal membranes after bacterial internalization (Figure 4D-F). Taken together, we thus believe that the main hypothesis is strongly supported by our data.

      Moreover, we either used different inhibitors for the same molecule (ASM was inhibited by ARC39, amitriptyline and PCK310 with similar outcome) or supported our hypothesis with gene-ablated cell pools (TPC1, Syt7, SARM1), as we will point out in more detail below.

      Firstly, the release of calcium from lysosomes is not demonstrated. Localized changes in the immediate vicinity of lysosomes need to be measured to ascertain that these organelles are the source of cytosolic calcium changes. In fact, 9-phenanthrol, which the authors find to be the most potent inhibitor of invasion and hence of the putative calcium changes, is not a blocker of lysosomal calcium release but instead blocks plasmalemmal TRPM4 channels. On the other hand, invasion is seemingly independent of external calcium. These findings are inconsistent with each other and point to non-specific effects of 9-phenanthrol. The fact that ionomycin decreases invasion efficiency is taken as additional evidence of the importance of lysosomal calcium release. It is not clear how these observations support involvement of lysosomal calcium release and exocytosis; in fact, treatment with the ionophore should itself have induced lysosomal exocytosis and stimulated, rather than inhibited, invasion. Yet, manipulations that increase and others that decrease cytosolic calcium both inhibited invasion.

      With respect to lysosomal Ca2+ release, we agree with the reviewer that direct visual demonstration of lysosomal Ca2+ release upon infection will improve the manuscript. We therefore will perform additional experimentation to show alterations of Ca2+ at the lysosomes during infection.

      As to the involvement of TRPM4 in S. aureus host cell internalization, it has been reported that TRPM4 is activated by cytosolic Ca2+. However, the channel conducts monovalent cations such as K+ or Na+ but is impermeable to Ca2+ [1, 2]. The following observations support this:

      i) S. aureus invasion is dependent on intracellular Ca2+, but is independent of extracellular Ca2+ (Figure 1c).

      ii) 9-phenanthrol treatment reduces S. aureus internalization by host cells, illustrating the dependence of this process on TRPM4 (Figure 1b). We therefore hypothesize that TRPM4 is activated by Ca2+ released from lysosomes (see above).

      TRPM4 is localized to focal adhesions and is connected to the actin cytoskeleton,3, 4 which is a requisite for host cell entry of S. aureus.5, 6 This points to an important function of TRPM4 in the uptake of S. aureus in general; the channel is not necessarily involved exclusively in the rapid uptake pathway.

      TRPM4 itself is not permeable to Ca2+ but is activated by the cation. Thus, it is unlikely to cause lysosomal exocytosis. The stronger reduction of bacterial uptake by treatment with 9-phenanthrol when compared to Ned19 may thus be caused by the involvement of TRPM4 in additional pathways of S. aureus host cell entry involving the association of TRPM4 with focal adhesions or, as pointed out by the reviewer, by non-specific side effects of 9-phenanthrol that we currently cannot exclude. We will include this information in the revised manuscript.

      Regarding the reduced S. aureus invasion after ionomycin treatment, we agree with the reviewer that ionomycin is known to lead to lysosomal exocytosis as was previously shown by others7 as well as our laboratory8.

      We hypothesized that pretreatment with ionomycin would trigger lysosomal exocytosis and thus would reduce the pool of lysosomes that can undergo exocytosis before host cells are contacted by S. aureus. As a result, we should observe a marked reduction of S. aureus internalization in such “lysosome-depleted cells”, if the lysosomal exocytosis is coupled to bacterial uptake. Our observation of reduced bacterial internalization after ionomycin treatment supports this hypothesis.

      However, ionomycin treatment and S. aureus infection of host cells are distinct processes.

      While ionomycin results in strong global and non-directional lysosomal exocytosis of all “releasable” lysosomes (~5-10 % of all lysosomes according to previous observations)7, we hypothesize that lysosomal exocytosis upon contact with S. aureus only involves a very small proportion of lysosomes at host-bacteria contact sites.

      Since ionomycin disturbs the overall cellular Ca2+ homeostasis, we agree with the reviewer that this does not directly show lysosomal Ca2+ liberation. We will discuss this in more detail in the revised manuscript.

      The proposed role of NAADP is based on the effects of "knocking out" TPC1 and on the pharmacological effects of Ned-19. It is noteworthy that TPC2, rather than TPC1, is generally believed to be the primary TPC isoform of lysosomes. Moreover, the gene ablation accomplished in the TPC1 "knockouts" is only partial and rather unsatisfactory. Definitive conclusions about the role of TPC1 can only be reached with proper, full knockouts. Even the pharmacological approach is unconvincing because the high doses of Ned-19 used should have blocked both TPC isoforms and presumably precluded invasion. Instead, invasion is reduced by only ≈50%. A much greater inhibition was reported using 9-phenanthrol, the blocker of plasmalemmal calcium channels. How is the selective involvement of lysosomal TPC1 channels justified?

      As to the partial gene ablation of TPC1: To avoid clonal variance, we usually perform pool sorting to obtain a cell population that predominantly contains cells deficient in the target gene (here TPC1), but also a small proportion of wild-type cells, as seen by the residual TPC1 protein on the Western blot. We observe a significant reduction of bacterial uptake in this cell pool, suggesting that the uptake reduction in a pure K.O. population may be even larger.

      As to the inhibition by Ned19: We agree with the reviewer that Ned19 inhibits TPC1 and TPC2. Since ablation of TPC1 reduced invasion of S. aureus, we concluded that TPC1 is important for S. aureus host cell invasion. We agree with the reviewer, however, that a role for TPC2 cannot be excluded. We will clarify this in the revised manuscript. It needs to be noted that deficiency in either TPC1 or TPC2 alone was sufficient to prevent Ebola virus infection,9 which is in line with our observations.

      The 50% reduction of invasion upon Ned19 treatment (Figure 1d) is comparable with the reduction caused by other compounds that influence the ASM-dependent pathway (such as amitriptyline, ARC39 [Figure 2c], BAPTA-AM [Figure 1c], Vacuolin-1 [Figure 2a], β-toxin [Figure 2e] and ionomycin [Figure 1a]). Further, the partial reduction of invasion is most likely due to the concurrent activity of multiple internalization pathways, not all of which are targeted by the compounds used.

      Invoking an elevation of NAADP as the mediator of calcium release requires measurements of the changes in NAADP concentration in response to the bacteria. This was not performed. Instead, the authors analyzed the possible contribution of putative NAADP-generating systems and reported that the most active of these, CD38, was without effect, while the elimination of SARM1, another potential source of NAADP, had a very modest (≈20%) inhibitory effect that may have been due to clonal variation, which was not ruled out. In view of these data, the conclusion that NAADP is involved in the invasion process seems unwarranted.

      Our results from two independent experimental set-ups (Ned19 [Figure 1d] and TPC1 K.O. [Figure 1e & Figure 2f]) indicate the involvement of NAADP in the process. The measurement of NAADP concentration, however, is non-trivial. We can rule out clonal variation in the SARM1 mutant, since experiments were conducted with a cell pool, as described above, precisely in order to avoid variation between single clones.

      The mechanism behind the biosynthesis of NAADP is still debated. CD38 was the first enzyme discovered to be capable of producing NAADP. However, it requires acidic pH to produce NAADP,10 which does not match the characteristics of a cytosolic NAADP producer. HeLa cells do not express CD38 and hence, it is not surprising that inhibition of CD38 had no effect on S. aureus invasion in HeLa cells. However, NAADP production by HeLa cells was observed in the absence of CD38.11 Thus, CD38-independent NAADP generation is likely. SARM1 can produce NAADP at neutral pH12 and is expressed in HeLa cells, thus providing a more promising candidate.

      We agree with the reviewer that the reduction of S. aureus internalization after ablation of SARM1 is less pronounced than in our other experiments. This may be explained by NAADP originating from other enzymes, such as the recently discovered DUOX1, DUOX2, NOX1 and NOX2,13 which, with the exception of DUOX2, are expressed at low levels even in HeLa cells. We will discuss this in the revised manuscript.

      The involvement of lysosomal secretion is, again, predicated largely on the basis of pharmacological evidence. No direct evidence is provided for the insertion of lysosomal components into the plasma membrane, or for the release of lysosomal contents to the medium. Instead, inhibition of lysosomal exocytosis by vacuolin-1 is the sole source of evidence. However, vacuolin-1 is by no means a specific inhibitor of lysosomal secretion: it is now known to act primarily as a PIKfyve inhibitor and to cause massive distortion of the endocytic compartment, including gross swelling of endolysosomes. The modest (20-25%) inhibition observed when using synaptotagmin 7 knockout cells is similarly not convincing proof of the requirement for lysosomal secretion.

      We agree that the manuscript will strongly benefit from a functional analysis of lysosomal exocytosis. We therefore will conduct assays to investigate exocytosis in the revision. However, we previously i) showed by addition of specific antisera that LAMP1 is transiently exposed on the plasma membrane during ionomycin and pore-forming toxin challenge and ii) demonstrated the release of ASM activity into the culture medium under these conditions.8 Neither measurement is compatible with S. aureus infection, since LAMP1 antibodies are also bound non-specifically by protein A and another IgG-binding protein on the S. aureus surface, which would bias the results. Since protein A also serves as an adhesin, we cannot simply delete the ORF without changing other aspects of staphylococcal virulence. Further, FBS contains an ASM background activity that impedes activity measurements in cell culture medium. We previously removed this background activity by a specific heat-inactivation protocol.8 However, S. aureus invasion is strongly reduced in culture medium containing this heat-inactivated FBS.

      We agree with the reviewer that Vacuolin-1 has unspecific side effects. We will address this in the revised version of the manuscript.

      As to the involvement of synaptotagmin 7:

      Synaptotagmin 7 is not the only protein possibly involved in Ca2+-dependent exocytosis. For instance, SYT1 has been shown to possess an overlapping function.14 This may explain the discrepancy between our vacuolin-1 and SYT7 ablation experiments. We will add a corresponding section to the discussion.

      ASM is proposed to play a central role in the rapid invasion process. As above, most of the evidence offered in this regard is pharmacological and often inconsistent between inhibitors or among cell types. Some drugs affect some of the cells, but not others. It is difficult to reach general conclusions regarding the role of ASM. The argument is made even more complex by the authors' use of exogenous sphingomyelinase (beta-toxin). Pretreatment with the toxin decreased invasion efficiency, a seemingly paradoxical result. Incidentally, the effectiveness of the added toxin is never quantified/validated by directly measuring the generation of ceramide or the disappearance of SM.

      Although pharmacological inhibitors can have non-specific side effects, we want to emphasize that the inhibitors used in our study act on the enzyme ASM by completely different mechanisms. Amitriptyline is a so-called functional inhibitor of ASM (FIASMA), which induces the detachment of ASM from lysosomal membranes, resulting in degradation of the enzyme.15 By contrast, ARC39 is a competitive inhibitor.16, 17

      We do not see inconsistencies in our data obtained with ASM inhibitors. Amitriptyline and ARC39 both reduce the invasion of S. aureus in HuLEC, HuVEC and HeLa cells (Figure 2c). ARC39 needs a longer pre-incubation, since its uptake by host cells is slower (data not shown). We observe a different outcome in 16HBE14o- and Ea.Hy 926 cells, with 16HBE14o- even demonstrating a slightly increased invasion of S. aureus upon ARC39 treatment. Amitriptyline had no effect in these cell lines (Figure 2c). Moreover, both inhibitors affected the invasion dynamics (Figure 3d), phagosomal escape (Figure 4c and Supp. Figure 4e) and Rab7 recruitment (Figure 4a and Supp. Figure 4b) in a similar fashion. Proper inhibition of ASM by both compounds in all cell lines used was validated by enzyme assays (Supp. Figure 2e), which suggests that the ASM-dependent pathway exists only in specific cell lines. This may also serve as an argument that what we observe here are not non-specific side effects of the compounds. We will clarify this in the revised manuscript.

      ASM is a key player in SM degradation and recycling. In a clinical context, deficiency in ASM results in the so-called Niemann-Pick disease type A/B. The lipid profile of ASM-deficient cells is massively altered,18 which will result in severe side effects. Short-term inhibition by small molecules therefore poses a clear benefit when compared to the usage of ASM K.O. cells.

      As to the treatment with a bacterial sphingomyelinase:

      Treatment with the bacterial SMase (bSMase, here: β-toxin) was performed in two different ways:

      i) Pretreatment of host cells with β-toxin to remove SM from the host cell surface before infection. This removes the substrate of ASM from the cell surface prior to addition of the bacteria (Figure 2e, Figure 4d-f). Since SM is not present on the extracellular plasma membrane leaflet after treatment, a release of ASM cannot cause localized ceramide formation at the sites of lysosomal exocytosis. Similar observations were made by others.19

      ii) Addition of bSMase to host cells together with the bacteria to compensate for the absence of ASM (Figure 2f).

      Removal of the ASM substrate before infection (i) prevents localized ASM-mediated conversion of SM to Cer during infection and resulted in decreased invasion, while addition of the SMase during infection (ii) resulted in increased invasion in TPC1- and SYT7-ablated cells. Thus, both experiments are consistent with each other and in line with our other observations.

      Removal of SM from the plasma membrane by β-toxin was indirectly demonstrated by the absence of Lysenin recruitment to phagosomes/escaped bacteria when host cells were pretreated with the toxin before infection (Figure 4f). In another publication, we recently quantified the effectiveness of β-toxin treatment, albeit with slightly longer treatment times (75 min vs. 3 h).20 We will repeat the measurements for shorter treatment times as well.

      To clarify our experimental approaches to the readership we will add an explanatory section to the revised manuscript.

      As to the general conclusions regarding the role of ASM: ASM and lysosomal exocytosis have been shown to be involved in the uptake of a variety of pathogens,19, 21-25 supporting their role in the process.

      The use of fluorescent analogs of sphingomyelin and ceramide is not well justified and it is unclear what conclusions can be derived from these observations. Despite the low resolution of the images provided, it appears as if the labeled lipids are largely in endomembrane compartments, where they would presumably be inaccessible to the secreted ASM. Moreover, considering the location of the BODIPY probe, the authors would be unable to distinguish intact sphingomyelin from its breakdown product, ceramide. What can be concluded from these experiments? Incidentally, the authors report only 10% of BODIPY-positive events after 10 min. What are the implications of this finding? That 90% of the invasion events are unrelated to sphingomyelin, ASM, and ceramide?

      During the experiments with fluorescent SM analogues (Figure 3a,b), S. aureus was added to the samples immediately before the start of video recording. Hence, bacteria slowly trickle onto the host cells, and we can thus image the initial contact between host cells and bacteria. For instance, the bacteria depicted in Figure 3a contact the host cell about 9 min before becoming BODIPY-FL-positive (see Supp. Video 1, 55 min). Hence, we think that in these cases we see the formation of phagosomes around bacteria rather than bacteria in endomembrane compartments. Since generation of phagosomes happens at the plasma membrane, SM is accessible to secreted ASM.

      The “trickling” approach for infection differs experimentally from our invasion measurements, in which we synchronized the infection by a very slow centrifugation. This ensures that all bacteria have contact with host cells and are not just floating in the culture medium. However, live cell imaging of the initial bacteria-host contact cannot technically be combined with synchronization of infection.

      In our invasion measurements (with synchronization), we typically see internalization of ~20% of all added bacteria after 30 min. Hence, most bacteria that are visible in our videos are likely still extracellular, and only a small proportion was internalized. This explains why only 10% of total bacteria are positive for BODIPY-FL-SM after 10 min. The proportion of internalized bacteria that are positive for BODIPY-FL-SM should be considerably higher but cannot be determined with this method.

      We agree with the reviewer that we cannot observe conversion of BODIPY-FL-SM by ASM. To address this, we attempted to visualize the conversion of a visible-range SM FRET probe (Supp. Figure 3), but the structure of the probe is not compatible with measuring conversion on the plasma membrane, since the FITC fluorophore is released into the culture medium by the ASM activity and is thereby lost for imaging. In general, the visualization of SM conversion with subcellular resolution is challenging, and even with novel tools developed in our lab26 visualization of SM on the plasma membrane is difficult.

      The conclusions we draw from these experiments are that i) S. aureus invasion is associated with SM and ii) SM-associated invasion can be very fast, since bacteria are rapidly engulfed by BODIPY-FL-SM-containing membranes.

      It is also unclear how the authors can distinguish lysenin entry into ruptured vacuoles from the entry of RFP-CWT, used as a criterion of bacterial escape. Surely the molecular weights of the probes are not sufficiently different to prevent the latter one from traversing the permeabilized membrane until such time that the bacteria escape from the vacuole.

      We want to clarify here that both the Lysenin and the CWT reporter have access to ruptured vacuoles (Figure 4b). We used the Lysenin reporter in these experiments to estimate the SM content of phagosomal membranes. If a vacuole is ruptured, both the bacteria and the luminal leaflet of the phagosomal membrane remnants come into contact with the cytosol and hence with the cytosolically expressed reporters YFP-Lysenin and RFP-CWT, resulting in “Lysenin-positive escape” when phagosomes contained SM (see Figure 4f). By contrast, either β-toxin expression by S. aureus or pre-treatment with the bSMase resulted in the absence of Lysenin recruitment, suggesting that the phagosomal SM levels were decreased/undetectable (Figure 4f, Supp. Figure 5f, g, i, j).

      This approach does not enable a quantitative measurement of phagosomal SM and rather gives a “yes or no” answer. However, we think this method is sufficient to show that β-toxin expression and pretreatment markedly decreased phagosomal SM levels in the host cells.

      The approach we used here to analyze “Lysenin-positive escape” can clearly be distinguished from Lysenin-based methods that were used by others.27 There, Lysenin was used to show trans-bilayer movement of SM before rupture of bacteria-containing phagosomes.

      To clarify the function of Lysenin in our approach we will add an additional figure to the revised manuscript.

      Both SMase inhibitors (Figure 4C) and SMase pretreatment increased bacterial escape from the vacuole. The former should prevent SM hydrolysis and formation of ceramide, while the latter treatment should have the exact opposite effects, yet the end result is the same. What can one conclude regarding the need and role of the SMase products in the escape process?

      As pointed out above, pretreatment of host cells with SMase removes SM from the plasma membrane, so that ASM does not have access to its substrate. Thus, both treatment with ASM inhibitors and pretreatment with bacterial SMase prevent ASM from being active on the plasma membrane and thereby block the ASM-dependent uptake (Figure 2c, e). Although overall fewer bacteria were internalized by host cells under these conditions, the bacteria that invaded host cells did so in an ASM-independent manner.

      Since blockage of the ASM-dependent internalization pathway (with ASM inhibitor [Figure 4c], SMase pretreatment [Figure 4e] and Vacuolin-1 [Supp. Fig. 4f]) always resulted in enhanced phagosomal escape, we conclude that bacteria that were internalized in an ASM-independent fashion show enhanced escape. Vice versa, bacteria that enter host cells in an ASM-dependent manner demonstrate lower escape rates.

      This is supported by comparing the escape rates of “early” and “late” invaders [Figure 4g/h], which in our opinion is a key experiment that supports this hypothesis. The “early” invaders are predominantly ASM-dependent (see e.g. Figure 3e) and thus, bacteria that entered host cells in the first 10 min of infection should have been internalized predominantly in an ASM-dependent fashion, while slower entry pathways are active later during infection. The early ASM-dependent invaders possessed lower escape rates, which is in line with the data obtained with inhibitors (e.g. Figure 4c and Supp. Fig. 4f).

      We hypothesize that the activity of ASM on the plasma membrane during invasion mediates the recruitment of a specific subset of receptors, which then influence downstream phagosomal maturation and escape. This hypothesis is supported by the fact that the subset of receptors interacting with S. aureus is altered upon inhibition of the ASM-dependent uptake pathway. We describe this in another study that is currently under evaluation elsewhere.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Ruhling et al propose a rapid uptake pathway that is dependent on lysosomal exocytosis, lysosomal Ca2+ and acid sphingomyelinase, and further suggest that the intracellular trafficking and fate of the pathogen is dictated by the mode of entry.

      The evidence provided is solid, methods used are appropriate and results largely support their conclusions, but can be substantiated further as detailed below. The weakness is a reliance on chemical inhibitors that can be non-specific to delineate critical steps.

      Specific comments:

      A large number of experiments rely on treatment with chemical inhibitors. While this approach is reasonable, many of the inhibitors employed such as amitriptyline and vacuolin1 have other or non-defined cellular targets and pleiotropic effects cannot be ruled out. Given the centrality of ASM for the manuscript, it will be important to replicate some key results with ASM KO cells.

      We thank the reviewer for the critical evaluation of our manuscript and the many constructive comments.

      We agree with the reviewer that ASM inhibitors such as the functional inhibitors of ASM (FIASMAs), like the amitriptyline used in our study, have non-specific side effects given their mode of action. FIASMAs induce the detachment of ASM from lysosomal membranes, resulting in degradation of the enzyme.15 However, we want to emphasize that we also used the competitive inhibitor ARC39 in our study,16, 17 which acts on the enzyme by a completely different mechanism. All phenotypes (reduced invasion [Figure 2c, d], effect on invasion dynamics [Figure 3d], enhanced escape [Figure 4c and Supp. Figure 4e] and differential recruitment of Rab7 [Supp. Figure 4b]) were observed with both inhibitors, thereby supporting the role of ASM in the process.

      We further agree that genetic evidence usually supports and strengthens scientific findings. However, ASM is a key cellular player in SM degradation and recycling. In a clinical context, deficiency in ASM results in the so-called Niemann-Pick disease type A/B. The lipid profile of ASM-deficient cells is massively altered,18 which in itself will result in severe side effects. Thus, the usage of inhibitors provides a clear benefit when compared to ASM K.O. cells, since ASM activity can be targeted in a short-term fashion, thereby preventing larger alterations in cellular lipid composition.

      Most experiments are done in HeLa cells. Given the pathway is projected as generic, it will be important to further characterize cell type specificity for the process. Some evidence for a similar mechanism in other cell types S. aureus infects, perhaps phagocytic cell type, might be good.

      Whenever possible, we performed the experiments not only in HeLa cells but also in HuLECs. For example, we refer to experiments concerning the role of Ca2+ (Figure 1c/Supp. Figure 1e), lysosomal Ca2+/Ned19 (Figure 1d/Supp. Figure 1g), lysosomal exocytosis/Vacuolin-1 (Figure 2a/Supp. Figure 2a), ASM/ARC39 and amitriptyline (Figure 2c), surface SM/β-toxin (Figure 2e/Supp. Figure 2g), analysis of invasion dynamics (complete Figure 3) and measurement of cell death during infection (Figure 5c-e, Supp. Figure 6a+b).

      HuLECs, however, are not readily amenable to genetic manipulation; hence we were not able to generate gene deletions in these cells, and the cells do not grow readily after introduction of the fluorescent escape reporter.

      As to ASM involvement in phagocytic cells: a role for ASM during the uptake of S. aureus by macrophages was previously reported by others.23 However, in professional phagocytes S. aureus does not escape from the phagosome but instead replicates within the vacuole.28

      I'm a little confused about the role of ASM on the surface. Presumably, it converts SM to ceramide, as the final model suggests. Overexpression of b-toxin results in the near complete absence of SM on phagosomes (having representative images will help appreciate this), but why is phagosomal SM detected at high levels in untreated conditions? If bacteria are engulfed by SM-containing membrane compartments, what role does ASM play on the surface? If surface SM is necessary for phagosomal escape within the cell, do the authors imply that ASM is tuning the surface SM levels to a certain optimal range? Alternatively, can there be additional roles for ASM on the cell surface? Can surface SM levels be visualized (for example, in Figure 4 E, F)?

      We initially hypothesized that we would detect higher phagosomal SM levels upon inhibition of ASM, since our model suggests SM cleavage by ASM on the host cell surface during bacterial cell entry. However, we did not detect any changes in our experiments (Supp. Figure 4d). We currently favor the following explanation: SM is the most abundant sphingolipid in human cells.29 If peripheral lysosomes are exocytosed and thereby release ASM, only a localized and relatively small proportion of SM may get converted to Cer, which is most likely below our detection limit. In addition, the detection of cytosolically exposed phagosomal SM by YFP-Lysenin is not quantitative and provides a “yes or no” measurement. Hence, we think that the rather limited SM to Cer conversion, in combination with the high abundance of SM in cellular membranes, does not visibly affect the recruitment of the Lysenin reporter.

      In our experiments that employ BODIPY-FL-SM (Figure 3a+b), we cannot distinguish between native SM and downstream metabolites such as Cer. Hence, again, we cannot make any statements about the extent to which SM is converted on the surface during bacterial internalization. Although our laboratory recently used trifunctional sphingolipid analogs to analyze the SM to Cer conversion,20 the visualization of this process on the plasma membrane is currently still challenging.

      Overall, we hypothesize that the localized generation of Cer on the surface by released ASM leads to the formation of Cer-enriched platforms. Subsequently, a certain subset of receptors may be recruited to these platforms and influence the uptake process. These platforms are presumed to be very small, which would also explain why we did not detect changes in Lysenin recruitment.

      Related to that, why is ASM activity on the cell surface important? Its role in non-infectious or other contexts can be discussed.

      ASM release by lysosomal exocytosis is implicated in plasma membrane repair upon injury. We will discuss this in the revised version of the manuscript.

      If SM removal is so crucial for uptake, can exocytosis of lysosomes alone provide sufficient ASM for SM removal? How much or to what extent is lysosomal exocytosis enhanced by initial signaling events? Do the authors envisage the early events in their model happening in localized confines of the PM, this can be discussed.

      Ionomycin treatment led to a release of ~10 % of all lysosomes and also increased extracellular ASM activity.7, 8 However, to our knowledge, it is currently unclear to what extent the released ASM affects surface SM levels. It is also unknown which percentage of the lysosomes is released during infection with S. aureus. We speculate that this will be only a fraction of the “releasable lysosomes”, as we assume that the effects (lysosomal Ca2+ liberation, lysosomal exocytosis and ASM activity) are very localized and take place only at host-pathogen contact sites (see also above). In initial experiments, we attempted to visualize the local ASM activity on the cell surface by using a visible-range FRET probe (Supp. Fig. 3). However, cleavage of the probe by ASM on the surface leads to release of FITC into the cell culture medium, which does not contribute a measurable signal at the surface.

      How are inhibitor doses determined? How efficient is the removal of extracellular bacteria at 10 min? It will be good to substantiate the cfu experiments for infectivity with imaging-based methods. Are the roles of TPC1 and TPC2 redundant? If so, why does silencing TPC1 alone result in a decrease in infectivity? For these and other assays, it would be better to show raw values for infectivity. Please show alterations in lysosomal Ca2+ at the doses of inhibitors indicated. Is lysosomal Ca2+ released upon S. aureus binding to the cell surface? Will be good to directly visualize this.

      Concerning the inhibitor concentrations, we either used values established in published studies or followed the recommendations of the suppliers (e.g. 2-APB, Ned19, Vacuolin-1). For ASM inhibitors, we verified proper inhibition of ASM by activity assays. Concentrations of ionomycin resulting in Ca2+ influx and lysosomal exocytosis were determined in earlier studies from our lab.8, 30

      As to the removal of bacteria at 10 min p.i.: Lysostaphin is very efficient for removal of extracellular S. aureus and sterilizes the tissue culture supernatant. It significantly lyses bacteria within a few minutes, as determined by turbidity assays.31

      As to imaging-based infectivity assays: We will add an analysis of imaging-based invasion assays in the revised manuscript.

      Regarding the roles of TPC1 and TPC2: from our data we cannot conclude whether the roles of TPC1 and TPC2 are redundant. Since blockage of TPC1 alone is sufficient to reduce internalization of bacteria, one could speculate that the two channels have distinct roles. On the other hand, there might be a Ca2+ threshold for initiating lysosomal exocytosis that can only be attained if TPC1 and TPC2 are activated in parallel. Our observations are in line with another study that shows reduced Ebola virus infection in the absence of either TPC1 or TPC2.32

      As to raw CFU counts: whereas the observed effects upon blocking the invasion of S. aureus are stable, the number of internalized bacteria varies between individual biological replicates, for instance due to differences in host cell fitness or growth differences in bacterial cultures, which are prepared freshly for each experiment.

      With respect to visualization of lysosomal Ca2+ release: we agree with the reviewer that direct visual demonstration of lysosomal Ca2+ release upon infection will improve the manuscript. We will therefore perform additional experiments to show alterations of Ca2+ at the lysosomes during infection.

      The precise identification of cytosolic vs phagosomal bacteria is not very easy to appreciate. The methods section indicates how this distinction is made, but how do the authors deal with partial overlaps and ambiguities generally associated with such analyses? Please show respective images. The number of events (individual bacteria) for the live cell imaging data should be clearly mentioned.

      We apologize for not having sufficiently explained the technology used to detect escaped S. aureus. The cytosolic location of S. aureus is indicated by recruitment of RFP-CWT.33 CWT is the cell wall targeting domain of lysostaphin, which efficiently binds to the pentaglycine cross bridge in the peptidoglycan of S. aureus. This reporter is exclusively and homogeneously expressed in the host cytosol. Only upon rupture of phagoendosomal membranes can the reporter be recruited to the cell wall of the now cytosolically located bacteria. S. aureus mutants, for instance in the agr quorum sensing system, cannot break down the phagosomal membrane in non-professional phagocytes and thus stay unlabeled by the CWT reporter.33 We will include respective images/movies of escape events and the bacteria numbers for live cell experiments in the revised version of the manuscript.

      In the phagosome maturation experiments, what is the proportion of bacteria in Rab5 or Rab7 compartments at each time point? Will the decreased Rab7 association be accompanied by increased Rab5? Showing raw values and images will help appreciate such differences. Given the expertise and tools available in live cell imaging, can the authors trace Rab5 and Rab7 positive compartment times for the same bacteria?

      We will include the proportion of Rab7-associated bacteria in the revised manuscript. Usually, we observe that Rab5 is only transiently (for a few minutes) present on phagosomes and only afterwards the phagosomes become positive for Rab7. We do not think that a decrease in Rab7-positive phagosomes would increase the proportion of Rab5-positive phagosomes. However, we cannot exclude this hypothesis with our data.

      We can achieve tracing of individual bacteria for recruitment of Rab5/Rab7 only manually, which impedes a quantitative evaluation. However, we will include information that illustrates the consecutive recruitment of the GTPases.

      The results with longer-term infection are interesting. Live cell imaging suggests that ASM-inhibited cells show accelerated phagosomal escape that reduces by 6 hpi. Where are the bacteria at this time point? Presumably, they should have reached lysosomes. The relationship between cytosolic escape, replication, and host cell death is interesting, but the evidence, as presented, is correlative for the populations. Given the use of live cell imaging, can the authors show these events in the same cell?

      We think that most bacteria-containing phagoendosomes should have fused with lysosomes by 6 h p.i., as we have previously shown by acidification to a pH of 5 and LAMP1 decoration.34

      We will provide images/videos to show the correlation between escape and replication in the revised manuscript.

      Given the inherent heterogeneity in uptake processes and the use of inhibitors in most experiments, the distinction between ASM-dependent and independent pathways might not be as clear-cut as the authors suggest. Some caution here will be good. Can the authors estimate what fraction of intracellular bacteria are taken up ASM-dependent?

      We agree with the reviewer that an overlap between internalization pathways is likely. A clear distinction is therefore certainly non-trivial. As an alternative to separate ASM-dependent and ASM-independent pathways, ASM activity may also accelerate one or several internalization pathways. We will address this limitation in the revised manuscript.

      Early in infection (~10 min after contact with the cells), the proportion of bacteria that enter host cells in an ASM-dependent manner is relatively high, amounting to roughly 75% in HuLEC. After 30 min, this proportion decreases to about 50%. We will include this information in the revised version of the manuscript.

      References

      (1) Launay, P. et al. TRPM4 Is a Ca2+-Activated Nonselective Cation Channel Mediating Cell Membrane Depolarization. Cell 109, 397-407 (2002).

      (2) Nilius, B. et al. The Ca2+-activated cation channel TRPM4 is regulated by phosphatidylinositol 4,5-biphosphate. The EMBO Journal 25, 467-478 (2006).

      (3) Cáceres, M. et al. TRPM4 Is a Novel Component of the Adhesome Required for Focal Adhesion Disassembly, Migration and Contractility. PLoS One 10, e0130540 (2015).

      (4) Silva, I., Brunett, M., Cáceres, M. & Cerda, O. TRPM4 modulates focal adhesion-associated calcium signals and dynamics. Biophysical Journal 123, 390a (2024).

      (5) Schlesier, T., Siegmund, A., Rescher, U. & Heilmann, C. Characterization of the Atl-mediated staphylococcal internalization mechanism. International Journal of Medical Microbiology 310, 151463 (2020).

      (6) Jevon, M. et al. Mechanisms of Internalization of Staphylococcus aureus by Cultured Human Osteoblasts. Infection and Immunity 67, 2677-2681 (1999).

      (7) Rodriguez, A., Webster, P., Ortego, J. & Andrews, N.W. Lysosomes behave as Ca2+-regulated exocytic vesicles in fibroblasts and epithelial cells. J Cell Biol 137, 93-104 (1997).

      (8) Krones & Rühling et al. Staphylococcus aureus alpha-Toxin Induces Acid Sphingomyelinase Release From a Human Endothelial Cell Line. Front Microbiol 12, 694489 (2021).

      (9) Sakurai, Y. et al. Two-pore channels control Ebola virus host cell entry and are drug targets for disease treatment. Science 347, 995-998 (2015).

      (10) Aarhus, R., Graeff, R.M., Dickey, D.M., Walseth, T.F. & Lee, H.C. ADP-ribosyl cyclase and CD38 catalyze the synthesis of a calcium-mobilizing metabolite from NADP. J Biol Chem 270, 30327-30333 (1995).

      (11) Schmid, F., Fliegert, R., Westphal, T., Bauche, A. & Guse, A.H. Nicotinic acid adenine dinucleotide phosphate (NAADP) degradation by alkaline phosphatase. J Biol Chem 287, 32525-32534 (2012).

      (12) Angeletti, C. et al. SARM1 is a multi-functional NAD(P)ase with prominent base exchange activity, all regulated by multiple physiologically relevant NAD metabolites. iScience 25, 103812 (2022).

      (13) Gu, F. et al. Dual NADPH oxidases DUOX1 and DUOX2 synthesize NAADP and are necessary for Ca(2+) signaling during T cell activation. Sci Signal 14, eabe3800 (2021).

      (14) Schonn, J.-S., Maximov, A., Lao, Y., Südhof, T.C. & Sørensen, J.B. Synaptotagmin-1 and -7 are functionally overlapping Ca2+ sensors for exocytosis in adrenal chromaffin cells. Proceedings of the National Academy of Sciences 105, 3998-4003 (2008).

      (15) Kornhuber, J. et al. Functional Inhibitors of Acid Sphingomyelinase (FIASMAs): a novel pharmacological group of drugs with broad clinical applications. Cell Physiol Biochem 26, 9-20 (2010).

      (16) Naser, E. et al. Characterization of the small molecule ARC39, a direct and specific inhibitor of acid sphingomyelinase in vitro. J Lipid Res 61, 896-910 (2020).

      (17) Roth, A.G. et al. Potent and selective inhibition of acid sphingomyelinase by bisphosphonates. Angew Chem Int Ed Engl 48, 7560-7563 (2009).

      (18) Schuchman, E.H. & Desnick, R.J. Types A and B Niemann-Pick disease. Mol Genet Metab 120, 27-33 (2017).

      (19) Miller, M.E., Adhikary, S., Kolokoltsov, A.A. & Davey, R.A. Ebolavirus Requires Acid Sphingomyelinase Activity and Plasma Membrane Sphingomyelin for Infection. Journal of Virology 86, 7473-7483 (2012).

      (20) M. Rühling, L.K., F. Wagner, F. Schumacher, D. Wigger, D. A. Helmerich, T. Pfeuffer, R. Elflein, C. Kappe, M. Sauer, C. Arenz, B. Kleuser, T. Rudel, M. Fraunholz, J. Seibel Trifunctional sphingomyelin derivatives enable nanoscale resolution of sphingomyelin turnover in physiological and infection processes via expansion microscopy. Nat Commun accepted in principle (2024).

      (21) Peters, S. et al. Neisseria meningitidis Type IV Pili Trigger Ca(2+)-Dependent Lysosomal Trafficking of the Acid Sphingomyelinase To Enhance Surface Ceramide Levels. Infect Immun 87 (2019).

      (22) Grassmé, H. et al. Acidic sphingomyelinase mediates entry of N. gonorrhoeae into nonphagocytic cells. Cell 91, 605-615 (1997).

      (23) Li, C. et al. Regulation of Staphylococcus aureus Infection of Macrophages by CD44, Reactive Oxygen Species, and Acid Sphingomyelinase. Antioxid Redox Signal 28, 916-934 (2018).

      (24) Fernandes, M.C. et al. Trypanosoma cruzi subverts the sphingomyelinase-mediated plasma membrane repair pathway for cell invasion. J Exp Med 208, 909-921 (2011).

      (25) Luisoni, S. et al. Co-option of Membrane Wounding Enables Virus Penetration into Cells. Cell Host & Microbe 18, 75-85 (2015).

      (26) Rühling, M. et al. Trifunctional sphingomyelin derivatives enable nanoscale resolution of sphingomyelin turnover in physiological and infection processes via expansion microscopy. Nature Communications 15, 7456 (2024).

      (27) Ellison, C.J., Kukulski, W., Boyle, K.B., Munro, S. & Randow, F. Transbilayer Movement of Sphingomyelin Precedes Catastrophic Breakage of Enterobacteria-Containing Vacuoles. Curr Biol 30, 2974-2983 e2976 (2020).

      (28) Moldovan, A. & Fraunholz, M.J. In or out: Phagosomal escape of Staphylococcus aureus. Cell Microbiol 21, e12997 (2019).

      (29) Slotte, J.P. Biological functions of sphingomyelins. Progress in Lipid Research 52, 424-437 (2013).

      (30) Stelzner, K. et al. Intracellular Staphylococcus aureus Perturbs the Host Cell Ca(2+) Homeostasis To Promote Cell Death. mBio 11 (2020).

      (31) Kunz, T.C. et al. The Expandables: Cracking the Staphylococcal Cell Wall for Expansion Microscopy. Front Cell Infect Microbiol 11, 644750 (2021).

      (32) Sakurai, Y. et al. Ebola virus. Two-pore channels control Ebola virus host cell entry and are drug targets for disease treatment. Science 347, 995-998 (2015).

      (33) Grosz, M. et al. Cytoplasmic replication of Staphylococcus aureus upon phagosomal escape triggered by phenol-soluble modulin alpha. Cell Microbiol 16, 451-465 (2014).

      (34) Giese, B. et al. Staphylococcal alpha-toxin is not sufficient to mediate escape from phagolysosomes in upper-airway epithelial cells. Infect Immun 77, 3611-3625 (2009).

    2. eLife Assessment

      This valuable study proposes a novel rapid-entry mechanism of S. aureus that involves the rapid release of calcium from lysosomes. The strength of the paper lies in a very interesting hypothesis; what diminishes enthusiasm is the lack of appropriate methodology, thus making the study incomplete. The methods used are deficient: they are largely reliant on the use of chemical inhibitors and do not adequately support the conclusions.

    3. Reviewer #1 (Public review):

      Summary:

      The manuscript by Rühling et al analyzes the mode of entry of S. aureus into mammalian cells in culture. The authors propose a novel mechanism of rapid entry that involves the release of calcium from lysosomes via NAADP-stimulated activation of TPC1, which in turn causes lysosomal exocytosis; exocytic release of lysosomal acid sphingomyelinase (ASM) is then envisaged to convert exofacial sphingomyelin to ceramide. These events not only induce the rapid entry of the bacteria into the host cells but are also described to alter the fate of the intracellular S. aureus, facilitating escape from the endocytic vacuole to the cytosol.

      Strengths:

      The proposed mechanism is novel and could have important biological consequences.

      Weaknesses:

      Unfortunately, the evidence provided is unconvincing and insufficient to document the multiple, complex steps suggested. In fact, there appear to be numerous internal inconsistencies that detract from the validity of the conclusions, which were reached mostly based on the use of pharmacological agents of imperfect specificity.

      Firstly, the release of calcium from lysosomes is not demonstrated. Localized changes in the immediate vicinity of lysosomes need to be measured to ascertain that these organelles are the source of cytosolic calcium changes. In fact, 9-phenanthrol, which the authors find to be the most potent inhibitor of invasion and hence of the putative calcium changes, is not a blocker of lysosomal calcium release but instead blocks plasmalemmal TRPM4 channels. On the other hand, invasion is seemingly independent of external calcium. These findings are inconsistent with each other and point to non-specific effects of 9-phenanthrol. The fact that ionomycin decreases invasion efficiency is taken as additional evidence of the importance of lysosomal calcium release. It is not clear how these observations support involvement of lysosomal calcium release and exocytosis; in fact, treatment with the ionophore should itself have induced lysosomal exocytosis and stimulated, rather than inhibited, invasion. Yet, manipulations that increase and others that decrease cytosolic calcium both inhibited invasion.

      The proposed role of NAADP is based on the effects of "knocking out" TPC1 and on the pharmacological effects of Ned-19. It is noteworthy that TPC2, rather than TPC1, is generally believed to be the primary TPC isoform of lysosomes. Moreover, the gene ablation accomplished in the TPC1 "knockouts" is only partial and rather unsatisfactory. Definitive conclusions about the role of TPC1 can only be reached with proper, full knockouts. Even the pharmacological approach is unconvincing because the high doses of Ned-19 used should have blocked both TPC isoforms and presumably precluded invasion. Instead, invasion is reduced by only ≈50%. A much greater inhibition was reported using 9-phenanthrol, the blocker of plasmalemmal calcium channels. How is the selective involvement of lysosomal TPC1 channels justified?

      Invoking an elevation of NAADP as the mediator of calcium release requires measurements of the changes in NAADP concentration in response to the bacteria. This was not performed. Instead, the authors analyzed the possible contribution of putative NAADP-generating systems and reported that the most active of these, CD38, was without effect, while the elimination of SARM1, another potential source of NAADP, had a very modest (≈20%) inhibitory effect that may have been due to clonal variation, which was not ruled out. In view of these data, the conclusion that NAADP is involved in the invasion process seems unwarranted.

      The involvement of lysosomal secretion is, again, predicated largely on the basis of pharmacological evidence. No direct evidence is provided for the insertion of lysosomal components into the plasma membrane, or for the release of lysosomal contents to the medium. Instead, inhibition of lysosomal exocytosis by vacuolin-1 is the sole source of evidence. However, vacuolin-1 is by no means a specific inhibitor of lysosomal secretion: it is now known to act primarily as a PIKfyve inhibitor and to cause massive distortion of the endocytic compartment, including gross swelling of endolysosomes. The modest (20-25%) inhibition observed when using synaptotagmin 7 knockout cells is similarly not convincing proof of the requirement for lysosomal secretion.

      ASM is proposed to play a central role in the rapid invasion process. As above, most of the evidence offered in this regard is pharmacological and often inconsistent between inhibitors or among cell types. Some drugs affect some of the cells, but not others. It is difficult to reach general conclusions regarding the role of ASM. The argument is made even more complex by the authors' use of exogenous sphingomyelinase (beta-toxin). Pretreatment with the toxin decreased invasion efficiency, a seemingly paradoxical result. Incidentally, the effectiveness of the added toxin is never quantified/validated by directly measuring the generation of ceramide or the disappearance of SM.

      The use of fluorescent analogs of sphingomyelin and ceramide is not well justified and it is unclear what conclusions can be derived from these observations. Despite the low resolution of the images provided, it appears as if the labeled lipids are largely in endomembrane compartments, where they would presumably be inaccessible to the secreted ASM. Moreover, considering the location of the BODIPY probe, the authors would be unable to distinguish intact sphingomyelin from its breakdown product, ceramide. What can be concluded from these experiments? Incidentally, the authors report only 10% of BODIPY-positive events after 10 min. What are the implications of this finding? That 90% of the invasion events are unrelated to sphingomyelin, ASM, and ceramide?

      It is also unclear how the authors can distinguish lysenin entry into ruptured vacuoles from the entry of RFP-CWT, used as a criterion of bacterial escape. Surely the molecular weights of the probes are not sufficiently different to prevent the latter one from traversing the permeabilized membrane until such time that the bacteria escape from the vacuole.

      Both SMase inhibitors (Figure 4C) and SMase pretreatment increased bacterial escape from the vacuole. The former should prevent SM hydrolysis and formation of ceramide, while the latter treatment should have the exact opposite effects, yet the end result is the same. What can one conclude regarding the need and role of the SMase products in the escape process?

    4. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Rühling et al propose a rapid uptake pathway that is dependent on lysosomal exocytosis, lysosomal Ca2+ and acid sphingomyelinase, and further suggest that the intracellular trafficking and fate of the pathogen are dictated by the mode of entry.

      The evidence provided is solid, methods used are appropriate and results largely support their conclusions, but can be substantiated further as detailed below. The weakness is a reliance on chemical inhibitors that can be non-specific to delineate critical steps.

      Specific comments:

      A large number of experiments rely on treatment with chemical inhibitors. While this approach is reasonable, many of the inhibitors employed, such as amitriptyline and vacuolin-1, have other or non-defined cellular targets, and pleiotropic effects cannot be ruled out. Given the centrality of ASM for the manuscript, it will be important to replicate some key results with ASM KO cells.

      Most experiments are done in HeLa cells. Given the pathway is projected as generic, it will be important to further characterize cell type specificity for the process. Some evidence for a similar mechanism in other cell types that S. aureus infects, perhaps a phagocytic cell type, would be good.

      I'm a little confused about the role of ASM on the surface. Presumably, it converts SM to ceramide, as the final model suggests. Overexpression of b-toxin results in the near complete absence of SM on phagosomes (having representative images will help appreciate this), but why is phagosomal SM detected at high levels in untreated conditions? If bacteria are engulfed by SM-containing membrane compartments, what role does ASM play on the surface? If surface SM is necessary for phagosomal escape within the cell, do the authors imply that ASM is tuning the surface SM levels to a certain optimal range? Alternatively, can there be additional roles for ASM on the cell surface? Can surface SM levels be visualized (for example, in Figure 4 E, F)?

      Related to that, why is ASM activity on the cell surface important? Its role in non-infectious or other contexts can be discussed.

      If SM removal is so crucial for uptake, can exocytosis of lysosomes alone provide sufficient ASM for SM removal? How much or to what extent is lysosomal exocytosis enhanced by initial signaling events? Do the authors envisage the early events in their model happening in localized confines of the PM? This can be discussed.

      How are inhibitor doses determined? How efficient is the removal of extracellular bacteria at 10 min? It would be good to substantiate the CFU experiments for infectivity with imaging-based methods. Are the roles of TPC1 and TPC2 redundant? If so, why does silencing TPC1 alone result in a decrease in infectivity? For these and other assays, it would be better to show raw values for infectivity. Please show alterations in lysosomal Ca2+ at the doses of inhibitors indicated. Is lysosomal Ca2+ released upon S. aureus binding to the cell surface? It would be good to directly visualize this.

      The precise identification of cytosolic vs phagosomal bacteria is not very easy to appreciate. The methods section indicates how this distinction is made, but how do the authors deal with partial overlaps and ambiguities generally associated with such analyses? Please show respective images. The number of events (individual bacteria) for the live cell imaging data should be clearly mentioned.

      In the phagosome maturation experiments, what is the proportion of bacteria in Rab5 or Rab7 compartments at each time point? Will the decreased Rab7 association be accompanied by increased Rab5? Showing raw values and images will help appreciate such differences. Given the expertise and tools available in live cell imaging, can the authors trace Rab5 and Rab7 positive compartment times for the same bacteria?

      The results with longer-term infection are interesting. Live cell imaging suggests that ASM-inhibited cells show accelerated phagosomal escape that reduces by 6 hpi. Where are the bacteria at this time point? Presumably, they should have reached lysosomes. The relationship between cytosolic escape, replication, and host cell death is interesting, but the evidence, as presented, is correlative for the populations. Given the use of live cell imaging, can the authors show these events in the same cell?

      Given the inherent heterogeneity in uptake processes and the use of inhibitors in most experiments, the distinction between ASM-dependent and independent pathways might not be as clear-cut as the authors suggest. Some caution here will be good. Can the authors estimate what fraction of intracellular bacteria are taken up ASM-dependent?

    1. eLife Assessment

      This fundamental study explores how genotypic changes relate to phenotypic stasis or variation within chitons, a molluscan group. Chitons are significant because their ancient body plan has remained largely unchanged for millions of years, yet the paper reveals rapid and large-scale genomic changes. This compelling study is a splendid advance in approximately doubling the number of sequenced chiton genomes, providing what appears to be among the best genome annotations for chiton genomes available to date. The study's key focus is on the genomic rearrangements across five reference-quality genomes of chitons and their implications for understanding evolutionary mechanisms, particularly in comparison to other molluscan clades.

    2. Reviewer #1 (Public review):

      Summary of Key Findings:

      The authors identified 20 ancient molluscan linkage groups (MLGs) that are largely conserved in other molluscan groups but highly dynamic and rearranged in chitons. This contrasts with the stability seen in other animal groups.

      Significant chromosome rearrangements, fusions, and duplications were observed in chitons, particularly in the most basal clades like Lepidopleurida, indicating that chitons undergo more extensive genomic changes than expected.

      Chitons exhibit extremely high levels of genomic heterozygosity, exceeding that of other molluscan species and even Lepidoptera. This presents challenges for assembling high-quality genomes but also points to genetic diversity as a driver of evolutionary processes.

      Partial genome duplications, particularly in Liolophura japonica, extend the knowledge of gene duplication events within the broader Mollusca clade.

      The paper speculates that these genomic rearrangements may contribute to maintaining species boundaries in sympatric and parapatric radiations, as observed in certain Acanthochitona species.

      Strengths:

      The use of high-quality genomic data, including four de novo genome assemblies, provides robust evidence for the conclusions.

      The research challenges the common assumption that chitons are evolutionarily conservative, showing that their genomes are highly dynamic despite their morphological stasis.

      The study adds to the understanding of how chromosomal rearrangements might contribute to speciation, a concept that can be applied to other taxa.

      Limitations:

      The paper acknowledges that the limited availability of high-quality genomes across molluscs may restrict the scope of comparative analyses. More genomic data from other molluscan groups could strengthen the conclusions.

      The role of high heterozygosity in chitons is highlighted, but more information is needed to clarify how this affects genome assembly and evolutionary outcomes.

      Implications for Future Research:

      The research raises important questions about the relationship between genomic instability and phenotypic stasis, which can inform studies in other animal groups.

      The findings call for a re-evaluation of how we define and measure biodiversity, particularly in "neglected" clades like chitons. Further studies could focus on linking the observed genomic changes to specific adaptive traits or ecological niches.

    3. Reviewer #2 (Public review):

      Summary:

      The authors provide four new annotated genomes for an important taxon within Mollusca known as Polyplacophora (chitons). They provide an impressive analysis showing syntenic relationships between the chromosomes of these four genomes but also other available chiton genome sequences and analysis of 20 molluscan linkage groups to expand this analysis across Mollusca.

      Strengths:

      The authors have selected particular chiton species for genome sequencing and annotation that expand what is known about genomes across portions of chiton phylogenetic diversity lacking genome sequences. The manuscript is well-written and illustrated in a concise manner. The figures are mostly clear, allowing a reader to visually compare the syntenic relationships of chromosomes, especially within chitons. Their phylogenetic analysis provides a simple manner to map important events in molluscan genome evolution. This study greatly expands what is known about molluscan and chiton comparative genomics.

      Weaknesses:

      I am not especially convinced that chitons have experienced more substantial genomic rearrangements or other genomic events than other molluscan classes, and for this reason, I did not personally find the title compelling: "Still waters run deep: Large scale genome rearrangements in the evolution of morphologically conservative Polyplacophora." Are the documented events "large scale genomic rearrangements"? It seems that mostly they found two cases of chromosome fusion, plus one apparent case of whole genome duplication. What do they mean by "Still waters run deep"? I have no idea. I guess they consider chitons to be morphologically conservative in their appearance and lifestyle so they are calling attention to this apparent paradox. However, most chiton genomes seem to be relatively conserved, but there are unexpected chromosome fusion events within a particular genus, Acanthochitona. Likewise, they found a large-scale gene duplication event in Acanthopleurinae, a different subfamily of chitons, which is quite interesting but these seem to be geologically recent events that do not especially represent the general pattern of genome evolution across this ancient molluscan taxon.

    1. eLife Assessment

      This important study reports new insights into the roles of a long noncoding RNA, lnc-FANCI-2, in the progression of cervical cancer induced by a type of human papillomavirus. Through a blend of cell biological, biochemical, and genetic analyses of RNA and protein expression, protein-protein interaction, cell signaling, and cell morphology, the authors provide convincing evidence that lnc-FANCI-2 affects cervical cancer outcome by regulating the RAS signaling pathway. These findings will be of interest to scientists in the fields of cervical cancer, long noncoding RNA, and cell signaling.

    2. Reviewer #1 (Public review):

      Summary:

      The authors attempted to dissect the function of a long non-coding RNA, lnc-FANCI-2, in cervical cancer. They profiled lnc-FANCI-2 in different cell lines and tissues, generated knockout cell lines, and characterized the gene using multiple assays.

      Strengths:

      A large body of experimental data has been presented and can serve as a useful resource for the scientific community, including transcriptomics and proteomics datasets. The reported results also span different parts of the regulatory network and open up multiple avenues for future research.

      Weaknesses:

      The write-up is somewhat unfocused and lacks deep mechanistic insights in some places.

    3. Reviewer #2 (Public review):

      The study by Liu et al provides a functional analysis of lnc-FANCI-2 in cervical carcinogenesis, building on their previous discovery of FANCI-2 being upregulated in cervical cancer by HPV E7.

      The authors conducted a comprehensive investigation by knocking out (KO) FANCI-2 in CaSki cells and assessing viral gene expression, cellular morphology, altered protein expression and secretion, altered RNA expression through RNA sequencing (verification of which by RT-PCR is well appreciated), protein binding, etc. Verification experiments by RT-PCR, Western blot, etc are notable strengths of the study.

      The KO and KD were related to increased Ras signaling and EMT and reduced IFN-γ/α responses.

      Although the large amount of data is well acknowledged, it is a limitation that most data come from CaSki cells, in which FANCI-2 localization is different from SiHa cells and cancer tissues (Figure 1). The cytoplasmic versus nuclear localization is somewhat puzzling.

    4. Reviewer #3 (Public review):

      Summary:

      A long noncoding RNA, lnc-FANCI-2, was previously reported by this group to be regulated by the HPV E7 oncoprotein and a cellular transcription factor, YY1. The current study focuses on the function of lnc-FANCI-2 in HPV-16-positive cervical cancer, which the authors propose is to intrinsically regulate RAS signaling, thereby furthering our understanding of additional cellular alterations during HPV oncogenesis. The authors used advanced technical approaches such as KO, transcriptome, IRPCRP, and LC-MS/MS analyses in the current study and concluded that KO of lnc-FANCI-2 significantly increases RAS signaling, especially phosphorylation of Akt and Erk1/2.

      Strengths:

      (1) HPV E6E7 are required for full immortalization and maintenance of the malignant phenotype of cervical cancer, but they are NOT sufficient for full transformation and tumorigenesis. This study helps further understanding of other cellular alterations in HPV oncogenesis.

      (2) lnc-FANCI-2 is upregulated in cervical lesion progression from CIN1, CIN2-3 to cervical cancer, cancer cell lines, and HPV transduced cell lines.

      (3) Viral E7 of high-risk HPVs and host transcription factor YY1 are two major factors promoting lnc-FANCI-2 expression.

      (4) Proteomic profiling of cytosolic and secreted proteins showed inhibition of MCAM, PODXL2, and ECM1 and increased levels of ADAM8 and TIMP2 in KO cells.

      (5) RNA-seq analyses revealed that KO cells exhibited significantly increased RAS signaling but decreased IFN pathways.

      (6) Increased phosphorylated Akt and Erk1/2, IGFBP3, MCAM, VIM, and CCND2 (cyclin D2) and decreased RAC3 were observed in KO cells.

      Weaknesses:

      (1) Although the authors observed increased lnc-FANCI-2 in HPV-16- and HPV-18-transduced cells, as well as in other cervical cancer tissues, HPV-18-positive HeLa cells exhibited different expression of lnc-FANCI-2.

      (2) Previous studies and data in the current study showed a steady increase in lnc-FANCI-2 during cancer progression; however, the authors did not observe significant changes in cell behaviors (both morphology and proliferation) upon KO of lnc-FANCI-2.

      (3) The authors observed the significant changes of RAS signaling (downstream) in KO cells, but they provided limited interpretations of how these results contributed to full transformation or tumorigenesis in HPV-positive cancer.

    1. eLife Assessment

      In this potentially important study, the authors employed advanced computational techniques to explore a detailed atomistic description of the mechanism and energetics of substrate translocation in the MelB transporter. The overall approach is solid and reveals the coupling between sodium binding and melibiose transport through a series of conformational transitions, and the results for a mutant are also in qualitative agreement with the experiment, providing further support to the computational analyses. Nevertheless, the level of evidence is considered incomplete since there are concerns regarding the convergence and initial guess of the string calculations, leaving doubts that the computed pathway does not reflect the most energetically favorable mechanism.

    2. Reviewer #1 (Public review):

      Summary:

      Liang and Guan have studied the transport mechanism of the melibiose transporter MelB using the string method in collective variables and replica-exchange umbrella sampling simulations. The authors study the mechanism of substrate binding to the outward-facing state, the conformational change of the transporter from outward-facing to inward-facing, and substrate unbinding from the inward-facing state. In their analysis, they also highlight the effects of the D59C mutation and of sodium binding on the substrate transport process.

      Strengths:

      The authors employ a combination of string method and replica-exchange umbrella sampling simulation techniques to provide a complete map of the free energy landscape for sodium-coupled melibiose transport in MelB.

      Weaknesses:

      (1) Free energy barriers appear to be very high for a substrate transport process. In Figure 3, the transitions from IF (Inward facing) to OF (Outward facing) state appear to have a barrier of 12 kcal/mol. Other systems with mutant or sodium unbound have even higher barriers. This does not seem consistent with previous studies where transport mechanisms of transporters have been explored using molecular dynamics.

      (2) Figure 2b: The PMF between images 20-30 shows the conformation change from OF to IF, where the occluded (OC) state is the highest barrier for transition. However, OC state is usually a stable conformation and should be in a local minimum. There should be free energy barriers between OF and OC and in between OC and IF.

      (3) String method pathway is usually not the only transport pathway and alternate lower energy pathways should be explored. The free energy surface looks like it has not deviated from the string pathway. Longer simulations can help in the exploration of lower free energy pathways.

      (4) The conformational change in transporters from OF to IF state is a complicated multi-step process. First, only 10 images in the string pathway are used to capture the transition from OF to IF state. I am not sure if this number is enough to capture the process. Second, the authors have used a geodesic interpolation algorithm to generate the intermediate images. However, looking at Figure 3B, it looks like the transition pathway has not captured the occluded (OC) conformation, where the transport tunnel is closed at both ends. Transporters typically follow a stepwise conformational change mechanism where the OF state transitions to OC and then to the IF state. It appears that the interpolation algorithm has created an hourglass-like state, where IF gates are opening and OF gates are closing simultaneously, thereby creating a state where the transport tunnel is open on both sides of the membrane. These states are usually associated with high energy. References 30-42 cited in the manuscript reveal a distinct OC state for different transporters.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Liang and Guan provides an impressive attempt to characterize the conformational free energy landscape of a melibiose permease (MelB), a symporter member of major facilitator superfamily (MFS) of transporters. Although similar studies have been conducted previously for other members of MFS, each member or subfamily has its own unique features that make the employment of such methods quite challenging. While the methodology is indeed impressive, characterizing the coupling between large-scale conformational changes and substrate binding in membrane transporters is quite challenging and requires a sophisticated methodology. The conclusions obtained from the three sets of path-optimization and free energy calculations done by the authors are generally supported by the provided data and certainly add to our understanding of how sodium binding facilitates the transport of melibiose in MelB. However, the data is not generated reliably which questions the relevance of the conclusions as well. I particularly have some concerns regarding the implementation of the methodology that I will discuss below.

      (1) In enhanced sampling techniques, often much attention is given to the sampling algorithm. Although the sampling algorithm is quite important and this manuscript has chosen an excellent pair: string method with swarms of trajectories (SMwST) and replica-exchange umbrella sampling (REUS) for this task, there are other important factors that must be taken into account. More specifically, the collective variables used and the preparation of initial conformations for sampling. I have objections to both of these (particularly the latter) that I detail below. Overall, I am not confident that the free energy profiles generated (summarized in Figure 5) are reliable, and unfortunately, much of the data presented in this manuscript heavily relies on these free energy profiles.

      (2) The authors state that they have had an advantage over other similar studies in that they had two endpoints of the string to work from experimental data. I agree that this is an advantage. However, this could lead to some dangerous flaws in the methodology if not appropriately taken into account. Proteins such as membrane transporters have many slow degrees of freedom that cannot be fully captured within tens of nanoseconds (90 ns was the simulation time used here for the REUS). Biased sampling allows us to overcome this challenge to some extent, but it is virtually impossible to take into account all slow degrees of freedom in the enhanced sampling protocol (e.g., the collective variables used here do not represent anything related to sidechain dynamics). Therefore, if one mixes initial conformations that come from different initial structures (e.g., an OF state and an IF state from two different PDB files), it is very likely that despite all equilibration and relaxation during SMwST and REUS simulations, the conformations that come from different sources never truly mix. This is dangerous in that it is quite difficult to detect such inconsistencies and from a theoretical point of view it makes the free energy calculations impossible. Methods such as WHAM and its various offshoots all rely on overlap between neighboring windows to calculate the free energy difference between two windows and the overlap should be in all dimensions and not just the ones that we use for biasing. This is related to well-known issues such as hidden barriers and metastability. If one uses two different structures to generate the initial conformations, then the authors need to show their sampling has been long enough to allow the two sets of conformations to mix and overlap in all dimensions, which is a difficult task to do.

      (3) I also have concerns regarding the choice of collective variables. The authors have split the residues in each transmembrane helix into the cyto- and periplasmic sides. Then they have calculated the mass center distance between the cytoplasmic sides of certain pairs of helices and have also done the same for the periplasmic side. Given the shape of a helix, this does not seem to be an ideal choice since rather than the rotational motion of the helix, this captures more the translational motion of the helix. However, the transmembrane helices are more likely to undergo rotational motion than the translational one.

      (4) Convergence: String method convergence data does not show strong evidence for convergence (Figure S2) in my opinion. REUS convergence is also not discussed. No information is provided on the exchange rate or overlap between the windows.

    4. Reviewer #3 (Public review):

      The paper from Liang and Guan details the calculation of the potential of mean force for the transition between two key states of the melibiose (Mel) transporter MelB. The authors used the string method along with replica-exchange umbrella sampling to model the transition between the outward and inward-facing Mel-free states, including the binding and subsequent release of Mel. They find a barrier of ~6.8 kcal/mol and an overall free-energy difference of ~6.4 kcal/mol. They also investigate the same process without the co-transported Na+, finding a higher barrier, while in the D59C mutant, the barrier is nearly eliminated.

      I found this to be an interesting and technically competent paper. I was disappointed actually to see that the authors didn't try to complete the cycle. I realize this is beyond the scope of the study as presented.

      The results are in qualitative agreement with expectations from experiments. Could the authors try to make this comparison more quantitative? For example, by determining the diffusivity along the path, the authors could estimate transition rates.

      Relatedly, could the authors comment on how typical concentration gradients of Mel and Na+ would affect these numbers?

    5. Author response:

      Reviewer 1:

      (1) Free energy barriers appear to be very high for a substrate transport process. In Figure 3, the transitions from IF (Inward facing) to OF (Outward facing) state appear to have a barrier of 12 kcal/mol. Other systems with mutant or sodium unbound have even higher barriers. This does not seem consistent with previous studies where transport mechanisms of transporters have been explored using molecular dynamics. 

      First, in Figure 3, the transition from IF to OF state doesn’t have a barrier of 12 kcal/mol. The IFF to OFB transition is almost barrierless, and from OFB to OFF is ~5 kcal/mol, which is also evident in Figure 2.

      If the reviewer was referring to the transition from OFB to IFB states, the barrier is 6.8 kcal/mol (Na+ bound state), and the rate-limiting barrier in the entire sugar transport process (Na+ bound state) is 8.4 kcal/mol, as indicated in Figure 2 and Table 1, which is much lower than the 12 kcal/mol barrier the reviewer mentioned. When the Na+ is unbound, the barrier can be as high as 12 kcal/mol, but it is this high barrier that leads to our conclusion that the Na+ binding is essential for sugar transport, and the 12 kcal/mol barrier indicates an energetically unfavorable sugar translocation process when the Na+ is unbound, which is unlikely to be the major translocation process in nature. 

      Even for the 12 kcal/mol barrier reported for the Na+ unbound state, it is still not too high considering the experimentally measured MelB sugar active transport rate, which is estimated to be on the order of 10 to 100 s⁻¹. This range of transport rate is typical for similar MFS transporters such as the lactose permease (LacY), which has an active transport rate of 20 s⁻¹. The free energy barrier associated with the active transport is thus on the order of ~15-16 kcal/mol based on transition state theory assuming kBT/h as the prefactor. This experimentally estimated barrier is higher than all of our calculated barriers. Our calculated barrier for the sugar translocation with Na+ bound is 8.4 kcal/mol, which means an additional ~7-8 kcal/mol barrier is contributed by the Na+ release process after sugar release in the IFF state. This is a reasonable estimation of the Na+ unbinding barrier.

      Therefore, whether the calculated barrier is too high depends on the experimental kinetics measurements, which are often challenging to perform. Based on the existing experimental data, the MFS transporters are usually relatively slow in their active transport cycle. The calculated barrier thus falls within the reasonable range considering the experimentally measured active transport rates.
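
      The conversion from a measured turnover rate to an apparent barrier used in this response can be checked with a short calculation. The sketch below assumes the Eyring form k = (kBT/h) exp(-ΔG/RT) with a transmission coefficient of 1 and a temperature of 300 K; the function name is illustrative and not taken from the manuscript.

```python
import math

KB = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)


def eyring_barrier(rate_per_s, temperature_k=300.0):
    """Apparent free energy barrier (kcal/mol) for a rate in s^-1.

    Assumes transition state theory with prefactor kB*T/h and a
    transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = KB * temperature_k / H          # about 6e12 s^-1 at 300 K
    return R_KCAL * temperature_k * math.log(prefactor / rate_per_s)


if __name__ == "__main__":
    # Rates quoted in the response: 10 to 100 s^-1, and 20 s^-1 for LacY.
    for rate in (10.0, 20.0, 100.0):
        print(f"k = {rate:6.1f} s^-1 -> barrier ~ {eyring_barrier(rate):.1f} kcal/mol")
```

      For rates between 10 and 100 s⁻¹ this gives barriers of roughly 15 to 16 kcal/mol, in line with the range quoted above. A smaller effective prefactor, as is often argued for diffusive protein motions, would lower these estimates.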

      (2) Figure 2b: The PMF between images 20-30 shows the conformation change from OF to IF, where the occluded (OC) state is the highest barrier for transition. However, OC state is usually a stable conformation and should be in a local minimum. There should be free energy barriers between OF and OC and in between OC and IF.  

      First, the occluded state (OCB) is not between images 20 and 30; it is between images 10 and 20. Second, there is no solid evidence that the OCB state is a stable conformation and a local minimum. Existing experimental structures of MFS transporters seldom have the fully occluded state resolved.

      (3) String method pathway is usually not the only transport pathway and alternate lower energy pathways should be explored. The free energy surface looks like it has not deviated from the string pathway. Longer simulations can help in the exploration of lower free energy pathways. 

      We agree with the reviewer that the string method pathway is usually not the only transport pathway and alternate lower energy pathways could exist. However, we also note that even if the fully occluded state is a local minimum and our free energy pathway does visit this missing local minimum after improved sampling, the overall free energy barrier will not be lowered from our current calculated value. This is because the current rate-limiting barrier arises from the transition from the OFB state to the IFF state, and the barrier top corresponds to the sugar molecule passing through the most constricted region in the cytoplasmic region, i.e., the IFC intermediate state visited after the IFB state is reached. Therefore, the free energy difference between the OFB state and the IFC state will not be changed by another hypothetical local minimum between the OFB and IFB states, i.e., the occluded OCB state. In other words, a hypothetical local minimum corresponding to the occluded state, even if it exists, will not decrease the overall rate-limiting barrier and may even increase it further, depending on the depth of the local minimum and the additional barriers of entering and escaping from this new minimum. 
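
      The argument that an extra local minimum cannot lower the rate-limiting barrier can be illustrated on a one-dimensional profile. The sketch below uses made-up toy numbers, not values from the manuscript, and defines the forward barrier as the largest climb from the lowest preceding point to any later point along the path.

```python
def rate_limiting_barrier(pmf):
    """Largest rise from the lowest earlier point to a later point.

    `pmf` is a list of free energies (kcal/mol) along the path, ordered
    from reactant to product. This definition is an assumption of the sketch.
    """
    lowest = pmf[0]
    barrier = 0.0
    for value in pmf:
        lowest = min(lowest, value)
        barrier = max(barrier, value - lowest)
    return barrier


# Toy profiles (hypothetical numbers): index 2 stands in for an occluded state.
no_minimum = [0.0, 2.0, 4.0, 3.0, 8.0, 1.0]
shallow_minimum = [0.0, 2.0, 1.0, 3.0, 8.0, 1.0]
deep_minimum = [0.0, 2.0, -1.5, 3.0, 8.0, 1.0]

if __name__ == "__main__":
    for name, profile in [("no minimum", no_minimum),
                          ("shallow minimum", shallow_minimum),
                          ("deep minimum", deep_minimum)]:
        print(name, rate_limiting_barrier(profile))
```

      In this toy model a shallow intermediate minimum leaves the barrier unchanged, and a minimum deeper than the starting state raises it, which is the behavior described in the response. It says nothing about whether a different pathway could have a lower barrier top.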

      (4) The conformational change in transporters from OF to IF state is a complicated multi-step process. First, only 10 images in the string pathway are used to capture the transition from OF to IF state. I am not sure if this number is enough to capture the process. Second, the authors have used a geodesic interpolation algorithm to generate the intermediate images. However, looking at Figure 3B, it looks like the transition pathway has not captured the occluded (OC) conformation, where the transport tunnel is closed at both ends. Transporters typically follow a stepwise conformational change mechanism where the OF state transitions to OC and then to the IF state. It appears that the interpolation algorithm has created an hourglass-like state, where IF gates are opening and OF gates are closing simultaneously, thereby creating a state where the transport tunnel is open on both sides of the membrane. These states are usually associated with high energy. References 30-42 cited in the manuscript reveal a distinct OC state for different transporters.

      In our simulations, even with 10 initial images representing the OF to IF conformational transition, the occluded state is sampled in the final string pathway. There is an ensemble of snapshots where the extracellular and intracellular gates are both relatively narrower than the OF and IF states, preventing the sugar from leaking into either side of the bulk solution. In contrast to the reviewer’s guess, we never observed an hourglass-like state in our simulation where both gates are open. Figure 3B is a visual representation of the backbone structure of the OCB state without explicitly showing the actual radius of the gating region, which also depends on the side chain conformations. Thus, Figure 3B alone cannot be used to conclude that we are dominantly sampling an hourglass-like intermediate conformation instead of the occluded state, as mentioned by the reviewer. 

      Moreover, not all references in 30-42 have sampled the occluded state since many of them did not even simulate the substrate translocation process at all. For the ones that did sample substrate translocation processes, only two of them were studying the cation-coupled MFS family symporter (ref 38, 40) and they didn’t provide the PMF for the entire translocation process. There is no strong evidence for a stable minimum corresponding to a fully occluded state in these two studies.  In fact, different types of transporters with different coupling cations may exhibit different stability of the fully occluded state. For example, the fully occluded state has been experimentally observed for some MFS transporters, such as multidrug transporter EmrD, but not for others, such as lactose permease LacY. Thus, it is not generally true that a stable, fully-occluded state exists in all transporters, and it highly depends on the specific type of transporter and the coupling ion under study. 

      Reviewer 2:

      The manuscript by Liang and Guan provides an impressive attempt to characterize the conformational free energy landscape of a melibiose permease (MelB), a symporter member of major facilitator superfamily (MFS) of transporters. Although similar studies have been conducted previously for other members of MFS, each member or subfamily has its own unique features that make the employment of such methods quite challenging. While the methodology is indeed impressive, characterizing the coupling between large-scale conformational changes and substrate binding in membrane transporters is quite challenging and requires a sophisticated methodology. The conclusions obtained from the three sets of path-optimization and free energy calculations done by the authors are generally supported by the provided data and certainly add to our understanding of how sodium binding facilitates the transport of melibiose in MelB. However, the data is not generated reliably which questions the relevance of the conclusions as well. I particularly have some concerns regarding the implementation of the methodology that I will discuss below. 

      (1) In enhanced sampling techniques, often much attention is given to the sampling algorithm. Although the sampling algorithm is quite important and this manuscript has chosen an excellent pair: string method with swarms of trajectories (SMwST) and replica-exchange umbrella sampling (REUS) for this task, there are other important factors that must be taken into account. More specifically, the collective variables used and the preparation of initial conformations for sampling. I have objections to both of these (particularly the latter) that I detail below. Overall, I am not confident that the free energy profiles generated (summarized in Figure 5) are reliable, and unfortunately, much of the data presented in this manuscript heavily relies on these free energy profiles.

      Since comments (1) and (2) from this review are related, please see our response to (2) below. 

      (2) The authors state that they have had an advantage over other similar studies in that they had two endpoints of the string to work from experimental data. I agree that this is an advantage. However, this could lead to some dangerous flaws in the methodology if not appropriately taken into account. Proteins such as membrane transporters have many slow degrees of freedom that cannot be fully captured within tens of nanoseconds (90 ns was the simulation time used here for the REUS). Biased sampling allows us to overcome this challenge to some extent, but it is virtually impossible to take into account all slow degrees of freedom in the enhanced sampling protocol (e.g., the collective variables used here do not represent anything related to sidechain dynamics). Therefore, if one mixes initial conformations that come from different initial structures (e.g., an OF state and an IF state from two different PDB files), it is very likely that despite all equilibration and relaxation during SMwST and REUS simulations, the conformations that come from different sources never truly mix. This is dangerous in that it is quite difficult to detect such inconsistencies and from a theoretical point of view it makes the free energy calculations impossible. Methods such as WHAM and its various offshoots all rely on overlap between neighboring windows to calculate the free energy difference between two windows and the overlap should be in all dimensions and not just the ones that we use for biasing. This is related to well-known issues such as hidden barriers and metastability. If one uses two different structures to generate the initial conformations, then the authors need to show their sampling has been long enough to allow the two sets of conformations to mix and overlap in all dimensions, which is a difficult task to do.

      We partly agree with the reviewer in that it is challenging to investigate whether the structures generated from the two different initial structures are sufficiently mixed in terms of orthogonal degrees of freedom outside the CV space during our string method and REUS simulations. We acknowledge that our simulations are within 100 ns for each REUS window, and there could be some slow degrees of freedom that are not fully sampled within this timescale. However, the conjectures and concerns raised by the reviewer are somewhat subjective in that they are almost impossible to be completely disproven. In a sense, these concerns are essentially the same as the general suspicion that the biomolecular simulation results are not completely converged, which cannot be fully ruled out for relatively complex biomolecular systems in any computational study involving MD simulations.  We also note that comparison among the PMFs of different cation bound/unbound states will have some error cancellation effects because of the consistent use of the same sampling methods for all three systems. Our main conclusions regarding the cooperative binding and transport of the two substrates lie in such comparison of the PMFs and additionally on the unbiased MD simulations. Thus, although there could be insufficient sampling, our key conclusions based on the relative comparison between the PMFs are more robust and less likely to suffer from insufficient sampling.

      (3) I also have concerns regarding the choice of collective variables. The authors have split the residues in each transmembrane helix into the cyto- and periplasmic sides. Then they have calculated the mass center distance between the cytoplasmic sides of certain pairs of helices and have also done the same for the periplasmic side. Given the shape of a helix, this does not seem to be an ideal choice since rather than the rotational motion of the helix, this captures more the translational motion of the helix. However, the transmembrane helices are more likely to undergo rotational motion than the translational one. 

      Our choice of CVs captures not only the translational motion but also the rotational motion of the helix. Consider a pair of helices. If there is a relative rotation in the angle between the two helices, causing the extracellular halves of the two helices to get closer and the intracellular halves to be more separated, this rotational motion can be captured as a decrease in one CV describing the extracellular distance and an increase in the other CV describing the intracellular distance between the two helices. Conversely, if one of the two CVs is forced to increase and the other one forced to decrease, it can, in principle, bias the relative rotation of the two helices with respect to each other. Indeed, comparing Figure 3 with Figure S4, the reorientation of the helices with respect to the membrane normal (Fig. S4) is accompanied by the simultaneous decrease and increase in the pairwise distances between different segments of the helices. Therefore, our choice of CVs in the string method and REUS is not biased against the rotation of the helices, as the reviewer assumed.
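
      The geometric point made here can be illustrated with a two-dimensional toy model. The sketch below treats each helix as a rigid uniform rod, so that the center of mass of each half lies a quarter length from the midpoint; the helix length, spacing, and tilt angle are hypothetical values chosen only for illustration.

```python
import math


def half_helix_distances(separation, helix_length, tilt_deg, shift=0.0):
    """Return (upper, lower) distances between half-helix centers of mass.

    Helix A is fixed and vertical with its midpoint at the origin.
    Helix B has its midpoint at x = separation + shift and is tilted by
    tilt_deg within the plane (negative tilts its upper end toward A).
    """
    quarter = helix_length / 4.0
    tilt = math.radians(tilt_deg)
    bx = separation + shift
    # Helix A half centers sit at (0, +quarter) and (0, -quarter).
    upper_b = (bx + quarter * math.sin(tilt), quarter * math.cos(tilt))
    lower_b = (bx - quarter * math.sin(tilt), -quarter * math.cos(tilt))
    upper = math.hypot(upper_b[0], upper_b[1] - quarter)
    lower = math.hypot(lower_b[0], lower_b[1] + quarter)
    return upper, lower


if __name__ == "__main__":
    print("reference :", half_helix_distances(10.0, 30.0, 0.0))
    print("tilted    :", half_helix_distances(10.0, 30.0, -20.0))
    print("translated:", half_helix_distances(10.0, 30.0, 0.0, shift=2.0))
```

      In this model a tilt moves the two distances in opposite directions, while a pure translation shifts both by the same amount, so the pair of CVs can distinguish the two motions. Rotation of a helix about its own long axis leaves both centers of mass in place and would not be detected by such distances.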

      (4) Convergence: String method convergence data does not show strong evidence for convergence (Figure S2) in my opinion. REUS convergence is also not discussed. No information is provided on the exchange rate or overlap between the windows.

      The convergence of the string method and REUS, the exchange rate, and the overlap between windows will be discussed in the revised manuscript.
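
      One simple way to report overlap between neighboring umbrella windows is a histogram overlap coefficient along the biased coordinate. The sketch below is a generic illustration with hypothetical sample values and bin settings; it is not the analysis used by the authors, and it only measures overlap along one CV, not in the orthogonal degrees of freedom raised by the reviewer.

```python
def normalized_histogram(samples, lo, hi, n_bins):
    """Fraction of samples falling in each of n_bins equal bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        if lo <= x < hi:
            index = min(int((x - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def overlap_coefficient(samples_a, samples_b, lo, hi, n_bins):
    """Sum over bins of min(p_a, p_b): 1 for identical histograms, 0 for disjoint."""
    p_a = normalized_histogram(samples_a, lo, hi, n_bins)
    p_b = normalized_histogram(samples_b, lo, hi, n_bins)
    return sum(min(a, b) for a, b in zip(p_a, p_b))


if __name__ == "__main__":
    # Hypothetical CV samples from two adjacent windows.
    window_a = [1.05, 1.15, 1.25, 1.35]
    window_b = [1.25, 1.35, 1.45, 1.55]
    print(overlap_coefficient(window_a, window_b, 1.0, 1.6, 6))
```

      Values near zero for any adjacent pair would indicate a gap along the string that reweighting methods such as WHAM cannot bridge reliably.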

      Reviewer 3:

      The paper from Liang and Guan details the calculation of the potential of mean force for the transition between two key states of the melibiose (Mel) transporter MelB. The authors used the string method along with replica-exchange umbrella sampling to model the transition between the outward and inward-facing Mel-free states, including the binding and subsequent release of Mel. They find a barrier of ~6.8 kcal/mol and an overall free-energy difference of ~6.4 kcal/mol. They also investigate the same process without the co-transported Na+, finding a higher barrier, while in the D59C mutant, the barrier is nearly eliminated.

      For Na+ bound state, the rate-limiting barrier is 8.4 kcal/mol instead of 6.8 kcal/mol. The overall free energy difference is 3.7 kcal/mol instead of 6.4 kcal/mol. These numbers need to be corrected in the public review.

      I found this to be an interesting and technically competent paper. I was disappointed actually to see that the authors didn't try to complete the cycle. I realize this is beyond the scope of the study as presented.

      We agree with the reviewer that characterizing the complete cycle is our eventual goal. However, characterizing the complete cycle of the transporter would also require the free energy landscapes of the Na+ binding and unbinding processes in the sugar-bound and unbound states, as well as of the OF to IF conformational transition in the apo state. These additional calculations are expensive, and the amount of work devoted to these new calculations is estimated to be at least the same as the current study. Therefore, we prefer to carry out and analyze these new simulations in a future study.

      The results are in qualitative agreement with expectations from experiments. Could the authors try to make this comparison more quantitative? For example, by determining the diffusivity along the path, the authors could estimate transition rates.

      In our revised manuscript, we will determine the diffusivity along the path and estimate transition rates.

      Relatedly, could the authors comment on how typical concentration gradients of Mel and Na+ would affect these numbers?

      The concentration gradients of Mel and Na+ can be varied in different experimental setups. In a typical active transport assay, the Na+ has a higher concentration outside the cell, and the melibiose has a higher concentration inside the cell. In the steady state, depending on the experimental setup, the extracellular Na+ concentration is in the range of 10-20 mM, and the intracellular concentration is self-balanced in the range of 3-4 mM due to the presence of other ion channels and pumps. In addition to the Na+ concentration gradient, there is also a transmembrane voltage of -200 mV (the intracellular side being more negative than the extracellular side), which facilitates the Na+ release into the intracellular side. In the steady state, the extracellular concentration of melibiose is ~0.4 mM, and the intracellular concentration is at least 1000 times the extracellular concentration, greater than 0.4 M. In this scenario, the free energy change of intracellular melibiose translocation will be increased by ~5 kcal/mol at 300 K, leading to a total ΔG of ~8 kcal/mol. The total barrier for the melibiose translocation is expected to be increased by less than 5 kcal/mol. However, the increase in ΔG for intracellular melibiose translocation will be compensated by a decrease in ΔG of similar magnitude (~5 kcal/mol) for intracellular Na+ translocation. In a typical sugar self-exchange assay, there is no net gradient in the melibiose or Na+ across the membrane, and the overall free energy changes we calculated apply to this situation.
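
      The gradient contributions quoted in this response follow from the standard expressions RT ln(c_in/c_out) for the chemical term and zFV for the electrical term. The sketch below evaluates both at 300 K; the Na+ concentrations of 15 mM outside and 3.5 mM inside are midpoints of the quoted ranges and are an assumption of this illustration, as are the function names.

```python
import math

R_KCAL = 1.987204e-3     # gas constant, kcal/(mol*K)
FARADAY_KCAL = 23.0605   # Faraday constant, kcal/(mol*V)


def chemical_term(c_in, c_out, temperature_k=300.0):
    """Free energy (kcal/mol) of moving one mole from c_out to c_in."""
    return R_KCAL * temperature_k * math.log(c_in / c_out)


def electrical_term(charge, voltage_v):
    """Free energy (kcal/mol) of moving a charge across voltage_v (inside minus outside)."""
    return charge * FARADAY_KCAL * voltage_v


if __name__ == "__main__":
    # Melibiose: 0.4 mM outside, 1000-fold higher inside (values from the response).
    mel = chemical_term(c_in=0.4, c_out=0.0004)
    # Na+: assumed midpoints 15 mM outside, 3.5 mM inside, with -200 mV.
    na = chemical_term(c_in=0.0035, c_out=0.015) + electrical_term(+1, -0.200)
    print(f"melibiose uphill cost : {mel:+.2f} kcal/mol")
    print(f"Na+ downhill gain     : {na:+.2f} kcal/mol")
```

      With exactly a 1000-fold sugar gradient the chemical term is about 4 kcal/mol, rising toward the quoted ~5 kcal/mol for larger gradients, and the Na+ term under these assumptions is about -5.5 kcal/mol, dominated by the membrane voltage.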

    1. eLife Assessment

      This fundamental work provides new mechanistic insight into the regulation of PDGF signaling through splicing controls. The evidence is compelling in demonstrating the functional involvement of Srsf3, an RNA-binding protein, in this new and interesting mechanism. The work will be of broad interest to developmental biologists in general and molecular biologists/biochemists in the field of growth factor signaling and RNA processing.

    2. Reviewer #1 (Public review):

      In their manuscript "PDGFRa signaling regulates Srsf3 transcript binding to affect PI3K signaling and endosomal trafficking" Forman and colleagues use iMEPM cells to characterize the effects of PDGF signaling on alternative splicing. They first perform RNA-seq using a one-hour stimulation with Pdgf-AA in control and Srsf3 knockdown cells. While Srsf3 manipulation results in a sizeable number of DE genes, PDGF does not. They then turn to examine alternative splicing, due to findings from this lab. They find that both PDGF and Srsf3 contribute much more to splicing than transcription. They find that the vast majority of PDGF-mediated alternative splicing depends upon Srsf3 activity and that skipped exons are the most common events with PDGF stimulation typically promoting exon skipping in the presence of Srsf3. They used eCLIP to identify RNA regions bound to Srsf3. Under both PDGF conditions, the majority of peaks were in exons with +PDGF having a substantially greater number of these peaks. Interestingly, they find differential enrichment of sequence motifs and GC content in stimulated versus unstimulated cells. They examine 2 transcripts encoding PI3K pathway (enriched in their GO analysis) members: Becn1 and Wdr81. They then go on to examine PDGFRa and Rab5, an endosomal marker, colocalization. They propose a model in which Srsf3 functions downstream of PDGFRa signaling to, in part, regulate PDGFRa trafficking to the endosome. The findings are novel and shed light on the mechanisms of PDGF signaling and will be broadly of interest. This lab previously identified the importance of PDGF signaling on alternative splicing. The combination of RNA-seq and eCLIP is an exceptional way to comprehensively analyze this effect. The results will be of great utility to those studying PDGF signaling or neural crest biology.

      Comments on the revised version:

      The authors have fully addressed my previous comments and I have no further concerns.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript builds upon the work of a previous study published by the group (Dennison, 2021) to further elucidate the coregulatory axis of Srsf3 and PDGFRa on craniofacial development. The authors in this study investigated the molecular mechanisms by which PDGFRa signaling activates the RNA-binding protein Srsf3 to regulate alternative splicing (AS) and gene expression (GE) necessary for craniofacial development. PDGFRa signaling-mediated Srsf3 phosphorylation drives its translocation into the nucleus and affects its binding affinity to different proteins and RNA, but the exact molecular mechanisms were not known. The authors performed RNA sequencing on immortalized mouse embryonic mesenchyme (MEPM) cells treated with shRNA targeting the 3' UTR of Srsf3 or scramble shRNA (to probe AS and DE events that are Srsf3 dependent) and with and without PDGF-AA ligand treatment (to probe AS and DE events that are PDGFRa signaling dependent). They found that PDGFRa signaling has more effect on AS than on DE. A matching eCLIP-seq experiment was performed to investigate how Srsf3 binding sites change with and without PDGFRa signaling.

      Strengths:

      (1) The work builds well upon the previous data and the authors employ a variety of appropriate techniques to answer their research questions.

      (2) The authors show that Srsf3 binding pattern within the transcript as well as binding motifs change significantly upon PDGFRa signaling, providing a mechanistic explanation for the significant changes in AS.

      (3) By combining RNA-seq and eCLIP datasets together, the authors identified a list of genes that are directly bound by Srsf3 and undergo changes in GE and/or AS. Two examples are Becn1 and Wdr81, which are involved in early endosomal trafficking.

      Weaknesses:

      (1) The authors identify two genes whose AS are directly regulated by Srsf3 and involved in endosomal trafficking; however, they do not validate the differential AS results and whether changes in these genes can affect endosomal trafficking. In Figure 6, they show that PDGFRa signaling is involved in endosome size and Rab5 colocalization, but do not show how Srsf3 and the two genes are involved.

      (2) The proposed model does not account for other proteins mediating the activation of Srsf3 after Akt phosphorylation. How do we know this is a direct effect (and not secondary or tertiary effect)?

      This is a thoroughly revised manuscript. I would like to congratulate the authors on having invested a lot of time, resources, new data, and a more refined discussion to make this a compelling piece of work. I have no further concerns.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In their manuscript "PDGFRa signaling regulates Srsf3 transcript binding to affect PI3K signaling and endosomal trafficking" Forman and colleagues use iMEPM cells to characterize the effects of PDGF signaling on alternative splicing. They first perform RNA-seq using a one-hour stimulation with Pdgf-AA in control and Srsf3 knockdown cells. While Srsf3 manipulation results in a sizeable number of DE genes, PDGF does not. They then turn to examine alternative splicing, due to findings from this lab. They find that both PDGF and Srsf3 contribute much more to splicing than transcription. They find that the vast majority of PDGF-mediated alternative splicing depends upon Srsf3 activity and that skipped exons are the most common events with PDGF stimulation typically promoting exon skipping in the presence of Srsf3. They used eCLIP to identify RNA regions bound to Srsf3. Under both PDGF conditions, the majority of peaks were in exons with +PDGF having a substantially greater number of these peaks. Interestingly, they find differential enrichment of sequence motifs and GC content in stimulated versus unstimulated cells. They examine 2 transcripts encoding PI3K pathway (enriched in their GO analysis) members: Becn1 and Wdr81. They then go on to examine PDGFRa and Rab5, an endosomal marker, colocalization. They propose a model in which Srsf3 functions downstream of PDGFRa signaling to, in part, regulate PDGFRa trafficking to the endosome. The findings are novel and shed light on the mechanisms of PDGF signaling and will be broadly of interest. This lab previously identified the importance of PDGF signaling on alternative splicing. The combination of RNA-seq and eCLIP is an exceptional way to comprehensively analyze this effect. The results will be of great utility to those studying PDGF signaling or neural crest biology. There are some concerns that should be considered, however. 

      We thank the Reviewer for these supportive comments.

      (1) It took some time to make sense of the number of DE genes across the results section and Figure 1. The authors give the total number of DE genes across Srsf3 control and loss conditions as 1,629 with 1,042 of them overlapping across Pdgf treatment. If the authors would add verbiage to the point that this leaves 1,108 unique genes in the dataset, then the numbers in Figure 1D would instantly make sense. The same applies to PDGF in Figure 1F and the Venn diagrams in Figure 2. 

      We have edited the relevant sentence for Figure 1D as follows: “There was extensive overlap (521 out of 1,108; 47.0%) of Srsf3-dependent DE genes across ligand treatment conditions, resulting in a total of 1,108 unique genes within both datasets (Fig. 1C,D; Fig. S1A).” Similarly, we edited the relevant sentence for Figure 1F as follows: “There was limited overlap (4 out of 47; 8.51%) of PDGF-AA-dependent DE genes across Srsf3 conditions, resulting in a total of 47 unique genes within both datasets (Fig. 1E,F; Fig. S1B).” We edited the relevant sentence for Figure 2B as follows: “There was limited overlap (203 out of 1,705; 11.9%) of Srsf3-dependent alternatively-spliced transcripts across ligand treatment conditions, resulting in a total of 1,705 unique events within both datasets (Fig. 2A,B).” Finally, we edited the relevant sentence for Figure 2D as follows: “There was negligible overlap (9 out of 622; 1.45%) of PDGF-AA-dependent alternatively-spliced transcripts across Srsf3 conditions, resulting in a total of 622 unique events within both datasets (Fig. 2C,D).”
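      The counts quoted in this exchange can be reconciled with simple inclusion-exclusion arithmetic. The sketch below is illustrative only and is not taken from the manuscript: the function names are hypothetical, and it assumes that the 1,629 figure quoted by the Reviewer is the sum of the two Srsf3-dependent DE gene lists, so that each of the 521 shared genes is counted twice.

```python
def unique_count(summed_list_sizes: int, shared: int) -> int:
    """Size of the union of two lists, given the sum of their sizes and their overlap."""
    # Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
    return summed_list_sizes - shared


def overlap_percent(shared: int, unique: int) -> float:
    """Overlap expressed as a percentage of the unique (union) set."""
    return 100.0 * shared / unique


# Figure 1D: assumes 1,629 is the summed size of the two gene lists.
fig1d_unique = unique_count(1629, 521)

# Overlap percentages as quoted in the edited sentences (shared, unique).
quoted = {
    "Fig. 1D": (521, 1108),
    "Fig. 1F": (4, 47),
    "Fig. 2B": (203, 1705),
    "Fig. 2D": (9, 622),
}
percentages = {name: overlap_percent(s, u) for name, (s, u) in quoted.items()}

if __name__ == "__main__":
    print("Unique genes for Fig. 1D:", fig1d_unique)
    for name, pct in percentages.items():
        print(f"{name}: {pct:.2f}%")
```

      Under that assumption the union comes out at 1,108 genes, and the four quoted percentages follow from dividing each overlap by the corresponding unique total.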

      (2) The percentage of skipped exons in the +DPSI on the righthand side of Figure 2F is not readable.  

      We have moved the label for the percentage of skipped exon events with a +DPSI for the -PDGF-AA vs +PDGF-AA (scramble) alternatively-spliced transcripts in Figure 2E so that it is legible.

      (3) It would be useful to have more information regarding the motif enrichment in Figure 3. What is the extent of enrichment? The authors should also provide a more complete list of enriched motifs, perhaps as a supplement. 

      We have added P values beneath the motifs in Figure 3F and 3G. Further, we have added a new Supplementary Figure, Figure S5, that lists the occurrence of the top 10 most enriched motifs in the unstimulated and, separately, stimulated samples in the eCLIP dataset and in a control dataset, as well as their P values.

      (4) It is unclear what subset of transcripts represent the "overlapping datasets" on lines 280-315. The authors state that there are 149 unique overlapping transcripts, but the Venn diagram shows 270. Also, it seems that the most interesting transcripts are the 233 that show alternative splicing and are bound by Srsf3. Would the results shown in Figure 5 change if the authors focused on these transcripts? 

      The Reviewer is correct that 233 of the alternatively-spliced transcripts had an Srsf3 eCLIP peak, as indicated in Figure 5A. However, several of these eCLIP peaks were a large distance from an alternatively-spliced element in the rMATS datasets, indicating that Srsf3 binding may not be contributing to the splicing outcomes in these cases. Instead, we correlated the eCLIP peaks with AS events by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns. We have added additional text clarifying this point in the Results: “We next sought to identify high-confidence transcripts for which Srsf3 binding had an increased likelihood of contributing to AS. Previous studies revealed enrichment of functional RBP motifs near alternatively-spliced exons (Yee et al., 2019). As such, we correlated the eCLIP peaks with AS events across all four treatment comparisons by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns (Tables S12-S15).” Further, we have relabeled Figure 5B as “High-confidence, overlapping datasets biological process GO terms”.
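      The proximity criterion described in this response can be pictured as an interval test. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name and coordinates are hypothetical, intervals are taken to be half-open on a single strand, and "within" is interpreted as any overlap between the peak and the exon extended by 250 bp on each side. The actual analysis may have applied a different rule.

```python
def peak_near_exon(peak_start: int, peak_end: int,
                   exon_start: int, exon_end: int,
                   flank: int = 250) -> bool:
    """Return True if a peak overlaps the exon or its flanking intronic window.

    Intervals are half-open, [start, end), on the same chromosome and strand.
    The 250 bp default mirrors the distance quoted in the response; treating
    any overlap as a hit is an assumption made for illustration.
    """
    window_start = max(0, exon_start - flank)
    window_end = exon_end + flank
    # Two half-open intervals overlap when each starts before the other ends.
    return peak_start < window_end and peak_end > window_start


# Hypothetical alternatively-spliced exon spanning positions 1000-1200.
EXON = (1000, 1200)

inside_exon = peak_near_exon(1100, 1150, *EXON)      # peak within the exon
in_flank = peak_near_exon(1300, 1350, *EXON)         # 100 bp into the downstream intron
too_far = peak_near_exon(1450, 1500, *EXON)          # starts exactly 250 bp downstream
upstream_miss = peak_near_exon(700, 750, *EXON)      # ends where the upstream window begins
```

      A filter of this kind would explain why fewer transcripts remain in the high-confidence set than the 233 that carry a peak anywhere in the transcript.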

      (5) In general, there is little validation of the sequencing results, performing qPCR on Arhgap12 and Cep55. The authors should additionally validate the PI3K pathway members that they analyze. Related, is Becn1 expression downregulated in the absence of Srsf3, as would be predicted if it is undergoing NMD? 

      We have added two new figure panels, Figure 5F-5G, assessing Wdr81 AS and Wdr81 protein sizes, as this gene has previously been implicated in craniofacial development. We have added the following text to the Results section: “Finally, as Wdr81 protein levels are predicted to regulate RTK trafficking between early and late endosomes, we confirmed the differential AS of Wdr81 transcripts between unstimulated scramble cells and scramble cells treated with PDGF-AA ligand for 1 hour by qPCR using primers within constitutively-expressed exons flanking alternatively-spliced exon 9. This analysis revealed a decreased PSI for Wdr81 in each of three biological replicates upon PDGF-AA ligand treatment (Fig. 5F). Relatedly, we assessed the ratio of larger isoforms of Wdr81 protein (containing the WD3 domain) to smaller isoforms (missing the WD3 domain) via western blotting. Consistent with our RNA-seq and qPCR results, PDGF-AA stimulation for 24 hours in the presence of Srsf3 led to an increase in smaller Wdr81 protein isoforms (Fig. 5G).”

      (6) What is the alternative splicing event for Acap3?  

      We have added the following text to the Results section and updated Figure 5E with Acap3 eCLIP peak visualization and the predicted alternative splicing outcome: “Finally, Acap3 is a GTPase-activating protein (GAP) for the small GTPase Arf6, converting Arf6 to an inactive, GDP-bound state (Miura et al., 2016). Arf6 localizes to the plasma membrane and endosomes, and has been shown to regulate endocytic membrane trafficking by increasing PI(4,5)P2 levels at the cell periphery (D’Souza-Schorey and Chavrier, 2006). Further, constitutive activation of Arf6 leads to upregulation of the gene encoding the p85 regulatory subunit of PI3K and increased activity of both PI3K and AKT (Yoo et al., 2019)… Srsf3 binding was additionally increased in Acap3 exon 19 upon PDGF-AA stimulation, at an enriched motif within the high-confidence, overlapping datasets, and we observed a corresponding increase in excision of adjacent intron 19 (Fig. 5D,E). As Acap3 intron 19 contains a PTC, this event is predicted to result in more transcripts encoding full-length protein (Fig. 5E).”

      (7) The insets in Figure 6 C"-H" are useful but difficult to see due to their small size. Perhaps these could be made as their own figure panels. 

      We have increased the size of the previous insets in new Figure 6 panels C’’’-H’’’.

      (8) In Figure 6A, it is not clear which groups have statistically significant differences. A clearer visualization system should be used. 

      We have added bracket shapes to Figure 6A indicating the statistically significant differences between scramble 0 minutes and scramble 60 minutes, and between scramble 60 minutes and shSrsf3 60 minutes.

      (9) Similarly in Figure 6B, is 15 vs 60 minutes in the shSrsf3 group the only significant difference? Is there a difference between scramble and shSrsf3 at 15 minutes? Is there a difference between 0 and 15 minutes for either group? 

      We have added a bracket shape to Figure 6B indicating the statistically significant difference between shSrsf3 at 15 minutes and shSrsf3 at 60 minutes. No other pairwise comparisons between treatments or timepoints were statistically significantly different.

      Reviewer #2 (Public Review): 

      Summary: 

      This manuscript builds upon the work of a previous study published by the group (Dennison, 2021) to further elucidate the coregulatory axis of Srsf3 and PDGFRa on craniofacial development. The authors in this study investigated the molecular mechanisms by which PDGFRa signaling activates the RNA-binding protein Srsf3 to regulate alternative splicing (AS) and gene expression (GE) necessary for craniofacial development. PDGFRa signaling-mediated Srsf3 phosphorylation drives its translocation into the nucleus and affects binding affinity to different proteins and RNA, but the exact molecular mechanisms were not known. The authors performed RNA sequencing on immortalized mouse embryonic mesenchyme (MEPM) cells treated with shRNA targeting 3' UTR of Srsf3 or scramble shRNA (to probe AS and DE events that are Srsf3 dependent) and with and without PDGF-AA ligand treatment (to probe AS and DE events that are PDGFRa signaling dependent). They found that PDGFRa signaling has more effect on AS than on DE. A matching eCLIP-seq experiment was performed to investigate how Srsf3 binding sites change with and without PDGFRa signaling. 

      Strengths: 

      (1) The work builds well upon the previous data and the authors employ a variety of appropriate techniques to answer their research questions. 

      (2) The authors show that Srsf3 binding pattern within the transcript as well as binding motifs change significantly upon PDGFRa signaling, providing a mechanistic explanation for the significant changes in AS. 

      (3) By combining RNA-seq and eCLIP datasets together, the authors identified a list of genes that are directly bound by Srsf3 and undergo changes in GE and/or AS. Two examples are Becn1 and Wdr81, which are involved in early endosomal trafficking. 

      We thank the Reviewer for these supportive comments.

      Weaknesses: 

      (1) The authors identify two genes whose AS are directly regulated by Srsf3 and involved in endosomal trafficking; however, they do not validate the differential AS results and whether changes in these genes can affect endosomal trafficking. In Figure 6, they show that PDGFRa signaling is involved in endosome size and Rab5 colocalization, but do not show how Srsf3 and the two genes are involved. 

      We have added two new figure panels, Figure 5F-5G, assessing Wdr81 AS and Wdr81 protein sizes, as this gene has previously been implicated in craniofacial development. We have added the following text to the Results section: “Finally, as Wdr81 protein levels are predicted to regulate RTK trafficking between early and late endosomes, we confirmed the differential AS of Wdr81 transcripts between unstimulated scramble cells and scramble cells treated with PDGF-AA ligand for 1 hour by qPCR using primers within constitutively-expressed exons flanking alternatively-spliced exon 9. This analysis revealed a decreased PSI for Wdr81 in each of three biological replicates upon PDGF-AA ligand treatment (Fig. 5F). Relatedly, we assessed the ratio of larger isoforms of Wdr81 protein (containing the WD3 domain) to smaller isoforms (missing the WD3 domain) via western blotting. Consistent with our RNA-seq and qPCR results, PDGF-AA stimulation for 24 hours in the presence of Srsf3 led to an increase in smaller Wdr81 protein isoforms (Fig. 5G).” The experiments in Figure 6 compare early endosome size, PDGFRa localization in early endosomes and phospho-Akt levels in response to PDGF-AA stimulation in scramble versus shSrsf3 cells, demonstrating that Srsf3-mediated PDGFRa signaling leads to enlarged early endosomes, retention of PDGFRa in early endosomes and increased downstream phospho-Akt signaling. Though we agree with the Reviewer that functionally linking the AS events to the endosomal phenotype would strengthen our conclusions, these are technically challenging experiments for several reasons. First, this approach has typically relied on tiling oligos against a region of interest to find the optimal sequence. We identified several transcripts that are bound by Srsf3 and undergo alternative splicing upon PDGFRa signaling to potentially contribute to the regulation of PI3K signaling and early endosomal trafficking. We do not expect that these effects are mediated by a single transcript but may instead be mediated by a combination of alternative splicing changes. As such, these experiments would require us to identify and validate multiple splice-switching antisense oligonucleotides (ASOs). Second, ASOs designed against a specific target may not lead to alternative splicing of that target, even in cases of high predicted binding affinities (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). Third, ASOs have been shown to result in off-target mis-splicing effects, which are hard to predict (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). The design of functional ASOs is thus a long-standing challenge in the field, and likely beyond the scope of this manuscript. We have added the following text to the Discussion to highlight this potential future direction: “In the future, it will be worthwhile to attempt to functionally link the AS of transcripts such as Becn1, Wdr81 and/or Acap3 to the endosomal trafficking changes observed above using splice-switching antisense oligonucleotides (ASOs).”

      (2) The proposed model does not account for other proteins mediating the activation of Srsf3 after Akt phosphorylation. How do we know this is a direct effect (and not a secondary or tertiary effect)? 

      This point is introduced in the Discussion: “Whether phosphorylation of Srsf3 directly influences its binding to target RNAs or acts to modulate Srsf3 protein-protein interactions which then contribute to differential RNA binding remains to be determined, though findings from Schmok et al., 2024 may argue for the latter mechanism. Studies identifying proteins that differentially interact with Srsf3 in response to PDGF-AA ligand stimulation are ongoing and will shed light on these mechanisms…. Again, this shift could be due to loss of RNA binding owing to electrostatic repulsion and/or changes in ribonucleoprotein composition and will be the subject of future studies.” We have added a potential change in Srsf3 protein-protein interactions upon Akt phosphorylation in the model in Figure 6J.

      Reviewer #2 (Recommendations For The Authors): 

      Suggestions: 

      (1) It would strengthen the paper and improve the connection with the other sections of the paper if the authors show: 

      a)  validation of PDGFRa signaling leading to AS of Becn1 and Wdr81 and corresponding changes in protein, and  

      We have added two new figure panels, Figure 5F-5G, assessing Wdr81 AS and Wdr81 protein sizes, as this gene has previously been implicated in craniofacial development. We have added the following text to the Results section: “Finally, as Wdr81 protein levels are predicted to regulate RTK trafficking between early and late endosomes, we confirmed the differential AS of Wdr81 transcripts between unstimulated scramble cells and scramble cells treated with PDGF-AA ligand for 1 hour by qPCR using primers within constitutively-expressed exons flanking alternatively-spliced exon 9. This analysis revealed a decreased PSI for Wdr81 in each of three biological replicates upon PDGF-AA ligand treatment (Fig. 5F). Relatedly, we assessed the ratio of larger isoforms of Wdr81 protein (containing the WD3 domain) to smaller isoforms (missing the WD3 domain) via western blotting. Consistent with our RNA-seq and qPCR results, PDGF-AA stimulation for 24 hours in the presence of Srsf3 led to an increase in smaller Wdr81 protein isoforms (Fig. 5G).”

      b)  functionally link the AS event(s) to endosomal phenotype using ASOs, etc. 

      Though we agree with the Reviewer that such results would strengthen our conclusions, these are technically challenging experiments for several reasons. First, this approach has typically relied on tiling oligos against a region of interest to find the optimal sequence. We identified several transcripts that are bound by Srsf3 and undergo alternative splicing upon PDGFRa signaling to potentially contribute to the regulation of PI3K signaling and early endosomal trafficking. We do not expect that these effects are mediated by a single transcript but may instead be mediated by a combination of alternative splicing changes. As such, these experiments would require us to identify and validate multiple splice-switching antisense oligonucleotides (ASOs). Second, ASOs designed against a specific target may not lead to alternative splicing of that target, even in cases of high predicted binding affinities (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). Third, ASOs have been shown to result in off-target mis-splicing effects, which are hard to predict (Scharner et al., 2020, Nucleic Acids Res 48(2), 802-816). The design of functional ASOs is thus a long-standing challenge in the field, and likely beyond the scope of this manuscript. We have added the following text to the Discussion to highlight this potential future direction: “In the future, it will be worthwhile to attempt to functionally link the AS of transcripts such as Becn1, Wdr81 and/or Acap3 to the endosomal trafficking changes observed above using splice-switching antisense oligonucleotides (ASOs).”

      (2) The Venn diagram in Figure 5A and the description of the analysis the authors did to combine the RNA-seq and eCLIP-seq data are a little confusing. The authors say that they correlated eCLIP peaks with GE or AS events across all four treatment comparisons. The purpose of looking at both datasets was to find genes that are directly bound by Srsf3 and also have significantly affected GE and/or AS. Therefore, the data with and without PDGF-AA should be considered separately. For example, eCLIP peaks in the -PDGF-AA condition can be correlated to Srsf3-dependent AS differences (comparing shSrsf3 and scramble) in the -PDGF-AA condition, and eCLIP peaks in the +PDGF-AA condition can be correlated to Srsf3-dependent AS differences in the +PDGF-AA condition. In the Venn diagram and the description, it seems like all comparisons were combined and it is not clear how the data were analyzed.

      As indicated in Figure 5A, 233 of the alternatively-spliced transcripts uniquely found in one of the four treatment comparisons had an Srsf3 eCLIP peak. However, several of these eCLIP peaks were a large distance from an alternatively-spliced element in the rMATS datasets, indicating that Srsf3 binding may not be contributing to the splicing outcomes in these cases. Instead, we correlated the eCLIP peaks with AS events by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns. We have added additional text clarifying this point in the Results: “We next sought to identify high-confidence transcripts for which Srsf3 binding had an increased likelihood of contributing to AS. Previous studies revealed enrichment of functional RBP motifs near alternatively-spliced exons (Yee et al., 2019). As such, we correlated the eCLIP peaks with AS events across all four treatment comparisons by identifying transcripts in which Srsf3 bound within an alternatively-spliced exon or within 250 bp of the neighboring introns (Tables S12-S15).” Further, we have relabeled Figure 5B as “High-confidence, overlapping datasets biological process GO terms”. We respectfully disagree with the Reviewer’s suggested comparisons. A comparison of the -PDGF-AA eCLIP data with the scramble vs shSrsf3 (-PDGF-AA) data from the list of high-confidence transcripts resulted in only 7 transcripts. Similarly, a comparison of the +PDGF-AA eCLIP data with the scramble vs shSrsf3 (+PDGF-AA) data from the list of high-confidence transcripts resulted in only 14 transcripts. Separate gene ontology analyses of these lists of 7 and 14 transcripts revealed 21 and 40 significant terms for biological process, respectively, the majority of which encompassed one, and never more than two, transcripts. Had we separately examined the -PDGF-AA and +PDGF-AA data, we would not have detected the changes in Becn1, Wdr81 and Acap3 in Figure 5E.

    1. eLife Assessment

      This valuable manuscript presents a spatiotemporal genetic analysis of malaria-infected individuals from four villages in The Gambia, covering the period between December 2014 and May 2017. Overall, laboratory and data analyses are solid, although details of the methods are lacking. This study offers evidence to advance the understanding of malaria epidemiology in sub-Saharan Africa, but would benefit from additional analysis to strengthen the findings.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript titled "Household clustering and seasonal genetic variation of Plasmodium falciparum at the community-level in The Gambia" presents a valuable genetic spatio-temporal analysis of malaria-infected individuals from four villages in The Gambia, covering the period between December 2014 and May 2017. The majority of samples were analyzed using a SNP barcode with the Spotmalaria panel, with a subset validated through WGS. Identity-by-descent (IBD) was calculated as a measure of genetic relatedness and spatio-temporal patterns of the proportion of highly related infections were investigated. Related clusters were detected at the household level, but only within a short time period.

      Strengths:

      This study offers a valuable dataset, particularly due to its longitudinal design and the inclusion of asymptomatic cases. The laboratory analysis using the Spotmalaria platform combined and supplemented with WGS is solid, and the authors show a linear correlation between the IBD values determined with both methods, although other studies have reported that at least 200 SNPs are required for IBD analysis. Data-analysis pipelines were created for (1) variant filtering for WGS and subsequent IBD analysis, and (2) creating a consensus barcode from the Spotmalaria panel and WGS data and subsequent SNP filtering and IBD analysis.

      Weaknesses:

      Further refining the data could enhance its impact on both the scientific community and malaria control efforts in The Gambia.

      (1) The manuscript would benefit from improved clarity and better explanation of results to help readers follow more easily. Despite familiarity with genotyping, WGS, and IBD analysis, I found myself needing to reread sections. While the figures are generally clear and well-presented, the text could be more digestible. The aims and objectives need clearer articulation, especially regarding the rationale for using both SNP barcode and WGS (is it to validate the approach with the barcode, or is it to have less missing data?). In several analyses, the purpose is not immediately obvious and could be clarified.

      (2) Some key results are only mentioned briefly in the text without corresponding figures or tables in the main manuscript, referring only to supplementary figures, which are usually meant for additional detail, but not main results. For example, data on drug resistance markers should be included in a table or figure in the main manuscript.

      (3) The study uses samples from 2 different studies. While these were conducted in the same villages, their study designs are not the same, which should be addressed in the interpretation and discussion of the results. Between Dec 2014 and Sept 2016, sampling was conducted only in 2 villages and at less frequent intervals than between Oct 2016 and May 2017. The authors should assess how this might have impacted their temporal analysis and the conclusions drawn. In addition, it should be clarified why, and in exactly which analyses, the samples from Dec 2016 - May 2017 were excluded, as this is a large proportion of your samples.

      (4) Based on which criteria were samples selected for WGS? Did the spatiotemporal spread of the WGS samples match the rest of the genotyped samples? I.e. were random samples selected from all times and places, or was it samples from specific times/places selected for WGS?

      (5) The manuscript would benefit from additional detail in the methods section.

      (6) Since the authors only do the genotype replacement and build consensus barcode for 199 samples, there is a bias between the samples with consensus barcode and those with only the genotyping barcode. How did this impact the analysis?

      (7) The linear correlation between IBD-values of barcode vs genome is clear. However, since you do not use absolute values of IBD, but a classification of related (>=0.5 IBD) vs. unrelated (<0.5), it would be good to assess the agreement of this classification between the 2 barcodes. In Figure S6 there seem to be quite some samples that would be classified as unrelated by the consensus barcode, while they have IBD>0.5 in the Genome-IBD; in other words, the barcode seems to be underestimating relatedness.

      a. How sensitive is this correlation to the number of SNPs in the barcode?

      (8) With the sole focus on IBD, a measure of genetic relatedness, some of the conclusions from the results are speculative.

      a. Why not include other measures such as genetic diversity, which relates to allele frequency analysis at the population level (using, for example, nucleotide diversity)? IBD and the proportion of highly related pairs are not a measure of genetic diversity. Please revise the manuscript and figures accordingly.

      b. Additionally, define what you mean by "recombinatorial genetic diversity" and explain how it relates to IBD and individual-level relatedness.

      c. Recombination is one potential factor contributing to the loss of relatedness over time. There are several other factors that could contribute, such as mobility/gene flow, or study-specific limitations such as low numbers of samples in the low transmission season and many months apart from the high transmission samples.

      d. By including other measures such as linkage disequilibrium you could further support the statements related to recombination driving the loss of relatedness.

      (9) While the authors conclude there is no seasonal pattern in the drug-resistant markers, one can observe a big fluctuation in the dhps haplotypes, which go down from 75% to 20% and then up and down again later. The authors should investigate this in more detail, as dhps is related to SP resistance, which could be important for seasonal malaria chemoprophylaxis, especially since the mutations in dhfr seem near-fixed in the population, indicating high levels of SP resistance at some of the time points.

      (10) I recommend that raw data from genotyping and WGS should be deposited in a public repository.
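      The agreement check suggested in point (7) could be sketched as follows. This is an illustration only, not an analysis from the manuscript: the function name and the paired IBD values are hypothetical, the 0.5 threshold is the one quoted in the review, and Cohen's kappa is one possible agreement statistic among several.

```python
from typing import Dict, Sequence


def classification_agreement(barcode_ibd: Sequence[float],
                             genome_ibd: Sequence[float],
                             threshold: float = 0.5) -> Dict[str, float]:
    """Compare related/unrelated calls from barcode IBD and genome IBD.

    A pair is called related when IBD >= threshold. Returns the confusion
    counts, the observed agreement and Cohen's kappa.
    """
    if len(barcode_ibd) != len(genome_ibd) or not barcode_ibd:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(barcode_ibd)
    both_related = both_unrelated = barcode_only = genome_only = 0
    for b, g in zip(barcode_ibd, genome_ibd):
        b_rel, g_rel = b >= threshold, g >= threshold
        if b_rel and g_rel:
            both_related += 1
        elif not b_rel and not g_rel:
            both_unrelated += 1
        elif b_rel:
            barcode_only += 1      # barcode calls related, genome does not
        else:
            genome_only += 1       # genome calls related, barcode does not
    observed = (both_related + both_unrelated) / n
    p_barcode = (both_related + barcode_only) / n
    p_genome = (both_related + genome_only) / n
    # Agreement expected by chance from the marginal call rates.
    expected = p_barcode * p_genome + (1 - p_barcode) * (1 - p_genome)
    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return {
        "both_related": both_related,
        "both_unrelated": both_unrelated,
        "barcode_only": barcode_only,
        "genome_only": genome_only,
        "observed_agreement": observed,
        "kappa": kappa,
    }


# Hypothetical paired IBD values for six sample pairs (not real data).
example = classification_agreement(
    barcode_ibd=[0.9, 0.6, 0.4, 0.1, 0.3, 0.2],
    genome_ibd=[0.95, 0.7, 0.6, 0.05, 0.55, 0.1],
)
```

      In this toy input, a non-zero `genome_only` count with a zero `barcode_only` count would correspond to the pattern the Reviewer describes, in which the barcode underestimates relatedness relative to the genome.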

    3. Reviewer #2 (Public review):

      Summary:

      Malaria transmission in the Gambia is highly seasonal, whereby periods of intense transmission at the beginning of the rainy season are interspersed by long periods of low to no transmission. This raises several questions about how this transmission pattern impacts the spatiotemporal distribution of circulating parasite strains. Knowledge of these dynamics may allow the identification of key units for targeted control strategies, the evaluation of the effect of selection/drift on parasite phenotypes (e.g., the emergence or loss of drug resistance genotypes), and the analysis, through the parasites' genetic nature, of the duration of chronic infections persisting during the dry season. Using a combination of barcodes and whole genome analysis, the authors try to answer these questions by making clever use of the different recombination rates, as measured through the proportion of genomes with identity-by-descent (IBD), to investigate the spatiotemporal relatedness of parasite strains at different spatial (i.e., individual, household, village, and region) and temporal (i.e., high, low, and the corresponding transitions) levels. The authors show that a large fraction of infections are polygenomic and stable over time, resulting in high recombinational diversity (Figure 2). Since the number of recombination events is expected to increase with time or with the number of mosquito bites, IBD allows them to investigate the connectivity between spatial levels and to measure the fraction of effective recombinational events over time. The authors demonstrate the epidemiological connectivity between villages by showing the presence of related genotypes, a higher probability of finding similar genotypes within the same household, and how parasite-relatedness gradually disappears over time (Figure 3). Moreover, they show that transmission intensity increases during the transition from dry to wet seasons (Figure 4). If there is no drug selection during the dry season and if resistance incurs a fitness cost it is possible that alleles associated with drug resistance may change in frequency. The authors looked at the frequencies of six drug-resistance haplotypes (aat1, crt, dhfr, dhps, kelch13, and mdr1), and found no evidence of changes in allele frequencies associated with seasonality. They also find chronic infections lasting from one month to one and a half years with no dependence on age or gender.

      The use of genomic information and IBD analytic tools provides the Control Program with important metrics for malaria control policies, for example, identifying target populations for malaria control and evaluation of malaria control programs.

      Strength:

      The authors use a combination of high-quality barcodes (425 barcodes representing 101 bi-allelic SNPs) and 199 high-quality genome sequences to infer the fraction of the genome with shared Identity by Descent (IBD) (i.e., a metric of recombination rate) over several time points covering two years. Combining barcodes with whole genome sequences allows full use of a large dataset and confident inference of the relatedness of parasite isolates at various spatiotemporal scales.

    4. Reviewer #3 (Public review):

      This study aimed to investigate the impact of seasonality on malaria parasite population genetics. To achieve this, the researchers conducted a longitudinal study in a region characterized by seasonal malaria transmission. Over a 2.5-year period, blood samples were collected from 1,516 participants residing in four villages in the Upper River Region of The Gambia and tested for malaria parasite positivity. The parasites from the positive samples were genotyped using a genetic barcode and/or whole genome sequencing, followed by a genetic relatedness analysis.

      The study identified three key findings:

      (1) The parasite population continuously recombines, with no single genotype dominating, in contrast to viral populations;

      (2) The relatedness of parasites is influenced by both spatial and temporal distances; and

      (3) The lowest genetic relatedness among parasites occurs during the transition from low to high transmission seasons. The authors suggest that this latter finding reflects the increased recombination associated with sexual reproduction in mosquitoes.

      The results section is well-structured, and the figures are clear and self-explanatory. The methods are adequately described, providing a solid foundation for the findings. While there are no unexpected results, it is reassuring to see the anticipated outcomes supported by actual data. The conclusions are generally well-supported; however, the discussion on the burden of asymptomatic infections falls outside the scope of the data, as no specific analysis was conducted on this aspect and it was not stated as one of the aims of the study. Nonetheless, the recommendation to target asymptomatic infections is logical and relevant.

    1. eLife Assessment

      This manuscript describes a novel magnetic steering technique to target human adipose-derived mesenchymal stem cells (hAMSC) or induced pluripotent stem cell derivatives (iPSC-TM) to the TM. The authors present valuable findings that delivery of the stem cells lowered IOP relative to baseline, increased outflow facility, and increased TM cellularity. Although the methods, data, and analysis are solid, there is an overall weakness in the experimental controls, and there are questions around the transgenic mouse model. If these issues are addressed, the manuscript will be significantly improved.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript describes a novel magnetic steering technique to target human adipose-derived mesenchymal stem cells (hAMSC) or induced pluripotent stem cell derivatives (iPSC-TM) to the TM. The authors show that delivery of the stem cells lowered IOP, increased outflow facility, and increased TM cellularity.

      Strengths:

      The technique is novel and shows promise as a therapeutic approach to lower IOP in glaucoma. hAMSC are able to lower IOP below the baseline as well as increase outflow facility above baseline with no tumorigenicity. These data will have a positive impact on the field and will guide further research using hAMSC in glaucoma models.

      Weaknesses:

      The transgenic mouse model of glaucoma that the authors used did not show ocular hypertensive phenotypes at 6-7 months of age as previously reported. Therefore, if there is no pathology in these animals, the authors did not show a restoration of function, but rather a decrease in pressure below normal IOP.

    3. Reviewer #2 (Public review):

      Summary:

      This observational study investigates the efficacy of intracamerally injected human stem cells as a means to re-functionalize the trabecular meshwork for the restoration of intraocular pressure homeostasis. Using a murine model of glaucoma, human adipose-derived mesenchymal stem cells are shown to be biologically safer than induced pluripotent cell derivatives and functionally superior at eliciting a sustained reduction in intraocular pressure (IOP). The authors conclude that the use of human adipose-derived mesenchymal stem cells has the potential for long-term treatment of ocular hypertension in glaucoma.

      Strengths:

      A noted strength is the use of a magnetic steering technique to direct injected stem cells to the iridocorneal angle. An additional strength is the comparison of efficacy between two distinct sources of stem cells: human adipose-derived mesenchymal vs. induced pluripotent cell derivatives. Utilizing both in vivo and ex vivo methodology coupled with histological evidence of introduced stem cell localization provides a consistent and compelling argument for the sustainable impact that exogenous stem cells may have on the re-functionalization of a pathologically compromised TM.

      Weaknesses:

      A noted weakness of the study, as pointed out by the authors, includes the unanticipated failure of the genetic model to develop glaucoma-related pathology (elevated IOP, TM cell changes). While this is most unfortunate, it does temper the conclusion that exogenous human adipose derived mesenchymal stem cells may restore TM cell function. Given that TM cell function was not altered in their genetic model, it is difficult to say with any certainty that the introduced stem cells would be capable of restoring pathologically altered TM function. A restoration effect remains to be seen. Another noted complication to these findings is the observation that sham intracameral-injected saline control animals all showed elevated IOP and reduced outflow facility, compared to WT or Tg untreated animals, which allowed for more robust statistically significant outcomes. Additional comments/concerns that the authors may wish to address are elaborated in the Private Review section.

    4. Reviewer #3 (Public review):

      Summary:

      The purpose of the current manuscript was to investigate a magnetic cell steering technique for efficiency and tissue-specific targeting, using two types of stem cells, in a mouse model of glaucoma. As the authors point out, trabecular meshwork (TM) cell therapy is an active area of research for treating elevated intraocular pressure as observed in glaucoma. Thus, further studies determining the ideal cell choice for TM cell therapy are warranted. The experimental protocol of the manuscript involved the injection of either human adipose-derived mesenchymal stem cells (hAMSCs) or induced pluripotent cell derivatives (iPSC-TM cells) into a previously reported mouse glaucoma model, the transgenic MYOCY437H mouse, and into wild-type littermates, followed by magnetic cell steering. Numerous outcome measures were assessed and quantified, including IOP, outflow facility, TM cellularity, retention of stem cells, and the inner wall BM of Schlemm's canal.

      Strengths:

      All of these analyses were carefully carried out and appropriate statistical methods were employed. The study has clearly shown that the hAMSCs are the cells of choice over the iPSC-TM cells, the latter of which caused tumors in the anterior chamber. The hAMSCs were shown to be retained in the anterior segment over time, and this resulted in increased cellular density in the TM region, a reduction in IOP, and an increase in outflow facility. These are all interesting findings and there are substantial data to support them.

      Weaknesses:

      However, where the study falls short is in the MYOCY437H mouse model of glaucoma that was employed. The authors clearly state that a major limitation of the study is that this model, in their hands, did not exhibit glaucomatous features as previously reported, such as a significant increase in IOP, which was part of the overall purpose of the study. The authors state that it is possible that "the transgene was silenced in the original breeders". The authors did not show PCR, Western blot, or immunostaining of angle tissue from the transgenic mice to confirm transgene expression (increased expression of MYOC was shown in the angle tissue of the transgenics in the original paper by Zode et al., 2011). This should be investigated given that these mice were rederived. Thus, it is clearly possible that these are not transgenic mice. If indeed they are transgenic, the authors may want to consider the fact that in the Zode paper, the most significant IOP elevation in the mutant mice was observed at night, and thus this could be examined by the authors. Other glaucomatous features of these mice, such as loss of RGCs, could also have been investigated to further determine their transgenic phenotype. Finally, while increased cellular density in the TM region was observed, proliferative markers could be employed to determine if the transplanted cells are proliferating.

    1. eLife Assessment

      In this potentially valuable study, the authors employed in vivo experiments and theoretical modeling to study the growth dynamics of nuclear condensates. They observed that condensates can exhibit distinct growth modes, as dictated by the competition between condensate surface tension and local elasticity of chromatin. While the theoretical model appears to capture the experimental observations, the level of evidence supporting the proposed growth mechanism is incomplete due to, among other limitations, the multiple fitting parameters and poorly justified Neo-Hookean elasticity.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript "Interplay of condensate material properties and chromatin heterogeneity governs nuclear condensate ripening" presents experiments and theory to explain the dynamic behavior of nuclear condensates. The authors present experimental data that shows the size of multiple artificially induced condensates as a function of time for various conditions. They identify different dynamic regimes, which all differ from traditional Ostwald ripening. By careful analysis and comparison with a quantitative model, the authors conclude that the elastic effects of the chromatin are relevant and the interplay between (heterogeneous) elasticity and surface tension governs the droplets' behavior. However, since they apply a simple model to a complex system, I think that the work is sometimes prone to over-interpretation, which I detail below. In summary, since droplet growth in a heterogeneous, elastic environment is unavoidable for condensates, this work achieves an important step toward understanding this complex setting. The work will likely stimulate more experiments (using different methods or alternative settings) as well as theory (accounting for additional effects, like spatial correlations).

      Strengths:

      A particularly strong point of the work is the tight integration between experiment and theory. Both parts are explained well at an appropriate level with more details in the methods section and the supplementary information. I cannot comment much on the experiments, but they seem convincing to me and the authors quantify the relevant parameters. Concerning the theory, they derive a model at the appropriate level of description. The analysis of the model is performed and explained well. Even though spatial correlations are not taken into account, the model will serve as a useful basis for developing more complicated models in the future. It is also worth mentioning that the clear classification into different growth regimes is helpful since such results, with qualitative predictions for parameter dependencies, likely also hold in more complex scenarios.

      Weaknesses:

      I think that the manuscript would profit from more precise definitions and explanations in multiple points, as detailed below. Clearly, not all these points can be fully incorporated in a model at this point, but I think it would be helpful to mention weaknesses in the manuscript and to discuss the results a bit more carefully.

      (1) The viscosity analysis likely over-interprets the data. First, the FRAP curves do not show clear exponential behavior. For Figure 1C, there are at least two time scales and it is not clear to me why the shorter time scale right after bleaching is not analyzed. If the measured time scale were based on the early recovery, the differences between the two cases would likely be very small. For Figure 1D, the recovery is marginal, so it is not clear how reliable the measurements are. More generally, the analysis was performed on condensates of very different sizes, which can surely affect the measurements; see https://doi.org/10.7554/eLife.68620 for many details on using FRAP to analyze condensate dynamics. Second, the relaxation dynamics are likely not purely diffusive in a viscous environment since many condensates show elastic properties (https://doi.org/10.1126/science.aaw4951). I could very well imagine that the measured recovery time is related to the viscoelastic time scale. Third, the assumption of the Stokes-Einstein-Sutherland equation to relate diffusivity and viscosity is questionable because of viscoelasticity and the fact that the material is clearly interacting, so free diffusion is probably not expected.

      (2) A large part of the paper is spent on the difference between different dynamic regimes, which are called "fusion", "ripening", and "diffusion-based" (with slightly different wording in different parts). First, I would welcome consistent language, e.g., using either fusion or coalescence. Second, I would welcome an early, unambiguous definition of the regimes. A definition is given at the end of page 2, but this definition is not clear to me: Does the definition pertain to entire experiments (e.g., is something called "fusion" if any condensates fuse at any time in the experiment?), or are these labels used for different parts of the experiment (e.g., would the data in Figure 1H first be classified as "ripening" and then "diffusion-based")? More generally, the categorization seems to depend on the observed system size (or condensate count) and time scale. Third, I find the definition of the ripening time a bit strange since it is clearly correlated with droplet size. Is this dependency carefully analyzed in the subsequent parts?

      (3) The effect of the elastic properties of the chromatin is described by a Neo-Hookean model, but the strains R/ξ used in the theory are of the order of 100, which is huge. At such high strains, the Neo-Hookean model essentially has a constant pressure 5E/6, so the mesh size ξ does not matter. It is not clear to me whether chromatin actually exhibits such behavior, and I find it curious that the authors varied the stiffness E but not the mesh size ξ when explaining the experiments in the last section, although likely both parameters are affected by the experimental perturbations. In any case, https://doi.org/10.1073/pnas.2102014118 shows that non-linear elastic effects related to breakage and cavitation could set in, which might also be relevant to the problem described here. In particular, the nucleation barrier discussed in the later part of the present manuscript might actually be a cavitation barrier due to elastic confinement. In any case, I would welcome a more thorough discussion of these aspects (in particular the large strains).

      (4) The description of nucleation on page 7 is sloppy and might be misleading. First, on first reading, I understood the text to mean that droplets of any radius could nucleate with probability p_nuc related to Eq. 7. This must be wrong since large droplets have ΔG<0, implying p_nuc > 1. Most likely, the nucleation rate only pertains to the critical radius (which is what might be meant by R_0, but it is unclear from the description). In this case, the critical radius and its dependence on parameters should probably be discussed. It might also help to give the value of the supersaturation S in terms of the involved concentrations, and it should be clarified whether P_E depends on R_0 or not (this might also relate to the cavitation barrier raised in point 3 above). Second, it is a bit problematic that E is sampled from a normal distribution, which allows for negative stiffnesses! More importantly, the exact sampling protocol is important since sampling more frequently (in the simulations) leads to a larger chance of hitting a soft surrounding, which facilitates nucleation. I could not find any details on the sampling in the numerical simulations, but I am convinced that it is a crucial aspect. I did find a graphical representation of the situation in Figure S4A, but I think it is misleading since there is no explicit space in the model and stiffnesses are not correlated.
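
      The large-strain saturation raised in point (3) above can be checked numerically. The sketch below assumes the textbook cavity-expansion result for an incompressible neo-Hookean solid, P = E(5/6 − 2/(3λ) − 1/(6λ⁴)) with stretch λ = R/ξ; the manuscript's exact expression is not reproduced in this review, so the formula and the function name are illustrative assumptions rather than the authors' implementation.

```python
def neo_hookean_pressure(stretch, stiffness=1.0):
    """Pressure on a spherical cavity stretched by `stretch` = R/xi.

    Assumes an incompressible neo-Hookean solid with Young's modulus
    `stiffness` (textbook cavity-expansion result, not taken from the
    manuscript under review).
    """
    if stretch < 1.0:
        raise ValueError("stretch R/xi must be >= 1 for an expanding cavity")
    return stiffness * (5.0 / 6.0 - 2.0 / (3.0 * stretch) - 1.0 / (6.0 * stretch**4))


# Fraction of the limiting pressure 5E/6 reached at increasing stretch.
for stretch in (1.0, 2.0, 10.0, 100.0):
    fraction = neo_hookean_pressure(stretch) / (5.0 / 6.0)
    print(f"R/xi = {stretch:6.1f}  ->  P / (5E/6) = {fraction:.3f}")
```

      Under this assumed form, the pressure is already within about one percent of 5E/6 at R/ξ = 100, which is why the mesh size has almost no influence in that regime.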

    3. Reviewer #2 (Public review):

      Summary:

      The authors used a chemical linker to induce phase separation in U2OS cell nuclei with two different proteins, a coiled-coil protein (Mad1) and a disordered domain (from LAF-1), whose condensates were purported to have different material properties. First, they performed Fluorescence Recovery After Photobleaching (FRAP) and estimated the viscosity via the Stokes-Einstein equation. Combined with droplet fusion assays, this yielded an estimate of the surface tension, wherein the disordered condensates were found to have 130 times higher surface tension than the coiled-coil condensates. Confocal fluorescence microscopy was used to follow condensates over time, enabling classification of growth events as either fusion-, ripening-, or diffusion-based, and subsequent comparison of the relative abundances of these growth events between the two condensate types. Coiled-coil condensates grew primarily by diffusive processes, whereas disordered condensates grew primarily by ripening processes. The coarsening rates were described by growth exponents extracted from power-law fits of average normalized condensate radius over time. In both cases, these growth exponents were smaller than those predicted by theory, leading the authors to propose that nuclear condensate growth is generally suppressed by chromatin mechanics, as found in previous studies albeit with different exponents. The authors developed a theory to understand how the extent of this effect may depend on condensate material properties like surface tension. Treating chromatin as a neo-Hookean elastic solid, the authors assume a form of mechanical pressure that plateaus with increasing condensate size, and the resulting theory is used to analyze the observed condensate growth dynamics. A linearized extension of the theory is used to distinguish between suppressed, elastic, and Ostwald ripening. 
Finally, the authors consider the impact of different chromatin environments on condensate growth patterns and dynamics, which is achieved experimentally with another cell type (HeLa) and with a drug that decondenses chromatin (TSA). They find that condensate growth patterns are not significantly changed in either condensate type, but that the number of condensates nucleated and their related growth exponent are more sensitive to variations in chromatin stiffness in the coiled-coil system due to its low surface tension.
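
      The growth exponents mentioned in this summary come from power-law fits of mean condensate radius against time. A minimal sketch of such a fit is given below, assuming a pure power law ⟨R⟩ = A·t^β fitted by least squares in log-log space; the authors' actual fitting procedure (normalization, offsets, weighting) is not described here, and the synthetic data are purely illustrative. Classical Ostwald ripening corresponds to β = 1/3.

```python
import math


def fit_growth_exponent(times, radii):
    """Least-squares fit of log(radius) = log(A) + beta * log(time).

    Returns (beta, A). Assumes strictly positive times and radii.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(r) for r in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    prefactor = math.exp(mean_y - beta * mean_x)
    return beta, prefactor


# Synthetic (hypothetical) data: ideal Ostwald ripening vs. suppressed growth.
times = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
ostwald_radii = [t ** (1.0 / 3.0) for t in times]
suppressed_radii = [t ** 0.1 for t in times]

beta_ostwald, _ = fit_growth_exponent(times, ostwald_radii)
beta_suppressed, _ = fit_growth_exponent(times, suppressed_radii)
print(f"Ostwald-like exponent:    {beta_ostwald:.3f}")
print(f"Suppressed-like exponent: {beta_suppressed:.3f}")
```

      A fitted exponent well below 1/3, as in the second synthetic series, is the kind of signature the authors interpret as growth suppression by chromatin mechanics.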

      Strengths:

      This work provides evidence that nuclear condensates can coarsen not only by fusion but also by continuous diffusive growth processes, predominant in coiled-coil condensates, and ripening, predominant in disordered condensates. Across these different condensate types and coarsening mechanisms, the authors find growth exponents lower than theoretical expectations, reinforcing the notion that elastic media can suppress condensate growth in the nucleus. Combined with theory, these observed differences in growth patterns and rates are argued to originate from differences in material properties, namely, surface tension relative to local chromatin stiffness. The authors further suggest that the few ripening events that are seen in coiled-coil condensates may be elastic in nature due to gradients in chromatin stiffness as opposed to Ostwald ripening. If this assertion proves to be robust, it would mark an early observation of elastic ripening in living cells.

      Weaknesses:

      (1) The assertion that nuclear condensates experience an external pressure from the chromatin network implies that chromatin should be excluded from the condensates (Nott et al., Molecular Cell (2015); Shin et al., Cell (2018)). This has not been shown or discussed here. While Movie 1 suggests the coiled-coil condensates may exclude chromatin, Movie 2 suggests the disordered condensates do not. LAF-1, as an RNA helicase, interacts with RNA, and RNA can be associated with chromatin in the nucleus. RNA can also modulate droplet viscosity. The authors' analysis of the disordered condensate data only makes sense if these condensates exclude chromatin, which they have not demonstrated, and which appears not to be the case.

      (2) Critical physical parameters like viscosity and surface tension have not been directly measured but rather are estimated indirectly using FRAP and the Stokes-Einstein equation. While not uncommon in the field, this approach is flawed, as droplet viscosity is not simply determined by the size of the composing particles. Rather, in polymeric systems, viscosity strongly depends on the local protein concentration and intermolecular interactions (Rubinstein & Semenov, Macromolecules (2001)). This unjustified approach propagates to the surface tension estimate since only the ratio of viscosity to surface tension is explicitly measured. Since the paper's conclusions strongly hinge on the magnitude of the surface tension, a more accurate estimate or direct measurement of this salient material property is called for.

      (3) The phase diagram of growth modes very much depends on the assumption of neo-Hookean elasticity of the chromatin network. This assumption is poorly justified and calls into question the general conclusions about possible growth phases. The authors need to either provide evidence for neo-Hookean elasticity, or, alternatively, consider a model in which strain stiffening or thinning continues as droplets grow, which would likely lead to very different conclusions, and acknowledge this uncertainty.

      (4) There is limited data for the elastic ripening claim. In Figure 3E, only one data point resides in the elastic ripening (δ < 0) range, with a few data points very close to zero.

      (5) The authors claim that "our work shows that the elastic chromatin network can stabilize condensates against Ostwald ripening but only when condensate surface tension is low." This claim also depends on the details of the chosen neo-Hookean model of chromatin elasticity, and it is not studied here whether these results are robust to other models.

      (6) It is also not clear how the total number of Mad1 proteins and LAF-1 disordered regions change while the condensates evolve with time. As the experiments span longer than 6 hours, continued protein production could lead to altered condensate coarsening dynamics. For example, continued production of Mad1 can lead to the growth of all Mad1 condensates, mimicking the diffusive growth process.
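
      Point (2) above concerns how an error in the viscosity estimate propagates to the surface tension. The sketch below illustrates that chain under the simplest assumptions: the Stokes-Einstein relation D = k_B·T/(6πηr) for a purely viscous fluid, and a fusion assay that yields only the ratio γ/η. All numerical values and function names are hypothetical and are not taken from the manuscript.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_einstein_viscosity(diffusivity, particle_radius, temperature=310.15):
    """Effective viscosity (Pa*s) from D = k_B * T / (6 * pi * eta * r).

    Valid only for free diffusion in a purely viscous medium, which is
    exactly the assumption the reviewers question.
    """
    return K_B * temperature / (6.0 * math.pi * diffusivity * particle_radius)


def surface_tension_ratio(viscosity_ratio, capillary_velocity_ratio):
    """gamma_a / gamma_b from eta_a / eta_b and (gamma/eta)_a / (gamma/eta)_b."""
    return viscosity_ratio * capillary_velocity_ratio


# Hypothetical FRAP-derived diffusivities (m^2/s) for two condensate types.
radius = 2.0e-9  # assumed hydrodynamic radius of the tagged protein, in m
eta_a = stokes_einstein_viscosity(1.0e-13, radius)
eta_b = stokes_einstein_viscosity(1.0e-12, radius)
viscosity_ratio = eta_a / eta_b

# Hypothetical ratio of capillary velocities (gamma/eta) from fusion assays.
gamma_ratio = surface_tension_ratio(viscosity_ratio, 2.0)
print(f"eta_a / eta_b     = {viscosity_ratio:.1f}")
print(f"gamma_a / gamma_b = {gamma_ratio:.1f}")
```

      Because the surface tension ratio is the product of the two measured ratios, any bias in the Stokes-Einstein viscosity carries over to the surface tension with the same multiplicative factor.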

    4. Author response:

      We appreciate the reviewers’ recognition of the strengths of our work as well as their constructive critiques and insightful suggestions for improvement. In this provisional response, we outline how we plan to address the reviewers’ comments in the revised manuscript.

      (1) Viscosity and surface tension are not accurately measured. 

      We thank the reviewers for bringing up this important point. We are aware that FRAP is not the best method to accurately measure condensate viscoelasticity due to the problems the reviewers and others in the field have pointed out. More accurate methods of measuring fluorescent protein mobility, such as single-molecule tracking or fluorescence correlation spectroscopy, can be used; however, they cannot accurately reflect the time scale dependence of viscoelasticity in the condensate either. Other methods such as rheology and micropipette aspiration that have been used to measure condensate viscoelasticity in vitro are not accessible in living cells yet. Similarly, there is no readily available method to directly measure the surface tension of condensates in live cells. Therefore, we used FRAP and fusion assays to estimate the ratio of surface tension between the two condensates. This ratio was then used to determine the surface tension of the coiled coil condensates in the model after estimating the surface tension for disordered condensate from in vitro measurements (https://doi.org/10.1016/j.bpr.2021.100011). In the revision, we will adjust our FRAP fitting and use condensates with similar sizes to make our FRAP data more accurate. However, based on the large difference we observed for these two condensates, we do not believe these FRAP improvements would change the conclusions. 

      We are also aware that the Stokes-Einstein relation strictly applies to purely viscous systems. One can apply the generalized Stokes-Einstein relation, which links the diffusion coefficient to the complex viscoelastic modulus of the medium. However, the complex modulus is difficult to determine in cells through live imaging. We thus used the Stokes-Einstein relation to estimate the ratio of effective viscosities, assuming elastic deformations relax faster. In the revision, we will add these assumptions to our discussion.

      (2) Justification of a Neo-Hookean elasticity model for chromatin. 

      We thank the reviewer for highlighting this important aspect of our work. The observation that the strains R/ξ in our initial model are of the order of 100 is valid and raises questions about the applicability of the Neo-Hookean model. While it is true that at such high strains, the pressure becomes nearly constant (5E/6), our model remains applicable within the range of strains relevant to chromatin, particularly for small droplets where R/ξ values are more moderate. This is explicitly considered in the section “Effect of mechanical heterogeneity on condensate nucleation and growth,” where we also account for heterogeneous mesh sizes correlated with local stiffness. While these points are discussed in the supplementary material, we acknowledge that these details are not clearly presented in the main text, and we will revise the manuscript to explicitly discuss the strain regime and model applicability.

      We agree that varying both the stiffness E and mesh size ξ would provide a more comprehensive understanding of the system, as both parameters are likely affected by experimental perturbations. We will revisit our analysis to incorporate variations in ξ alongside E and discuss the potential effects on our results.

      Furthermore, the stabilization of condensate size by chromatin elasticity arises from the size-dependent pressure exerted by the elastic network, which is a feature of strain-stiffening elastic media rather than a specific property of the Neo-Hookean model. However, we agree that exploring the robustness of our results under alternative elasticity models would strengthen the manuscript. In the revised version, we will analyze additional elasticity models, including strain stiffening and thinning, to evaluate how these might influence our conclusions and to provide a broader context for the predicted growth phases.

      The connection between the nucleation barrier and the cavitation barrier is particularly intriguing. The referenced study (https://doi.org/10.1073/pnas.2102014118) highlights non-linear elastic effects, including breakage and cavitation, which may be relevant in our system. We will explore whether cavitation effects due to elastic confinement play a role in the nucleation dynamics observed here and include a discussion of these mechanisms in the revised manuscript.

      (3) Unclear description of nucleation in the model. 

      We thank the reviewer for pointing out the lack of clarity in our description of nucleation. R_0 represents the critical radius for nucleation, beyond which droplets grow spontaneously. The nucleation probability p_nuc is evaluated at R_0, which depends on the free energy barrier ΔG, supersaturation S, and the elastic properties of the surrounding medium. We will include a clearer explanation of R_0, its dependence on parameters, and its role in nucleation in the revised manuscript.

      We ensure that the stiffness is sampled from a truncated normal distribution, preventing negative stiffness values. Sampling is performed at fixed intervals, and we will clarify the protocol to avoid bias and ensure consistency in the simulations.

      Supersaturation S will be defined in terms of solute and solvent concentrations, and we will discuss its influence on ΔG and R_0.

      The dependence of the elastic pressure P_E on R_0, with stiffer surroundings leading to smaller nucleated droplets, will be explicitly clarified. We also agree that Figure S4A may be misleading, as it suggests spatial correlations in stiffness. We will revise the figure and caption to better represent the model assumptions.
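
      The truncated sampling described in this section, together with the reviewer's concern about sampling frequency, can be sketched as follows. This is a minimal illustration using rejection sampling from a normal distribution; the authors' actual sampling protocol is not given here, and the function names, parameter values, and threshold probability are hypothetical.

```python
import random


def sample_truncated_normal(mean, std, lower=0.0, rng=random, max_tries=10_000):
    """Draw from a normal distribution, rejecting values at or below `lower`."""
    for _ in range(max_tries):
        value = rng.gauss(mean, std)
        if value > lower:
            return value
    raise RuntimeError("no accepted sample; check mean, std, and lower bound")


def prob_soft_hit(p_soft, n_draws):
    """Chance that at least one of `n_draws` independent draws is 'soft'."""
    return 1.0 - (1.0 - p_soft) ** n_draws


rng = random.Random(0)  # fixed seed so the sketch is reproducible
stiffnesses = [sample_truncated_normal(1.0, 1.0, rng=rng) for _ in range(1000)]
print(f"minimum sampled stiffness: {min(stiffnesses):.4f}")
print(f"mean sampled stiffness:    {sum(stiffnesses) / len(stiffnesses):.4f}")

# With a hypothetical 5% chance per draw of a soft surrounding, more frequent
# sampling makes a nucleation-friendly environment increasingly likely.
for n in (1, 10, 100):
    print(f"{n:4d} draws -> P(soft hit) = {prob_soft_hit(0.05, n):.3f}")
```

      Truncation removes negative stiffnesses but also shifts the mean of the sampled values upward, and the probability of encountering a soft surrounding still grows with the number of draws, so the sampling interval remains a model parameter worth reporting.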

      (4) Limited data for the elastic ripening claim.

      We acknowledge the reviewer’s concern regarding the limitation of support for the claim in the current manuscript. We believe our data do indicate elastic ripening. Particularly, the data points very close to zero are not necessarily artifacts of the fitting, as the elastic ripening can be very slow due to small differences in the local stiffness values around the droplets. We have mentioned this at the end of the section “Condensate material properties and chromatin heterogeneity determine the modes of ripening”. We shall revisit these results and remedy this concern with more data and analysis in the revised manuscript. 

      (5) Confusion for dynamic regimes such as "fusion", "ripening", and "diffusion-based" and the problem with using “ripening time” to compare ripening speed.

      We will clarify our definitions of the dynamic regimes and ensure consistent language use. The ripening time was defined as the time a droplet takes to shrink per unit length. In this way, the size dependence of the absolute ripening time is removed, so the quantity can be used to compare the speed of ripening between two condensates. This is not well-explained in our current version. In the revision, we will redefine the normalized ripening time to avoid this confusion.

      (6) Chromatin should be excluded from the condensates 

      We have data to support that chromatin is excluded from the condensates. We will add the data in the revision. 

      (7) Effect of protein production on the diffusive growth process.

      From the experiment, we do not believe that protein production is a significant source of the diffusive growth because for coiled-coil condensates nucleated with Hotag3 there was little diffusive growth. In the model as well, condensates can grow for hours in the absence of protein production, depending on chromatin stiffness and surface tension. We aim to address the effect of protein production on growth in the revised manuscript.

    1. eLife Assessment

      This study presents important advances in the discovery and assessment of microcins that improve our understanding of their prevalence and roles. The bioinformatics analysis, expression, and antimicrobial assays are solid, although the diverging evaluations also indicated the need for additional support regarding the sequence analysis and validation to fully back some of the claims and conclusions. This study will appeal to researchers working on the discovery and analysis of novel peptide natural products.

    2. Reviewer #1 (Public review):

      Summary:

      Enterobacteriaceae produce microcins to target their competitors. Using informatics approaches, the authors identified 12 new microcins. They expressed them in E. coli, demonstrating that the microcins have antimicrobial activity against other microbes, including plant pathogens and the ESKAPE pathogens Pseudomonas aeruginosa and Acinetobacter baumannii.

      Strengths:

      Overall, this study has the merit of identifying new potential antimicrobial molecules that could be used to target important pathogens. The bioinformatics analysis, the expression system used, and the antimicrobial assays performed are solid, and the data presented are convincing. This work will set the basis for new studies to investigate the potential role of these microcins in vivo.

      Weaknesses:

      The work has been performed in vitro, which is a valid approach for identifying the antimicrobial peptides and assessing their antimicrobial activity. Future studies will need to address whether these new microcins exhibit antimicrobial activity in vivo (e.g., in the context of infection models), and to identify the targets (receptor and mechanisms of action) for the new microcins.

    3. Reviewer #2 (Public review):

      Mortzfeld et al. describe their study of class IIb microcins. Furthering our awareness of the presence and action of microcins is an important line of research. However, several issues related to the premise, sequence analysis, and validation require attention to support the claims.

      (1) Previous studies have been published on the broader distribution of microcins across bacteria. The software has been published for their identification. Comparison to this software and/or discussion of previous work should be included to place this work in the context of the field.

      (2) It is not clear how immunity proteins were identified and there does not appear to be functional confirmation to show these predicted immunity proteins are real. Thus, it is premature to state that immunity genes have been found. This may also confound some of the validation studies below if proper immunity proteins have not been included.

      (3) Please show the nt alignment used to generate the tree. Without seeing it, one would guess that the sequences are either quite similar (making the results from this study less novel) or there would be concerns that the phylogenetic relationship derived from the nt alignment is spurious.

      (4) Figure 1 B-C: There are numerous branches that do not have phylogenetic support (values <50%). These are not statistically valid phylogenetic relationships and should be collapsed. The resulting tree should be used in the description of clades.

      (5) The discovered microcins are not being directly tested since they are expressed heterologously and are reliant on non-native modification systems. The results present the statement that novel microcins have been validated. This should be described accordingly.

      (6) The key finding of this paper is the claim that 12 novel class IIb microcins have been validated. To substantiate this claim, original images showing evidence of antibacterial activity must be made available rather than a presence/absence chart. The negative controls for this table are unclear and should be included with the original images.

      (7) Further data for the purified microcin are needed. The purification method described is standard practice and should allow for product quantification, which should be included. Standard practice includes an SDS-PAGE gel showing the purity of the microcin, or at least the TEV digest to show microcin has been produced, and importantly a control sample (scrambled sequence, empty vector purification, etc.) to show that observed activity (Figure 2B) is not from purification carryover. These data should be included to support that microcin has been purified and is active.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, several novel class IIb microcin biosynthetic gene clusters have been discovered by specific homology searches and manual curation. Using a specific E. coli expression system, the microcins were expressed and conjugated to monoglycosylated enterobactin as siderophore moiety. While this synthetic biology approach cannot account for other siderophores being coupled to the microcin core peptide in the original producing strains, it nonetheless allows for a general screening for the activity of the heterologously produced compounds. Through this approach, the activity of several predicted microcins has been confirmed and three novel class IIb microcin clades were identified.

      Strengths:

      The experimental design is sound, the results are corroborated by suitable controls, and the findings have a high level of novelty and significance. Furthermore, the comments of the initial round of peer review have been answered satisfactorily by the authors.

    5. Author response:

      We thank the anonymous reviewers very much for dedicating their time to thoroughly review our manuscript. We sincerely appreciate their thoughtful consideration and detailed assessment. Regarding the raised concerns, we acknowledge the importance of exploring the full scope of class IIb microcins; however, we believe that in-depth characterization, purification, and in vivo application of the 12 novel compounds go beyond the scope of this short report and discovery article.

      At the same time, the reviewers acknowledge that the analysis, experimental design, the expression system as well as the performed assays are “sound”, “convincing”, and “corroborated by suitable controls”. In the present manuscript we sought to identify novel antimicrobials and to comprehensively verify their antimicrobial activity in E. coli irrespective of the siderophore-dependent delivery mechanism. Notably, none of the reviewers questioned that we describe new antimicrobials, the characteristics we used to find them, that they are class IIb microcins, or that they do exhibit antimicrobial activity against Gram-negative ESKAPE and plant pathogens.

      We believe that our discovery study can serve as a stepping stone towards the application of bacterially produced antimicrobial compounds to target Gram-negative pathogens in numerous plant and animal species, including humans.

    1. Author response:

      Our response to Reviewer #1:

      We appreciate the reviewer’s comments to clarify the strengths and weaknesses of our work. Whether the effect of GM-CSF/IL-3 on the bowel is pro-inflammatory or anti-inflammatory has been controversial. In the present study, we have shown that CD131 mediated a pro-inflammatory effect of GM-CSF on the intestine, which may have worked in synergy with tissue-infiltrating macrophages. While its downstream signaling has been investigated back and forth, we did not put effort into it. Using macrophage-specific CD131-deficient animals is important to clarify the effects of macrophage-specific CD131 on bowel inflammation. Our present work is indeed incomplete, and we anticipate working on it further in future research. Concerning the results on human subjects, it is indeed the case that results from animal experiments were not completely reproduced. We believe that CD131 does have an effect on ulcerative colitis; however, due to the use of biological agents (e.g. anti-TNFs), the need for surgery in the treatment of ulcerative colitis has dramatically decreased and we could not get enough samples to reach a more convincing statistical analysis. The twenty-nine patients shown in the present study were all those who received surgical intervention at our center during the past decade, and more human subjects will be needed in future research, possibly from a multi-center study.

      Our response to Reviewer #2:

      We greatly appreciate the reviewer’s valuable comments and suggestions. We realized that the number of animals per group was not indicated in each figure; in order to clarify the experimental rigor, we have deposited the data used to generate the results of the present study in Dryad. Concerning the heterozygous CD131 knock-out animals, we think that others have used the homozygous mice in their studies; however, we observed premature deaths in those animals and we could not obtain a single homozygous mouse. We could not tell the exact reason, but we did observe robust phenotypes in these heterozygous mice. We do realize that our present work is incomplete, and more experiments need to be done to establish a causal relationship between CD131 and downstream effects. We anticipate using macrophage-specific homozygous CD131-deficient mice in our future research, which we believe will produce more meaningful and convincing results.

    2. eLife Assessment

      Ulcerative colitis (UC) is a chronic gut inflammatory condition affecting the colon in humans. This study uses human samples as well as a mouse model of colitis induced by a chemical, DSS, to investigate the role of an immune marker, CD131, in UC pathogenesis. The study, as presented, is incomplete, as experimental details are lacking, the statistical analyses are deficient, and there is not yet direct evidence for a CD131-mediated mechanism of gut inflammation.

    3. Reviewer #1 (Public review):

      Summary:

      This study investigates the role of CD131, a receptor subunit for GM-CSF and IL-3, in ulcerative colitis pathogenesis using a DSS-induced murine colitis model. By comparing wild-type and CD131-deficient mice, the authors demonstrate that CD131 contributes to DSS-induced colitis, working in concert with tissue-infiltrating macrophages.

      Strengths:

      The research shows that CD131's influence on macrophage and T cell chemotaxis is mediated by CCL4. The authors conclude by proposing a pro-inflammatory role for CD131 in murine colitis and suggest potential clinical relevance in human inflammatory bowel disease.

      Weaknesses:

      The statistical association between increased CD131 expression and clinical IBD was not observed in Table 1, indicating that the main results from animal experiments were not reproduced in human subjects. Additionally, due to the absence of experimental results regarding the downstream signaling pathways through CD131, it is difficult to infer the precise differentiated outcomes of this study. Furthermore, the effects of CD131 on immune cells other than macrophages were not presented, and the results specific to macrophage-selective CD131 were not shown. Therefore, I conclude that it is challenging to provide a detailed review as there is a lack of supporting evidence for the core arguments made in this paper.

    4. Reviewer #2 (Public review):

      Summary:

      This study investigates the potential role of CD131, a cytokine receptor subunit shared by GM-CSF and IL-3, in intestinal inflammation. Using heterozygous mice with an inactivating mutation in this gene, the study demonstrates ameliorated inflammation, associated with less infiltration of macrophages. Moreover, the depletion of macrophages prevented many of the inflammatory effects of DSS and made both WT and mutant mice equivalent in terms of inflammation severity. Correlative data showing increased CD131+ cells in tissues of patients with ulcerative colitis are also presented, providing evidence for the plausibility of these pathways in human disease.

      Strengths:

      The phenotype of mutant mice seems quite robust and the pathways proposed, GM-CSF signaling in macrophages with CCL4 as a downstream pathway, are all plausible and concordant with existing models. Many of the experiments included meaningful endpoints and were overall well performed.

      Weaknesses:

      (1) Experimental rigor was lacking in this manuscript, which provided limited or no details on the number of independent iterations that each experiment was done, the number of animals per group, the number of technical or biological replicates in each graph, etc.

      (2) Details of animal model validation showing that this particular mutant allele results in a lack of CD131 protein expression were not shown. Moreover, since the paper uses heterozygous mice, it is critical to show that at the protein level, there is indeed reduced expression of CD131 in het mice compared to controls (many heterozygous states do not lead to appreciable protein depletion).

      (3) Another major weakness is that the paper asserts a causal relationship between CD131 signaling and CCL4 production: the data shown indicates that the phenotypes of CCL4 deficiency (through Ab blockade) and CD131 partial deficiency (in het mice) are similar. However, this does not establish that CD131 signaling acts through CCL4.

      (4) Lastly, while the paper claims that CD131 acts through macrophage recruitment, the evidence is circumstantial and not direct. DSS-induced acute colitis is largely mediated by macrophages, so any manipulation associated with less severe inflammation is accompanied by lesser macrophage infiltration in this model: this does not directly establish that CD131 acts directly on macrophages, which would require cell-specific knockout or complex cell reconstitution experiments.

    1. eLife Assessment

      This important paper reports functional interactions between L1TD1, an RNA binding protein (RBP), and its ancestral LINE-1 retrotransposon which is not modulated at the translational level. The evidence for the association between L1TD1 and LINE-1 ORF1p is solid. The work implies that a transposon-derived RNA binding protein in the human genome can interact with the ancestral transposable element from which this protein was initially derived. This work spurs interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

    2. Reviewer #1 (Public review):

      Summary:

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells compared to DNMT1 KO alone.

      Strengths:

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

      Weaknesses:

      Suggestions for refinement:

      The initial experiment, inducing global hypo-methylation by eliminating DNMT1 in HAP1 cells, is intriguing and warrants more detailed description. How many genes experience mis-regulation or aberrant expression? What phenotypic changes occur in these cells? Why did the authors focus on L1TD1? Providing some of this data would be helpful to understand the rationale behind the thorough analysis of L1TD1.

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Further exploration of this phenomenon would be intriguing.

    3. Reviewer #2 (Public review):

      In this study, Kavaklıoğlu et al. investigated and presented evidence for a role for domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). The one exception was for L1TD1 itself, which is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish feasibility of this relationship existing in vivo in either development or disease, or both.

      Comments on revised version:

      In general, the authors did an acceptable job addressing the major concerns throughout the manuscript. This revision is much clearer and has improved in terms of logical progression.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review): 

      Summary: 

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon. 

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.  

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.  

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells than in DNMT1 KO alone.  

      Strengths: 

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.  

      Weaknesses: 

      Suggestions for refinement:  

      The initial experiment, inducing global hypo-methylation by eliminating DNMT1 in HAP1 cells, is intriguing and warrants a more detailed description. How many genes experience misregulation or aberrant expression? What phenotypic changes occur in these cells? 

      This is an excellent suggestion. We have gene expression data on WT versus DNMT1 KO HAP1 cells and have included them now as Suppl. Figure S1. The transcriptome analysis of DNMT1 KO cells showed hundreds of deregulated genes upon DNMT1 ablation. As expected, the majority were up-regulated and gene ontology analysis revealed that among the strongest up-regulated genes were gene clusters with functions in “regulation of transcription from RNA polymerase II promoter” and “cell differentiation” and genes encoding proteins with KRAB domains. In addition, the de novo methyltransferases DNMT3A and DNMT3B were up-regulated in DNMT1 KO cells suggesting the set-up of compensatory mechanisms in these cells.

      Why did the authors focus on L1TD1? Providing some of this data would be helpful to understand the rationale behind the thorough analysis of L1TD1. 

      We have previously discovered that conditional deletion of the maintenance DNA methyltransferase DNMT1 in the murine epidermis results not only in the up-regulation of mobile elements, such as IAPs, but also in the induced expression of L1TD1 ([1], Suppl. Table 1 and Author response image 1). Similarly, L1TD1 expression was induced by treatment of primary human keratinocytes or squamous cell carcinoma cells with the DNMT inhibitor aza-deoxycytidine (Author response images 2 and 3). These findings are in accordance with the observation that inhibition of DNA methyltransferase activity by aza-deoxycytidine in human non-small cell lung cancer cells (NSCLCs) results in up-regulation of L1TD1 [2]. Our interest in L1TD1 was further fueled by reports on a potential function of L1TD1 as a prognostic tumor marker. We have included this information in the last paragraph of the Introduction in the revised manuscript.

      Author response image 1. RT-qPCR of L1TD1 expression in cultured murine control and Dnmt1 Δ/Δker keratinocytes. mRNA levels of L1td1 were analyzed in keratinocytes isolated at P5 from conditional Dnmt1 knockout mice [1]. Hprt expression was used for normalization of mRNA levels and wildtype control was set to 1. Data represent means ±s.d. with n=4. **P < 0.01 (paired t-test). 

      Author response image 2. RT-qPCR analysis of L1TD1 expression in primary human keratinocytes. Cells were treated with 5-aza-2'-deoxycytidine for 24 hours or 48 hours, with PBS for 48 hours or were left untreated. 18S rRNA expression was used for normalization of mRNA levels and PBS control was set to 1. Data represent means ±s.d. with n=3. **P < 0.01 (paired t-test).

      Author response image 3. Induced L1TD1 expression upon DNMT inhibition in squamous cell carcinoma cell lines SCC9 and SCCO12. Cells were treated with 5-aza-2'-deoxycytidine for 24 hours, 48 hours or 6 days. (A) Western blot analysis of L1TD1 protein levels using beta-actin as loading control. (B) Indirect immunofluorescence microscopy analysis of L1TD1 expression in SCC9 cells. Nuclear DNA was stained with DAPI. Scale bar: 10 µm. (C) RT-qPCR analysis of L1TD1 expression in primary human keratinocytes. Cells were treated with 5-aza-2'-deoxycytidine for 24 hours or 48 hours, with PBS for 48 hours or were left untreated. 18S rRNA expression was used for normalization of mRNA levels and PBS control was set to 1. Data represent means ±s.d. with n=3. *P < 0.05, **P < 0.01 (paired t-test).

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Further exploration of this phenomenon would be intriguing. 

      This is an important point and we were aware of this potential problem. Therefore, we calibrated the retrotransposition assay by transfection with a blasticidin resistance gene vector to take into account potential differences in cell viability and blasticidin sensitivity. Thus, the observed reduction in L1 retrotransposition efficiency is not an indirect effect of reduced cell viability. We have added a corresponding clarification in the Results section on page 8, last paragraph. 
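
      The calibration described above can be sketched as a simple ratio. This is only an illustration, assuming that colony counts from the L1 reporter are divided by colony counts from a parallel transfection with the blasticidin resistance control vector; the function names and all colony counts are hypothetical and are not data from the manuscript.

```python
def calibrated_efficiency(l1_colonies: int, control_colonies: int) -> float:
    """Retrotransposition efficiency corrected for viability and drug sensitivity.

    Assumption: control_colonies come from a parallel transfection with a
    blasticidin resistance vector in the same cell line.
    """
    if control_colonies <= 0:
        raise ValueError("control transfection must yield at least one colony")
    return l1_colonies / control_colonies


def relative_efficiency(sample: tuple, reference: tuple) -> float:
    """Calibrated efficiency of a sample expressed relative to a reference line."""
    return calibrated_efficiency(*sample) / calibrated_efficiency(*reference)


# Hypothetical counts: (L1 reporter colonies, control vector colonies)
ko_counts = (120, 400)
dko_counts = (30, 300)

rel = relative_efficiency(dko_counts, ko_counts)
print(f"DKO efficiency relative to KO: {rel:.2f}")
```

      Because each cell line is divided by its own control count, a line that simply survives selection less well loses colonies in both the numerator and the denominator, so a reduced ratio cannot be explained by lower viability alone.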

      Based on previous studies with hESCs and germ cell tumors [3], it is likely that, in addition to its role in retrotransposition, L1TD1 has further functions in the regulation of cell proliferation and differentiation. L1TD1 might therefore attenuate the effect of DNMT1 loss in KO cells, generating an intermediate phenotype (as pointed out by Reviewer 2), and simultaneous loss of both L1TD1 and DNMT1 results in more pronounced effects on cell viability. This is in agreement with the observation that a subset of L1TD1-associated transcripts encode proteins involved in the control of cell division and cell cycle. It is possible that subtle changes in the expression of these proteins that were not detected in our mass spectrometry approach contribute to the antiproliferative effect of L1TD1 depletion, as discussed in the Discussion section of the revised manuscript.

      Reviewer #2 (Public Review):           

      In this study, Kavaklıoğlu et al. investigated and presented evidence for the role of domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which displays a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). The one exception was for L1TD1 itself, which is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found the L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish the feasibility of this relationship existing in vivo in either development, disease, or both.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):        

      Major 

      (1) The study only used one knockout (KO) cell line generated by CRISPR/Cas9. Considering the possibility of an off-target effect, I suggest the authors attempt one or both of these suggestions. 

      A) Generate or acquire a similar DNMT1 deletion that uses distinct sgRNAs, so that the likelihood of off-targets is negligible. A few simple experiments such as qRT-PCR would be sufficient to suggest the same phenotype.

      B) Confirm the DNMT1 depletion also by siRNA/ASO KD to phenocopy the KO effect.

      (2) In addition to the strategies to demonstrate reproducibility, a rescue experiment restoring DNMT1 to the KO or KD cells would be more convincing. (Partial rescue would suffice in this case, as exact endogenous expression levels may be hard to replicate).

      We have undertaken several approaches to study the effect of DNMT1 loss or inactivation: As described above, we have generated a conditional KO mouse with ablation of DNMT1 in the epidermis. DNMT1-deficient keratinocytes isolated from these mice show a significant increase in L1TD1 expression. In addition, treatment of primary human keratinocytes and two squamous cell carcinoma cell lines with the DNMT inhibitor aza-deoxycytidine led to upregulation of L1TD1 expression. Thus, the derepression of L1TD1 upon loss of DNMT1 expression or activity is not a clonal effect. Also, the spectrum of RNAs identified in RIP experiments as L1TD1-associated transcripts in HAP1 DNMT1 KO cells showed a strong overlap with the RNAs isolated by a related yet different method in human embryonic stem cells. When it comes to the effect of L1TD1 on L1 retrotransposition, a recent study has reported a similar effect of L1TD1 upon overexpression in HeLa cells [4].

      Together, these points convince us that our findings with HAP1 DNMT1 KO cells are in agreement with results obtained in various other cell systems and are therefore not due to off-target effects. With that in mind, we would pursue the suggestion of Reviewer 1 to analyze the effects of DNA hypomethylation upon DNMT1 ablation.

      (3) As stated in the introduction, L1TD1 and ORF1p share "sequence resemblance" (Martin 2006). Is the L1TD1 antibody specific or do we see L1 ORF1p if Fig 1C were uncropped?

      (6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).

      This is a relevant question. We are convinced that the L1TD1 antibody does not cross-react with L1 ORF1p for the following reasons: Firstly, the antibody does not recognize L1 ORF1p (40 kDa) in the uncropped Western blot for Figure 1C (Author response image 4A). Secondly, the L1TD1 antibody gives only background signals in DKO cells in the indirect immunofluorescence experiment shown in Figure 1E of the manuscript.

      Thirdly, the immunogen sequence of L1TD1 that determines the specificity of the antibody was checked in the antibody data sheet from Sigma Aldrich. The corresponding epitope is not present in the L1 ORF1p sequence. Finally, we have shown that the ORF1p antibody does not cross-react with L1TD1 (Author response image 4B).

      Author response image 4. (A) Uncropped L1TD1 Western blot shown in Figure 1C. An unspecific band is indicated by an asterisk. (B) Western blot analysis of WT, KO and DKO cells with L1 ORF1p antibody.

      (4) In the abstract (P2), the authors mentioned that L1TD1 works as an RNA chaperone, but in the result section (P13), they showed that L1TD1 associates with L1 ORF1p in an RNA-independent manner. Those conclusions appear contradictory. Clarification or revision is required.

      Our findings that both proteins bind L1 RNA, and that L1TD1 interacts with ORF1p are compatible with a scenario where L1TD1/ORF1p heteromultimers bind to L1 RNA. The additional presence of L1TD1 might thereby enhance the RNA chaperone function of ORF1p. This model is visualized now in Suppl. Figure S7C. 

      (5) Figure 2C fold enrichment for L1TD1 and ARMC1 is a bit difficult to fully appreciate. A 100 to 200-fold enrichment does not seem physiological. This appears to be a "divide by zero" type of result, as the CT for these genes was likely near 40 or undetectable. Another qRT-PCR-based approach (absolute quantification) would be a more revealing experiment.

      This is the validation of the RIP experiments and the presentation mode is specifically developed for quantification of RIP assays (Sigma Aldrich RIP-qRT-PCR: Data Analysis Calculation Shell). The unspecific binding of the transcript in the absence of L1TD1 in DNMT1/L1TD1 DKO cells is set to 1 and the value in KO cells represents the specific binding relative to the unspecific binding. The calculation also corrects for potential differences in the abundance of the respective transcript in the two cell lines. This is not a physiological value but the quantification of specific binding of transcripts to L1TD1. GAPDH as negative control shows no enrichment, whereas specifically associated transcripts show strong enrichment. We have explained the details of RIP-qRT-PCR evaluation in Materials and Methods (page 14) and the legend of Figure 2C in the revised manuscript.
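
      The normalization described above can be sketched as a ΔΔCt-style calculation. This is only an illustration under stated assumptions, not the Sigma Aldrich calculation shell itself: the function names, the 10% input fraction and the Ct values are hypothetical and are not taken from the manuscript. The IP signal in each cell line is first normalized to its own input (correcting for transcript abundance), and the DKO value is then used as the background that is set to 1.

```python
import math


def normalized_delta_ct(ct_ip, ct_input, input_fraction):
    """Ct of the IP relative to the input, with the input scaled to 100 %.

    input_fraction is the share of lysate kept as input (hypothetical: 0.1).
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return ct_ip - adjusted_input_ct


def fold_enrichment(ct_ip_ko, ct_input_ko, ct_ip_dko, ct_input_dko,
                    input_fraction=0.1):
    """Specific binding in KO cells relative to background in DKO cells.

    The DKO (no L1TD1) value is the reference and corresponds to 1.
    """
    d_ko = normalized_delta_ct(ct_ip_ko, ct_input_ko, input_fraction)
    d_dko = normalized_delta_ct(ct_ip_dko, ct_input_dko, input_fraction)
    return 2.0 ** (-(d_ko - d_dko))


# Hypothetical Ct values: equal input abundance, IP signal 7 cycles
# earlier in KO than in DKO cells.
example = fold_enrichment(ct_ip_ko=25.0, ct_input_ko=24.0,
                          ct_ip_dko=32.0, ct_input_dko=24.0)
```

      With these made-up numbers a difference of seven cycles between the KO and DKO immunoprecipitates corresponds to roughly 128-fold enrichment, which illustrates why a transcript with near-background signal in DKO cells yields a large relative value.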

      (6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).            

      See response to (3).  

      (7) Figure S4A and S4B: There appear to be a few unusual aspects of these figures that should be pointed out and addressed. First, there doesn't seem to be any ORF1p in the Input (if there is, the exposure is too low). Second, there might be some L1TD1 in the DKO (lane 2) and lane 3. This could be non-specific, but the size is concerning. Overexposure would help see this.

      The ORF1p IP gives rise to strong ORF1p signals in the immunoprecipitated complexes even after short exposure. Under these conditions ORF1p is hardly detectable in the input. Regarding the faint band in DKO HAP1 cells, this might be due to a technical problem during Western blot loading. Therefore, the input samples were loaded again on a Western blot and analyzed for the presence of ORF1p, L1TD1 and beta-actin (as loading control); the result is shown as a separate panel in Suppl. Figure S4A.

      (8) Figure S4C: This is related to our previous concerns involving antibody cross-reactivity. Figure 3E partially addresses this, where it looks like the L1TD1 "speckles" outnumber the ORF1p puncta, but overlap with all of them. This might be consistent with the antibody cross-reacting. The western blot (Figure 3C) suggests an upregulation of ORF1p by at least 2-3x in the DKO, but the IF image in 3E is hard to tell if this is the case (slightly more signal, but fewer foci). Can you return to the images and confirm the contrast is comparable? Can you massively overexpose the red channel in 3E to see if there is residual overlap?

      In Figure 3E the L1TD1 antibody gives no signal in DNMT1/L1TD1 DKO cells, confirming that it does not recognize ORF1p. In agreement with the Western blot in Figure 3C, the L1 ORF1p signal in Figure 3E is stronger in DKO cells. In DNMT1 KO cells the L1 ORF1p antibody does not recognize all L1TD1 speckles. This result is in agreement with the Western blot shown above in Author response image 4B and indicates that the L1 ORF1p antibody does not recognize the L1TD1 protein. The contrast is comparable and after overexposure there are still L1TD1-specific speckles. This might be due to differences in abundance of the two proteins.

      (9) The choice of ARMC1 and YY2 is unclear. What are the criteria for the selection?

      ARMC1 was one of the top hits in a pilot RIP-seq experiment (IP versus input and IP versus IgG IP). In the actual RIP-seq experiment with DKO HAP1 cells instead of IgG IP as a negative control, we found ARMC1 as an enriched hit, although it was not among the top 5 hits. The results from the 2nd RIP-seq further confirmed the validity of ARMC1 as an L1TD1-interacting transcript. YY2 was of potential biological relevance as an L1TD1 target because it is a processed pseudogene originating from YY1 mRNA as a result of retrotransposition. This is mentioned on page 6 of the revised manuscript.

      (10) (P16) L1 is the only protein-coding transposon that is active in humans. This is perhaps too generalized of a statement as written. Other examples are readily found in the literature. Please clarify.  

      We will tone down this statement in the revised manuscript. 

      (11) In both the abstract and last sentence in the discussion section (P17), embryogenesis is mentioned, but this is not addressed at all in the manuscript. Please refrain from implying normal biological functions based on the results of this study unless appropriate samples are used to support them.

      Much of the published data on L1TD1 function are related to embryonic stem cells [3-7]. Therefore, it is important to discuss our findings in the context of previous reports.

      (12) Figure 3E: The format of Figures 1A and 3E are internally inconsistent. Please present similar data/images in a cohesive way throughout the manuscript.  

      We show now consistent IF Figures in the revised manuscript.

      Minor: 

      (1) Intro:           

      - Is L1TD1 in mice and humans? How "conserved" is it and does this suggest function?

      Murine and human L1TD1 proteins share 44% identity on the amino acid level and it was suggested that the corresponding genes were under positive selection during evolution with functions in transposon control and maintenance of pluripotency [8].  

      - Why HAP1? (Haploid?) The importance of this cell line is not clear.          

      HAP1 is a nearly haploid human cancer cell line derived from the KBM-7 chronic myelogenous leukemia (CML) cell line [9, 10]. Due to its haploidy, it is perfectly suited and widely used for loss-of-function screens and gene editing. After gene editing, cells can be used in the nearly haploid or in the diploid state. We usually perform all experiments with diploid HAP1 cell lines. Importantly, in contrast to other human tumor cell lines, this cell line tolerates ablation of DNMT1. We have included a corresponding explanation in the revised manuscript on page 5, first paragraph.

      - Global methylation status in DNMT1 KO? (Methylations near L1 insertions, for example?) 

      The HAP1 DNMT1 KO cell line with a 20 bp deletion in exon 4 used in our study was validated in the study by Smits et al. [11]. The authors report a significant reduction in overall DNA methylation. However, we are not aware of a DNA methylome study on this cell line. We show now data on the methylation of L1 elements in HAP1 cells and upon DNMT1 deletion in the revised manuscript in Suppl. Figure S1B.

      (2) Figure 1:  

      - Figure 1C. Why is LMNB used instead of Actin (Fig1D)?  

      We show now beta-actin as loading control in the revised manuscript.  

      - Figure 1G shows increased Caspase 3 in KO, while the matching sentence in the result section skips over this. It might be more accurate to mention this and suggest that the single KO has perhaps an intermediate phenotype (Figure 1F shows a slight but not significant trend). 

      We fully agree with the reviewer and have changed the sentence on page 6, 2nd paragraph accordingly.  

      - Would 96 hrs trend closer to significance? An interpretation is that L1TD1 loss could speed up this negative consequence. 

      We thank the reviewer for the suggestion. We have performed a time course experiment with 6 biological replicates for each time point up to 96 hours and found significant changes in viability upon loss of DNMT1 and a further significant reduction in viability upon additional loss of L1TD1 (shown in Figure 1F). These data suggest that, as expected, loss of DNMT1 leads to a significant reduction in viability and that additional ablation of L1TD1 further enhances this effect.

      - What are the "stringent conditions" used to remove non-specific binders and artifacts (negative control subtraction?) 

      Yes, we considered only hits from both analyses, L1TD1 IP in KO versus input and L1TD1 IP in KO versus L1TD1 IP in DKO. This is now explained in more detail in the revised manuscript on page 6, 3rd paragraph.  

      (3) Figure 2:  

      - Figure 2A is a bit too small to read when printed. 

      We have changed this in the revised manuscript.

      - Since WT and DKO lack detectable L1TD1, would you expect any difference in RIP-Seq results between these two?

      Due to the lack of DNMT1 and the resulting DNA hypomethylation, DKO cells are more similar to KO cells than WT cells with respect to the expressed transcripts.

      - Legend says selected dots are in green (it appears blue to me). 

      We have changed this in the revised manuscript.           

      - Would you recover L1 ORF1p and its binding partners in the KO? (Is the antibody specific in the absence of L1TD1 or can it recognize L1?) I noticed an increase in ORF1p in the KO in Figure 3C.  

      Thank you for the suggestion. Yes, L1 ORF1p shows slightly increased expression in the proteome analysis and we have marked the corresponding dot in the Volcano plot (Figure 3A).

      - Should the figure panel reference near the (Rosspopoff & Trono) reference instead be Sup S1C as well? Otherwise, I don't think S1C is mentioned at all. 

      - What are the red vs. green dots in 2D? Can you highlight ERV and ALU with different colors? 

      We added the reference to Suppl. Figure S1C (now S3C) in the revised manuscript. In Figure 2D L1 elements are highlighted in green, ERV elements in yellow, and other associated transposon transcripts in red.     

      - Which L1 subfamily from Figure 2D is represented in the qRT-PCR in 2E "LINE-1"? Do the primers match a specific L1 subfamily? If so, which? 

      We used primers specific for the human L1.2 subfamily. 

      - Pulling down SINE element transcripts makes some sense, as many insertions "borrow" L1 sequences for non-autonomous retrotransposition, but can you speculate as to why ERVs are recovered? There should be essentially no overlap in sequence.

      In the L1TD1 evolution paper [8], a potential link between L1TD1 and ERV elements was discussed: 

      "Alternatively, L1TD1 in sigmodonts could play a role in genome defense against another element active in these genomes. Indeed, the sigmodontine rodents have a highly active family of ERVs, the mysTR elements [46]. Expansion of this family preceded the death of L1s, but these elements are very active, with 3500 to 7000 species-specific insertions in the L1-extinct species examined [47]. This recent ERV amplification in Sigmodontinae contrasts with the megabats (where L1TD1 has been lost in many species); there are apparently no highly active DNA or RNA elements in megabats [48]. If L1TD1 can suppress retroelements other than L1s, this could explain why the gene is retained in sigmodontine rodents but not in megabats." 

      Furthermore, Jin et al. report the binding of L1TD1 to repetitive sequences in transcripts [12]. It is possible that some of these sequences are also present in ERV RNAs.

      - Is S2B a screenshot? (the red underline). 

      No, it is a PowerPoint figure, and we have removed the red underline.

      (4) Figure 3: 

      - Text refers to Figure 3B as a western blot. Figure 3B shows a volcano plot. This is likely 3C but would still be out of order (3A>3C>3B referencing). I think this error is repeated in the last result section. 

      - Figure and legends fail to mention what gene was used for ddCT method (actin, gapdh, etc.). 

      - In general, the supplemental legends feel underwritten and could benefit from additional explanations. (Main figures are appropriate but please double-check that all statistical tests have been mentioned correctly).

      Thank you for pointing this out. We have corrected these errors in the revised manuscript.

      (5) Discussion: 

      - AluY connection is interesting. Is there an "Alu retrotransposition reporter assay" to test whether L1TD1 enhances this as well?

      Thank you for the suggestion. There is indeed an Alu retrotransposition reporter assay reported by Dewannieux et al. [13]. The assay is based on a Neo selection marker. We have previously tested a Neo selection-based L1 retrotransposition reporter assay, but this system failed to work properly in HAP1 cells; therefore, we switched to a blasticidin-based L1 retrotransposition reporter assay. A corresponding blasticidin-based Alu retrotransposition reporter assay might be interesting for future studies (mentioned in the Discussion, page 11, paragraph 4 of the revised manuscript).

      (6) Materials and Methods:

      - The number of typos in the materials and methods is too numerous to list. Instead, please refer to the next section that broadly describes the issues seen throughout the manuscript. 

      Writing style  

      (1) Keep a consistent style throughout the manuscript: for example, L1 or LINE-1 (also L1 ORF1p or LINE-1 ORF1p); per or "/"; knockout or knock-out; min or minute; 3 times or three times; media or medium. Additionally, as TE naming conventions are not uniform, it is important to maintain internal consistency so as to not accidentally establish an imprecise version. 

      (2) There's a period between "et al" and the comma, and "et al." should be italic. 

      (3) The authors should explain what the key jargon is when it is first used in the manuscript, such as "retrotransposon" and "retrotransposition".    

      (4) The authors should show the full spelling of some acronyms when they use it for the first time, such as RNA Immunoprecipitation (RIP).  

      (5) Use a space between numbers and alphabets, such as 5 µg.  

      (6) 2.0 × 10⁵ cells, that's not an "x".

      (7) Numbers in the reference section are lacking (hard to parse).  

      (8) In general, there are a significant number of typos in this draft which at times becomes distracting. For example, (P3) Introduction: Yet, co-option of TEs thorough (not thorough, it should be through) evolution has created so-called domesticated genes beneficial to the gene network in a wide range of organisms. Please carefully revise the entire manuscript for these minor issues that collectively erode the quality of this submission.  

      Thank you for pointing out these mistakes. We have corrected them in the revised manuscript. A native speaker from our research group has carefully checked the paper. In summary, we have added Supplementary Figure S7C and have changed Figures 1C, 1E, 1F, 2A, 2D, 3A, 4B, S3A-D, S4B and S6A based on these comments. 

      REFERENCES

      (1) Beck, M.A., et al., DNA hypomethylation leads to cGAS-induced autoinflammation in the epidermis. EMBO J, 2021. 40(22): p. e108234.

      (2) Altenberger, C., et al., SPAG6 and L1TD1 are transcriptionally regulated by DNA methylation in non-small cell lung cancers. Mol Cancer, 2017. 16(1): p. 1.

      (3) Narva, E., et al., RNA-binding protein L1TD1 interacts with LIN28 via RNA and is required for human embryonic stem cell self-renewal and cancer cell proliferation. Stem Cells, 2012. 30(3): p. 452-60.

      (4) Jin, S.W., et al., Dissolution of ribonucleoprotein condensates by the embryonic stem cell protein L1TD1. Nucleic Acids Res, 2024. 52(6): p. 3310-3326.

      (5) Emani, M.R., et al., The L1TD1 protein interactome reveals the importance of posttranscriptional regulation in human pluripotency. Stem Cell Reports, 2015. 4(3): p. 519-28.

      (6) Santos, M.C., et al., Embryonic Stem Cell-Related Protein L1TD1 Is Required for Cell Viability, Neurosphere Formation, and Chemoresistance in Medulloblastoma. Stem Cells Dev, 2015. 24(22): p. 2700-8.

      (7) Wong, R.C., et al., L1TD1 is a marker for undifferentiated human embryonic stem cells. PLoS One, 2011. 6(4): p. e19355.

      (8) McLaughlin, R.N., Jr., et al., Positive selection and multiple losses of the LINE-1-derived L1TD1 gene in mammals suggest a dual role in genome defense and pluripotency. PLoS Genet, 2014. 10(9): p. e1004531.

      (9) Andersson, B.S., et al., Ph-positive chronic myeloid leukemia with near-haploid conversion in vivo and establishment of a continuously growing cell line with similar cytogenetic pattern. Cancer Genet Cytogenet, 1987. 24(2): p. 335-43.

      (10) Carette, J.E., et al., Ebola virus entry requires the cholesterol transporter Niemann-Pick C1. Nature, 2011. 477(7364): p. 340-3.

      (11) Smits, A.H., et al., Biological plasticity rescues target activity in CRISPR knock outs. Nat Methods, 2019. 16(11): p. 1087-1093.

      (12) Jin, S.W., et al., Dissolution of ribonucleoprotein condensates by the embryonic stem cell protein L1TD1. Nucleic Acids Res, 2024. 52(6): p. 3310-3326.

      (13) Dewannieux, M., C. Esnault, and T. Heidmann, LINE-mediated retrotransposition of marked Alu sequences. Nat Genet, 2003. 35(1): p. 41-8.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the Authors:

      Reviewer #2:

      (1) In my previous review, I noted that using three different movies to conclude that different genres evoke different thought patterns is an overinterpretation with only one instance per genre. In the rebuttal letter, the authors state that they provide "evidence that is necessary but not sufficient to conclude that we can distinguish different genres of films" (page 15). Accordingly, I suggest refraining from statements such as "There was a significant main effect of movie genre on memory" (page 13) in the manuscript.

      Thank you for this point. We have removed any reference to genre.

      Page 18 (referring to page 13) [354-355] “First, there was a significant main effect of movie on memory, F(2, 254.12) = 49.33, p <.001, η2 = .28.”

      Reviewer #3:

      The revised manuscript is easier to read and better contextualized.

      Thank you for this comment and for your feedback, which allowed us to make the manuscript clearer.

      Public Reviews:

      Reviewer #1:

      The lack of direct interrogation of individual differences/reliability of the mDES scores warrants some pause.

      Our study's goal was to understand how group-level patterns of thought in one group of participants relate to brain activity in a different group of participants. To this end, we decomposed trial-level mDES data to show dimensions that are common across individuals, which demonstrated excellent split-half reliability. Then we used these data in two complementary ways. First, we established that these ratings reliably distinguished between the different films (showing that our approach is sensitive to manipulations of semantic and affective features in a film) and that these group-level patterns were also able to predict patterns of brain activity in a different group of participants (suggesting that mDES dimensions are also sensitive to the way brain activity emerges during movie watching). Second, we established that variation across individuals in their mDES scores predicted their comprehension of information from films. Thus, our study establishes that when applied to movie-watching, mDES is sensitive to individual differences in the movie-watching experience (as determined by an individual's comprehension). Given the success of this study and the relative ease with which mDES can be performed, it will be possible in the future to conduct mDES studies that home in on both the general features of the movie-watching experience and aspects that are more specific to an individual.

      Reviewer #2:

      (1) The distinction between thinking and stimulus processing (in the sense of detecting and assigning meaning to features, modulated by factors such as attention) remains unclear. Is "thinking" a form of conscious access or a reportable read-out from sensory and higher-level stimulus processing? Or does it simply refer to the method used here to identify different processing states?

      Thank you for highlighting this first point, which is an important consideration when attempting to map cognitive states. We have added some additional comments to our discussion section to expand on this point.

      Page 35-36 [698-711] “It is possible, therefore, that the identification of regions of visual and auditory cortex by our study reflects the participants' attention to sensory input, rather than the complex analysis of these inputs that may be required for certain features of the movie watching experience. On the other hand, it is possible that the movie-watching state is a qualitatively different type of mental state to those that emerge in typical task situations. For example, unlike tasks, the movie-watching state is characterized by multi-modal sensory input and semantically rich themes that evolve together to reveal a continuous narrative to the viewer. It is possible, therefore, that movies engender an absorbed state which depends more on processing in sensory cortex than would occur in traditional task paradigms such as a working memory task (when systems in association cortex may be needed to maintain information related to task rules). Important headway into addressing this uncertainty can be achieved by using mDES to compare the types of states that occur in different contexts (including both movies and tasks) and comparing the topography of brain activity associated with different experiential states.”

      (2) The dimensions of thought appear to be directly linked to brain areas traditionally associated with core faculties of perception and cognition. For example, superior temporal cortex codes for speech information, which is also where thought reports on verbal detail localize in this study. This raises the question of whether the present study truly captures mechanisms specific to thinking and distinct from processing, especially given that individual variations in reports were not considered and movie-specific features were not controlled for.

      Thank you for this point. We have added an additional paragraph to the discussion to expand on this.

      Page 35 [692-698] “Finally, it is worth considering whether the patterns of brain activity identified by our analysis reflect the stimuli that are processed during movie watching, or the cognitive and affective processing of this information. On the one hand, the regions we found were often within regions of sensory cortex, areas of the brain which are often ascribed basic stimulus processing functions [1]. Moreover, according to perspectives on cognition derived from more traditional task paradigms, complex features of cognition, such as the regulation of thought, are often attributed to regions of association cortex, such as the dorsolateral prefrontal cortex [2].”

      Reviewer #3:

      This paper is framed as presenting a new paradigm but it does little to discuss what this paradigm serves, what are its limitations and how it should have been tested. The novelty appears to be in using experience sampling from 1 sample to model the responses of a second sample.

      Thank you for this comment. As you correctly identified, the novelty lies in the methodology. We have now made this clear by expanding this point beyond the methods section, so that the reader is clearly oriented to the application and limitations of our methodological approach and paradigm.

      Page 7-8 [149-174] “One challenge that arises when attempting to map the dynamics of thought onto brain activity during movie-watching is accounting for the inherently disruptive nature of experience sampling: to measure experience with sufficient frequency to map experiential reports during movies would inherently disrupt the natural processes of the brain and alter the viewer’s experience (for example, by pausing the film at a moment of suspense). Therefore, if we periodically interrupt viewers to acquire a description of their thoughts while recording brain activity, this could impact on the ability to capture important dynamic features of the brain. On the other hand, if we measured fMRI activity continuously over movie-watching (as is usually the case), we would lack the capacity to directly relate brain signals to the corresponding experiential states. Thus, to overcome these obstacles, we developed a novel methodological approach using two independent samples of participants. In the current study, one set of 120 participants was probed with mDES five times across the three ten-minute movie clips (11 minutes total, no sampling in the first minute). We used a jittered sampling technique where probes were delivered at different intervals across the film for different people depending on the condition they were assigned. Probe orders were also counterbalanced to minimize the systematic impact of prior and later probes at any given sampling moment. We used these data to construct a precise description of the dynamics of experience for every 15 seconds of three ten-minute movie clips. These data were then combined with fMRI data from a different sample of 44 participants who had already watched these clips without experience sampling [3]. 
By combining data from two different groups of participants, our method allows us to describe the time series of different experiential states (as defined by mDES) and relate these to the time series of brain activity in another set of participants who watched the same films with no interruptions. In this way, our study set out to explicitly understand how the patterns of thoughts that dominate different moments in a film in one group of participants relate to the brain activity at these time points in a second set of participants and, therefore, better understand the contribution of different neural systems to the movie-watching experience.”
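
      The two-sample logic described in this passage can be sketched as follows. This is a simplified illustration under stated assumptions and not the analysis pipeline used in the study: it averages probe scores from one group into 15-second bins and correlates the resulting group time series with a brain time series from another group. The function names, the use of a simple Pearson correlation and all numbers are hypothetical.

```python
import math


def bin_group_scores(probe_times_s, scores, clip_length_s, bin_s=15.0):
    """Average experience-sampling scores from all probes within each time bin.

    Bins that received no probe are returned as None.
    """
    n_bins = math.ceil(clip_length_s / bin_s)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, s in zip(probe_times_s, scores):
        i = min(int(t // bin_s), n_bins - 1)
        sums[i] += s
        counts[i] += 1
    return [sums[i] / counts[i] if counts[i] else None for i in range(n_bins)]


def pearson_r(group_scores, brain_ts):
    """Pearson correlation between the binned scores and a brain time series,
    skipping bins without experience-sampling data."""
    pairs = [(a, b) for a, b in zip(group_scores, brain_ts) if a is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical data: four probes from the experience-sampling group and a
# brain signal from the imaging group, already averaged into the same bins.
binned = bin_group_scores([5, 10, 20, 40], [1.0, 3.0, 2.0, 4.0], clip_length_s=60)
r = pearson_r(binned, [1.0, 1.0, 2.0, 0.0])
```

      Because different participants are probed at different moments, pooling across the group fills bins that no single viewer could cover without repeated interruption, which is the point of the jittered design.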

      Page 33-35 [658-691] “Importantly, our study provides a novel method for answering these questions and others regarding the brain basis of experiences during films that can be applied simply and cost-effectively. As we have shown, mDES can be combined with existing brain activity, allowing information about both brain activity and experience to be determined at a relatively low cost. For example, the cost-effective nature of our paradigm makes it an ideal way to explore the relationship between cognition and neural activity during movie-watching across different genres of film. In neuroimaging, conclusions are often made using one film in naturalistic paradigm studies [4]. Although the current study only used three movie clips, restraining our ability to form strong conclusions regarding how different patterns of thought relate to specific genres of film, in the future, it will be possible to map cognition across a more extensive set of movies and discern whether there are specific types of experience that different genres of films engage. One of the major strengths of our approach, therefore, is the ability to map thoughts across groups of participants across a wide range of movies at a relatively low cost.

      Nonetheless, this paradigm is not without limitations. This is the first study, as far as we know, that attempts to compare experiential reports in one sample of participants with brain activity in a second set of participants, and while the utility of this method enables us to understand the relationship between thought and brain activity during movies, it will be important to extend our analysis to mDES data during movie-watching while brain activity is recorded. In addition, our study is correlational in nature, and in the future, it could be useful to generate a more mechanistic understanding of how brain activity maps onto the participants' experience. Our analysis shows that mDES is able to discriminate between films, highlighting its broad sensitivity to variation in semantic or affective content. Armed with this knowledge, we propose that in the future, researchers could derive mechanistic insights into how the semantic features may influence the mDES data. For example, it may be possible to ask participants to watch movies in a scrambled order to understand how the structure of semantic information influences the mapping between brains and ongoing experience as measured by mDES. Finally, our study focused on mapping group-level patterns of experience onto group-level descriptions of brain activity. In the future, it may be possible to adopt a “precision-mapping” approach by measuring longer periods of experience using mDES and determining how the neural correlates of experience vary across individuals who watched the same movies while brain activity was collected [5]. In the future, we anticipate that the ease with which our method can be applied to different groups of individuals and different types of media will make it possible to build a more comprehensive and culturally inclusive understanding of the links between brain activity and movie-watching experience.”

      What are the considerations for treating high-order thought patterns that occur during film viewing as stable enough to use across participants? What would be the limitations of this method? (Do all people reading this paper think comparable thoughts reading through the sections?) This is briefly discussed in the revised manuscript and generally treated as an opportunity rather than as a limitation.

      It is likely, based on our study, that films can evoke both stereotyped thought patterns (i.e. thoughts that many people will share) and others that are individualistic. It is clear that, in principle, mDES is capable of capturing empirical information on both stereotypical thoughts and idiosyncratic thoughts. For example, clear differences in experiences across films and, in particular, during specific periods within a film, show that movie-watching can evoke broadly similar thought patterns in different groups of participants (see Figure 3 right-hand panel). On the other hand, the association between comprehension and the different mDES components indicates that certain individuals respond to the same film clip in different ways and that these differences are rooted in objective information (i.e. their memory of an event in a film clip). A clear example of these more idiosyncratic features of movie watching experience can be seen in the association between “Episodic Knowledge” and comprehension. We found that “Episodic Knowledge” was generally high in the romance clip from 500 Days of Summer but was especially high for individuals who performed the best, indicating they remembered the most information. Thus, good comprehenders responded to the 500 Days of Summer clip with responses that had more evidence of “Episodic Knowledge”. In the future, since the mDES approach can account for both stereotyped and idiosyncratic features of experience, it will be an important tool in understanding the common and distinct features that movie watching experiences can have, especially given the cost-effective manner with which these studies can be run.

      In conclusion, this study tackles a highly interesting subject and does it creatively and expertly. It fails to discuss and establish the utility and appropriateness of its proposed method.

      Thank you very much for your feedback and critique. In our revision and our responses to these questions, we provided more information about the method's robustness, utility, and application to understanding cognition. Thank you for bringing these points to our attention.

      References

      (1) Kaas, J.H. and C.E. Collins, The organization of sensory cortex. Current Opinion in Neurobiology, 2001. 11(4): p. 498-504.

      (2) Turnbull, A., et al., Left dorsolateral prefrontal cortex supports context-dependent prioritisation of off-task thought. Nature Communications, 2019. 10.

      (3) Aliko, S., et al., A naturalistic neuroimaging database for understanding the brain using ecological stimuli. Scientific Data, 2020. 7(1).

      (4) Yang, E., et al., The default network dominates neural responses to evolving movie stories. Nature Communications, 2023. 14(1): p. 4197.

      (5) Gordon, E.M., et al., Precision Functional Mapping of Individual Human Brains. Neuron, 2017. 95(4): p. 791-807.e7.