  1. Jun 2024
    1. Reviewer #1 (Public Review):

      Summary:

      This study demonstrates a novel and exciting role for the conserved miRNA-target axis miR-182-Lrp6 in liver metabolism, which causally contributes to type 2 diabetes and NAFLD in mice and, potentially, humans.

      Strengths:

      The direct interaction and inhibition of Lrp6 by miR-182 is convincingly shown. The effects of miR-182-5p on insulin sensitivity are also credible for the in vivo and in vitro gain-of-function experiments.

      Weaknesses:

      However, the DIO cohorts lack key assays for insulin sensitivity, such as ITT or insulin-stimulated pAKT, as well as histological evidence to support the authors' claims and strengthen the link between miR-182-5p and T2D or NAFLD. In addition, the lack of loss-of-function experiments limits the suitability of miR-182-5p as a potential therapeutic target.

    2. Reviewer #2 (Public Review):

      Summary:

      In this study, Christin Krause et al. mapped the hepatic miRNA-transcriptome of obese subjects with type 2 diabetes and identified miR-182-5p and its target gene LRP6 as potential drivers of dysregulated glucose tolerance and fatty acid metabolism in obese type 2 diabetics.

      Strengths:

      This study contains some interesting findings that are valuable for understanding the key regulatory role of miRNAs in the pathogenesis of T2D.

      Weaknesses:

      The authors did not systematically investigate the function of miR-182 in T2DM or NAFLD.

    3. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Krause and colleagues identify miR-182 as diabetes-associated microRNA: miR-182 is increased in bariatric surgery patients with versus without T2D; miR-182 was the only microRNA associated with three metabolic traits; miR-182 levels were associated with increased body weight in mice under different dietary manipulations; overexpression in Hep-G2 led to a decrease in LRP6; and overexpression in HFD fed mice led to increased insulin and liver TG. The manuscript provides a potentially useful resource of microRNA expression in human livers, though the functional importance of miR-182 remains unclear.

      Strengths:

      The use of human tissues and good sample sizes is strong.

      Weaknesses:

      The study remains primarily correlative; the in vivo overexpression is non-physiological; and the mechanisms by which miR-182 exerts its effects are not rigorously tested.

    1. eLife assessment

      This study provides a fundamental advance in palaeontology by reporting the fossils of a new invertebrate, Beretella spinosa, and inferring its relationship with already described species. The analysis placed the newly described species in the earliest branch of moulting invertebrates. The study, supported by convincing fossil observation, hypothesizes that early moulting invertebrate animals were not vermiform.

    2. Reviewer #1 (Public Review):

      Summary:

      Wang and co-workers characterise the fossil of Beretella spinosa from the early Cambrian, Yanjiahe Formation, South China. Combining morphological analyses with phylogenetic reconstructions, the authors conclude that B. spinosa is closely related to Saccorhytus, an enigmatic fossil recently ascribed to Ecdysozoa, or moulting animals, as an extinct "basal" lineage. Finding additional representatives of the clade Saccorhytida strengthens the idea that there existed a diversity of body plans previously underappreciated in Ecdysozoa, which may have implications for our understanding of the earliest steps in the evolution of this major animal group.

      Strengths:

      I'm not a paleobiologist; therefore, I cannot give an expert opinion on the descriptions of the fossils. However, the similarities with Saccorhytus seem evident, and the phylogenetic reconstructions are adequate. Evolutionary interpretations are generally justified, and the consolidation of Saccorhytida as the extinct sister lineage to extant Ecdysozoans will have significant implications for our understanding of this major animal clade.

      Weaknesses:

      While I generally agree with the authors' interpretations, the idea of Saccorhytida as a divergent, simplified offshoot is slightly contradictory with a probably non-vermiform ecdysozoan ancestor. The authors' analyses do not discard the possibility of a vermiform ecdysozoan ancestor (importantly, Supp Table 4 does not reconstruct that character), and outgroup comparison with Spiralia (and even Deuterostomia for Protostomia as a whole) indicates that a more or less anteroposteriorly elongated (i.e., vermiform) body is likely common and ancestral to all major bilaterian groups, including Ecdysozoa. Indeed, Figure 4b depicts the potential ancestor as a "worm". The authors argue that the simplification of Saccorhytida from a vermiform ancestor is unlikely "because it would involve considerable anatomical transformations such as the loss of vermiform organization, introvert and pharynx in addition to that of the digestive system". However, their data support the introvert as a specialisation of Scalidophora (Fig. 4a and Supp Table 4), and a pharyngeal structure cannot be ruled out in Saccorhytida. Likewise, loss of an anus is not uncommon in Bilateria. Moreover, this can easily become a semantics discussion (to what extent can an animal be defined as "vermiform"? Where is the limit?). Therefore, I suggest leaving the evolutionary scenario more open. Supporting Saccorhytida as a true group at the early steps of Ecdysozoa evolution is important and demonstrates that animal body plans are more plastic than previously appreciated. However, with the current data, it is unlikely that Saccorhytida represents the ancestral state for Ecdysozoa (as the authors admit), and a vermiform nature is not ruled out (and even likely) in this animal group. Suggesting that the ancestral Ecdysozoan might have been small and meiobenthic is perhaps more interesting and supported by the current data (phylogeny and outgroup comparison with Spiralia).

    3. Reviewer #2 (Public Review):

      Summary:

      This work provides important anatomical features of a new species from the Lower Cambrian, which helps advance our understanding of the evolutionary origins of animal body plans. The authors interpreted that the new species possessed a bilateral body covered with cuticular polygonal reticulation and a ventral mouth. Based on cladistic analyses using maximum likelihood, Bayesian, and parsimony, the new species was placed, along with Saccorhytus, in a sister-group ("Saccorhytida") of the Ecdysozoa. The phylogenetic position of Saccorhytida suggests a new scenario of the evolutionary origin of the crown ecdysozoan body plan.

      Strengths:

      Although the new species reported in this paper shows strange morphologies, the interpretation of its anatomical features is based on detailed observations of multiple fossil specimens and is therefore convincing at present. Morphological data about fossil taxa from the Ediacaran and Early Cambrian are quite important for our understanding of the evolution of body plans (and origins of phyla) in paleontology and evolutionary developmental biology, and this paper represents a valuable contribution to such research fields.

      Weaknesses:

      The preservation of the specimens, in particular on the putative ventral side, is not good, and the interpretation of the anatomical features needs to be tested with additional specimens in the future. The monophyly of Cycloneuralia (Nematoida + Scalidophora) was not necessarily well-supported by cladistic analyses (Supplementary Figures 7-9), and the evolutionary scenario (Fig. 4) also needs to be tested in future works. On the other hand, the revised version provides important contributions from currently available data, and the above-mentioned problems should be studied in a separate paper in the future.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer 1:

      Weaknesses:

      While I generally agree with the authors' interpretations, the idea of Saccorhytida as a divergent, simplified offshoot is slightly contradictory with a probably non-vermiform ecdysozoan ancestor. The authors' analyses do not discard the possibility of a vermiform ecdysozoan ancestor (importantly, Supplementary Table 4 does not reconstruct that character),

      Saccorhytids are only known from the early Cambrian and their unique morphology has no equivalent among any extinct or extant ecdysozoan groups. This prompted us to consider them as a possible dead-end evolutionary offshoot. The nature of the last common ancestor of ecdysozoans (i.e. an elongated worm-like or non-vermiform animal with capacities to renew its cuticle by molting) remains hypothetical. At present, palaeontological data do not allow us to resolve this question. The animal in Fig. 4b at the base of the tree is supposed to represent an ancestral soft-bodied form with no cuticle from which ecdysozoans evolved via major innovations (cuticular secretion and ecdysis). Its shape is hypothetical as indicated by a question mark. Our evolutionary model is clearly intended to be tested by further studies and hopefully new fossil discoveries.

      …and outgroup comparison with Spiralia (and even Deuterostomia for Protostomia as a whole) indicates that a more or less anteroposteriorly elongated (i.e., vermiform) body is likely common and ancestral to all major bilaterian groups, including Ecdysozoa. Indeed, Figure 4b depicts the potential ancestor as a "worm". The authors argue that the simplification of Saccorhytida from a vermiform ancestor is unlikely "because it would involve considerable anatomical transformations such as the loss of vermiform organization, introvert, and pharynx in addition to that of the digestive system". However, their data support the introvert as a specialisation of Scalidophora (Figure 4a and Supplementary Table 4), and a pharyngeal structure cannot be ruled out in Saccorhytida. Likewise, loss of an anus is not uncommon in Bilateria. Moreover, this can easily become a semantics discussion (to what extent can an animal be defined as "vermiform"? Where is the limit?).

      We agree that “worm” and “vermiform” are ill-defined terms. They are widely used in various palaeontological and biological papers to describe elongated tubular animals such as ecdysozoans and annelids (see Giribet and Edgecombe 2017; the popular textbook by Nielsen 2012; Schmidt-Rhaesa 2013; Brusca et al. 2023; Giribet and Edgecombe 2020). Very few other animals are termed “worms”. Changes have been made in the text to solve this semantic problem, for example in the abstract where we added (i.e., elongated and tubular) to better define what we mean by “vermiform”.

      Priapulid worms or annelids are examples of extremely elongated, tubular animals. In saccorhytids, the antero-posterior elongation is present (as it is in the vast majority of bilaterians) but extremely reduced: Saccorhytus and Beretella have a sac-like and a beret-like shape, respectively. That such forms may have derived from elongated, tubular ancestors (e.g. comparable with present-day priapulid worms) would require major anatomical transformations that have no equivalent among modern animals. We agree that further speculation about the nature of these transformations is unnecessary and should be deleted simply because the nature of these ancestors is purely hypothetical. We also agree that the loss of the anus and the extreme simplification of the digestive system are common among extant bilaterians. In Figure 4b, the hypothetical pre-ecdysozoan animal is slightly elongated (along its antero-posterior axis) but in no way comparable with a very elongated and cylindrical ecdysozoan worm (e.g. extant or extinct priapulid).

      Therefore, I suggest leaving the evolutionary scenario more open. Supporting Saccorhytida as a true group at the early steps of Ecdysozoa evolution is important and demonstrates that animal body plans are more plastic than previously appreciated. However, with the current data, it is unlikely that Saccorhytida represents the ancestral state for Ecdysozoa (as the authors admit), and a vermiform nature is not ruled out (and even likely) in this animal group. Suggesting that the ancestral Ecdysozoan might have been small and meiobenthic is perhaps more interesting and supported by the current data (phylogeny and outgroup comparison with Spiralia).

      We agree to leave the evolutionary scenario more open, especially the evolutionary process that gave rise to Saccorhytida. Again, we know nothing about the morphology of the ancestral ecdysozoan (typically the degree of body elongation, whether it had a differentiated introvert or not, whether it had a through gut or not). In Fig.4, the ancestral ecdysozoan is supposed to have evolved from a soft-bodied epibenthic animal through key innovations such as the secretion of a cuticle and ecdysis. It is a hypothesis that needs to be tested by further studies and fossil discoveries. Speculations concerning the process through which saccorhytids may have arisen have been deleted.

      Reviewer 2:

      Weaknesses:

      The preservations of the specimens, in particular on the putative ventral side, are not good, and the interpretation of the anatomical features needs to be tested with additional specimens in the future. The monophyly of Cycloneuralia (Nematoida + Scalidophora) was not necessarily well-supported by cladistic analyses, and the evolutionary scenario (Figure 4) also needs to be tested in future works.

      Yes, we agree that the animal described in our manuscript remains enigmatic (e.g. the nature of its internal organs, its lifestyle, etc.). Whereas the dorsal side of the animal is well documented (consistent pattern of pointed sclerites), uncertainties remain concerning its ventral anatomy (typically the mouth location and shape). Additional better-preserved specimens will hopefully provide the missing information. Concerning Cycloneuralia, their monophyly is generally better supported by analyses based on morphological characters than in molecular phylogenies.

      Reviewer 3:

      Weaknesses:

      I, as a paleontology non-expert, experienced several difficulties in reading the manuscript. This should be taken into consideration when assuming a wide range of readers including non-experts.

      We have ensured that the text is comprehensible to biologists. The main results are summarized in relatively simple diagrams (e.g. Fig. 4) that can be understood by non-specialized readers. We are aware that technical descriptive terms may appear obscure to non-specialists. We can hardly avoid them in the descriptive parts. However, our figures (e.g. SEM images and 3D-reconstruction) are clear enough to give the reader a clear idea of the morphology of Beretella.

      Recommendations for the authors:

      All three reviewers appreciate the discovery and see merit in publishing this manuscript. They also raised some concerns about the data presentation. The authors are requested to perform no additional analysis but to go through all the reviewer comments and either rebut or incorporate them in revising the manuscript.

      Reviewer 1:

      - Line 41: comma after "ecdysozans".

      OK, done.

      - Formatting style: add a space before references.

      OK, done.

      - Line 169: B. spinosa in italics

      OK, done.

      - Line 157: could the "relatively large opening" in the flattened ventral side be a mouth (even when altered by the fossilisation process)?

      Most bilaterians have a mouth. There is no opening on the relatively well-preserved dorsal side of Beretella that could be interpreted as a mouth. In contrast, the flattened ventral side often shows a depressed area that could potentially bear a mouth. This ventral area is often pushed in and poorly preserved. The cuticle of this ventral side might have been relatively thinner, perhaps more flexible than that of the dorsal one (with strong sclerites). These differences might explain why the possible oral area is poorly preserved.

      - Line 178: "position of the mouth"

      OK, done.

      - Line 219: "These sclerites, unknown..."

      OK, done.

      - Line 282: update reference formatting

      OK, done.

      - Line 298: remove reference to Supplementary Table 4, as it does not refer to the possible vermiform nature of the last common ecdysozoan ancestor?

      OK, done.

      - Figure 4a: change "paired legs" for "paired appendages"?

      OK, done.

      - Supplementary Table 4: For TGE and Introvert, the state 0 (absent) should be in bold and underlined (as it is the most likely state).

      OK, done.

      Reviewer 2:

      Line 25: "from the early Cambrian" should be changed into "from the lower Cambrian"

      OK, done.

      Line 126: The range of maximum length should be reported in µm (rather than mm) just like those of maximum width and height.

      OK, done.

      Lines 191-192: Please recheck the figure panels of Saccorhytus (Supplementary Figure 4c) and scalidophoran worm (Supplementary Figure 4d). Perhaps, the former should refer to Figure 4d, and the latter to Figure 4c?

      OK, done.

      Lines 239 and 241: "1" and "2" appear to stand for citations (the other journal style), but I am not certain what they are.

      To avoid confusion, we replaced ‘1’ and ‘2’ with ‘i’ and ‘ii’.

      Figures 3d and 4a: "Cycloneuralia" should be included in the phylogenetic trees.

      OK, done.

      Figure 3: The caption for the panel d is redundant. It should be changed into, for example, "Phylogenetic tree obtained from cladistic analyses using maximum likelihood (IQTREE)."

      OK, done.

      Supplementary Figures 6-9: In the captions, more detailed explanations of the results (for example, "50% majority rule consensus of XXX trees" and "strict consensus of all 4 most-parsimonious trees") should be provided.

      OK, done.

      Supplementary Figures 8 and 9: The caption explains that Cycloneuralia is resolved as a paraphyletic group, but it is not certain because Nematoida, Scalidophora, and Panarthropoda are resolved in a polytomy.

      We changed the sentence into:

      “Note that Cycloneuralia does not appear as a monophyletic clade”

      Reviewer 3:

      Line 25 'tiny' - I suggest giving an absolute measure of the size.

      We added ‘maximal length 3 mm’.

      Line 29 'both forms' - This is hard to follow by a non-expert. Can this be replaced with 'fossil species'?

      OK, done.

      Line 32 'dead-end' - Is this word necessary? I suggest to skip this word, as it is obvious that this lineage is extinct.

      OK, done.

      Lines 80, 94, and 172 'Remarks' - I, as a palaeontology non-expert, cannot get this manuscript structure with a repetition of this same section title.

      Our systematic descriptions follow the standard rules in palaeontology.

      Line 119 - I could not get what this 'Member 5' that was not introduced earlier means.

      In Stratigraphy, ‘member’ is a lithostratigraphic subdivision (a Formation is usually subdivided into several Members).

      Lines 104, 105, 417, ... - The name of the organization or database hosting these IDs (CUB.... and ELIXX....) should also be supplied.

      OK, done.

      Lines 341 and 361 - These two Figures (Figures 1 and 2) have the same caption (with an addition to the one for Figure 1). There should be a distinction based on what is presented in each figure.

      We corrected the caption of Figure 2 and wrote the following: ‘Beretella spinosa gen. et sp. nov.’.

      Line 362-367 - There is no guide about what the individual figure panels (e.g., Figure 2g, 2h, and 2i) show in detail. This guide should be supplied. This also applies to Figure 3a-c - are they anterolateral (a), dorsal (b), and posterolateral (c) views? It is better to write clearly in this way.

      OK, done.

      Figure 3d - The color contrast is not sufficient, and this figure does not look reader-friendly. Plus, the division into Cycloneuralia and Panarthropoda is indicated above the tree, but it is not clear what range of lineages these clades include. For example, is Pliciloricidae included in Cycloneuralia? Also, is Collinsium included in Panarthropoda? This figure looks quite unreliable, and it should be easy to fix.

      OK, done.

      Line 277 legend of Figure 3 - Including the parenthesis only with the program name (IQTREE) is not useful at all. Isn't it enough to describe it in Methods?

      OK, done. We removed (IQTREE).

      Line 380 legend of Figure 3 - I could not get where 'thicker bars' are.

      Known fossil record indicated by thicker vertical bars. We added “vertical”.

      Line 453 - Give full names of the methods, maximum parsimony, and maximum-likelihood.

      OK, done.

      Line 489 - State clearly what 'the recent paper' means.

      We replaced ‘recent’ with ‘present’.

    1. eLife assessment

      The authors report that the neurohormone, bursicon, and its receptor, play a role in regulating aspects of the seasonal polyphenism of the bug, Cacopsylla chinensis. This important study shows that low temperature activates the bursicon signaling pathway during the transition from the summer to the winter form and that it affects cuticle pigment and chitin content, and cuticle thickness. In addition, the authors show that the microRNA miR-6012 targets the bursicon receptor, thereby modulating the function of the bursicon signaling pathway. The study's solid set of experiments and results reveal a role of bursicon signaling in regulating features of polyphenism related to the exoskeleton. Nevertheless, they only incompletely substantiate the authors' claims about the regulation of polyphenism itself.

    2. Joint Public Review:

      Summary:

      Bursicon is a key hormone regulating cuticle tanning in insects. While the molecular mechanisms of its function are rather well studied, especially in the model insect Drosophila melanogaster, its effects and functions in different tissues are less well understood. Here, the authors show that bursicon and its receptor play a role in regulating aspects of the seasonal polyphenism of Cacopsylla chinensis. They found that low temperature treatment activated the bursicon signaling pathway during the transition from summer form to winter form and affected cuticle pigment and chitin content, and cuticle thickness. In addition, the authors show that miR-6012 targets the bursicon receptor, CcBurs-R, thereby modulating the function of the bursicon signaling pathway in the seasonal polyphenism of C. chinensis. This discovery expands our knowledge of the roles of the neuropeptide bursicon in arthropod biology.

      However, the study falls short of its claim that it reveals the molecular mechanisms of a seasonal polyphenism. While cuticle tanning is an important part of the pear psyllid polyphenism, it is not the equivalent of it. First, there are other traits that distinguish between the two morphs, such as ovarian diapause (Oldfield, 1970), and the role of bursicon signaling in regulating these aspects of polyphenism was not measured. Thus, the phenotype in pear psyllids, whereby knockdown of bursicon reduces cuticle tanning, seems to simply reproduce the phenotypes of Drosophila mutants for the bursicon receptor (Loveall and Deitcher, 2010, BMC Dev Biol) in another species (Fig. 2I, 4H). Second, the study fails to address the threshold nature of cuticular tanning in this species, although it is the threshold response (specifically, to temperature and photoperiod) that distinguishes this trait as a part of a polyphenism. Whereas miR-6012 was found to regulate bursicon expression, no evidence is provided that this microRNA either responds to or initiates a threshold response to temperature. In principle, miR-6012 could regulate bursicon whether or not it is part of a polyphenism. Thus, the impact of this work would be significantly increased if it could distinguish between seasonal changes of the cuticle and a bona fide reflection of polyphenism.

      Strengths:

      This study convincingly identifies homologs of the genes encoding the bursicon subunits and its receptor, showing an alignment with those of another psyllid as well as more distant species. It also demonstrates that the stage- and tissue-specific levels of bursicon follow the expected patterns, as informed by other insect models, thus validating the identity of these genes in this species. They provide strong evidence that the expression of bursicon and its receptor depend on temperature, thereby showing that this trait is regulated through both parts of the signaling mechanism.

      Several parallel measurements of the phenotype were performed to show the effects of this hormone, its receptor, and an upstream regulator (miR-6012), on cuticle deposition and pigmentation (if not polyphenism per se, as claimed). Specifically, chitin staining and TEM of the cuticle qualitatively show differences between controls and knockdowns, and this is supported by some statistical tests of quantitative measurements (although see comments below). Thus, this study provides strong evidence that bursicon and its receptor play an important role in cuticle deposition and pigmentation in this psyllid.

      The study identified four miRNAs which might affect bursicon due to sequence motifs. By manipulating levels of synthetic miRNA agonists, the study successfully identified one of them (miR-6012) to cause a cuticle phenotype. Moreover, this miRNA was localized (by FISH) to the cuticle, body-wide. To our knowledge, this is the first demonstrated function for this miRNA, and this study provides a good example of using a gene of known function as an entry point to discovering others influencing a trait. Thus, this finding reveals another level of regulation of cuticle formation in insects.

      Weaknesses:

      (1) The introduction to this manuscript does not accurately reflect progress in the field of mechanisms underlying polyphenism (e.g., line 60). There are several models for polyphenism that have been used to uncover molecular mechanisms in at least some detail, and this includes seasonal polyphenisms in Hemiptera. Therefore, the justification for this study cannot be predicated on a lack of knowledge, nor is the present study original or unique in this line of research (e.g., as reviewed by Zhang et al. 2019; DOI: 10.1146/annurev-ento-011118-112448). The authors are apparently aware of this, because they even provide other examples (lines 104-108); thus the introduction seems misleading as framed.

      (2) The data in Figure 2H show "percent of transition." However, the images in 2I show insects with tanned cuticle (control) vs. those without (knockdown). Yet, based on the description of the Methods provided, there appears to be no distinction between "percent of transition" and "percent with tanning defects". This is an important distinction to make if the authors are going to interpret cuticle defects as a defect in the polyphenism. Furthermore, there is no mention of intermediate phenotypes. The data in 2H are binned as either present or absent, and these are the phenotypes shown in 2I. Was the phenotype really an all-or-nothing response? Instead of binning, which masks any quantitative differences in the tanning phenotypes, the authors should objectively quantify the degree of tanning and plot that. This would show if and to what degree intermediate tanning phenotypes occurred, which would test how bursicon affects the threshold response. This comment also applies to the data in Figures 4G and 6G. Since cuticle tanning is present in more insects than just those with seasonal polyphenism, showing how this responds as a threshold is needed to make claims about polyphenism.

      (3) This study also does not test the threshold response of cuticle phenotypes to levels of bursicon, its receptor, or miR-6012. Hormone thresholds are the most widespread and, in most systems where polyphenism has been studied, the defining characteristic of a polyphenism (e.g., Nijhout, 2003, Evol Dev). Quantitative (not binned) measurements of a polyphenism marker (e.g., chitin) should be shown to respond to hormone titer (or, in the case of the receptor, expression level) in a threshold-like manner, to distinguish defects in polyphenism from those of its component trait.

      (4) Cuticle issue:
      (a) Unlike Fig. 6D and F, Figs. 2D and F do not correspond to each other, especially regarding the lack and reduction of chitin in ds-a+b: by fluorescence microscopy there is hardly any signal, whereas by TEM there is a decent cuticle. Additionally, the dsGFP control cuticle in 2D is cut obliquely with a thick and a thin chitin layer. This is misleading.
      (b) In Figs. 2F and 3F, the endocuticle appears to be missing, a portion of the procuticle that is produced post-molting. As tanning is also occurring post-molting, there seems to be a general problem with cuticle differentiation at this time point. This may be a timing issue. Please clarify.
      (c) To provide background information, it would be useful to analyze cuticle formation in the summer and winter morphs of controls separately by light and electron microscopy. More baseline data on these two morphs are needed.
      (d) For the TEM study, it is not clear whether the same part of the insect's thorax is being sectioned each time, or if that matters. There is not an obvious difference in the number of cuticular layers, but only the relative widths of those layers, so it is difficult to know how comparable those images are. This raises two questions that the authors should clarify. First, is it possible that certain parts of the thoracic cuticle, such as those closer to the intersegmental membrane, are naturally thinner than other parts of the body? Second, is the tanning phenotype based on the thickness or on the number of chitin layers, or both? The data shown later in Figure 4I, J convincingly show that the biosynthesis pathway for chitin is repressed, but any clarification of what this might mean for deposition of chitin would help to understand the phenotypes reported. Also, more details on how the data in Fig. 2G were collected would be helpful. This also goes for the data in Fig. 4 (bursicon receptor knockdowns).

      (5) Tissue issue:
      The timed experiments shown in all figures were done in whole animals. However, we know from Drosophila that Bursicon activity is complex in different tissues. There is, thus, the possibility that the effects detected on different days in whole animals are misleading because different tissues, especially the brain and the epidermis, may respond differentially to the challenge and mask each other's responses. The animal is small, so the extraction from single tissue may be difficult. However, this important issue needs to be addressed.

      (6) No specific information is provided regarding the procedure followed for the rescue experiments with burs-α and burs-β (How were they done? Which concentrations were applied? What were the effects?). These important details should appear in the Materials and Methods and the Results sections.

      (7) Pigmentation
      (a) The protocol used to assess pigmentation needs to be validated. In particular, the following details are needed: Were all pigments extracted? Were pigments modified during extraction? Were the values measured consistent with values obtained, for instance, by light microscopy (which should be done)?
      (b) In addition, pigmentation occurs post-molting; thus, the results could reflect indirect actions of bursicon signaling on pigmentation. The levels of expression of downstream pigmentation genes (ebony, laccase, etc.) should be measured and compared in molting summer vs. winter morphs.

      (8) L236: "while the heterodimer protein of CcBurs α+β could fully rescue the effect of CcBurs-R knockdown on the transition percent (Figure 4G 4H)". This result seems contradictory. If CcBurs-R is the receptor of bursicon, the heterodimer protein of CcBurs α+β should not be able to rescue the effect in CcBurs-R knockdown insects. How can a neuropeptide rescue the effect when its receptor is not there? If these results are valid, then CcBurs-R would not be the (sole) receptor for the CcBurs α+β heterodimer. This is a critical issue for this manuscript and needs to be addressed (also in L337 in Discussion).

      (9) Fig. 5D needs improvement (the magnification is poor) and further explanation and discussion. miR-6012 and CcBurs-R seem to be expressed in complementary tissues--do we see internal tissues also (see problem under point 2)? Again, the magnification is not high enough to understand and appreciate the relationships discussed.

      (10) The schematic in Fig. 7 is a useful summary, but there is a part of the logic that is unsupported by the data, specifically in terms of environmental influence on cuticle formation (i.e., plasticity). What is the evidence that lower temperatures influence expression of miR-6012? The study measures its expression over life stages, whether with an agonist or not, over a single temperature. Measuring levels of expression under summer form-inducing temperature is necessary to test the dependence of miR-6012 expression on temperature. Otherwise, this result cannot be interpreted as polyphenism control, but rather the control of a specific trait.

    1. eLife assessment

      This paper addresses a question regarding the low overlap between genetic variants linked to human complex diseases and variants linked to differences in gene expression. Some of the analyses supporting the main claims are convincing, and the key conclusions are valuable and of interest to readers in the fields of human genetics and functional genomics. However, chromatin accessibility QTL (caQTL) also carry the limitation of not identifying the genes that directly mediate the influence on disease phenotypes.

    2. Reviewer #1 (Public Review):

      Most human traits and common diseases are polygenic, influenced by numerous genetic variants across the genome. These variants are typically non-coding and likely function through gene regulatory mechanisms. To identify their target genes, one strategy is to examine if these variants are also found among genetic variants with detectable effects on gene expression levels, known as eQTLs. Surprisingly, this strategy has had limited success, and most disease variants are not identified as eQTLs, a puzzling observation recently referred to as "missing regulation".

      In this work, Jeong and Bulyk aimed to better understand the reasons behind the gap between disease-associated variants and eQTLs. They focused on immune-related diseases and used lymphoblastoid cell lines (LCLs) as a surrogate for the cell types mediating the genetic effects. Their main hypothesis is that some variants without eQTL evidence might be identifiable by studying other molecular intermediates along the path from genotype to phenotype. They specifically focused on variants that affect chromatin accessibility, known as caQTLs, as a potential marker of regulatory activity.

      The authors present data analyses supporting this hypothesis: several disease-associated variants are explained by caQTLs but not eQTLs. They further show that although caQTLs and eQTLs likely have largely overlapping underlying genetic variants, some variants are discovered only through one of these mapping strategies. Notably, they demonstrate that eQTL mapping is underpowered for gene-distal variants with small effects on gene expression, whereas caQTL mapping is not dependent on the distance to genes. Additionally, for some disease variants with caQTLs but no corresponding eQTLs in LCLs, they identify eQTLs in other cell types.

      Altogether, Jeong and Bulyk convincingly demonstrate that for immune-related diseases, discovering the missing disease-eQTLs requires both larger eQTL studies and a broader range of cell types in expression assays. It remains to be seen what fractions of the missing disease-eQTLs will be discovered with either strategy and whether these results can be extended to other diseases or traits.

      It should be noted that the problem of "missing regulation" has been investigated and discussed in several recent papers, notably Umans et al., Trends in Genetics 2021; Connally et al., eLife 2022; Mostafavi et al., Nat. Genet. 2023. The results reported by Jeong and Bulyk are not unexpected in light of this previous work (all of which they cite), but they add valuable empirical evidence that mostly aligns with the model and discussions presented in Mostafavi et al.

    3. Reviewer #2 (Public Review):

      Summary:

      eQTLs have emerged as a method for interpreting GWAS signals. However, some GWAS signals are difficult to explain with eQTLs. In this paper, the authors demonstrated that caQTLs can explain these signals. This implies that for GWAS signals to translate into disease phenotypes, they need to be accessible within the chromatin.

      However, fundamentally, caQTLs, like GWAS, have the limitation of not being able to determine which genes mediate the influence on disease phenotypes. This limitation is consistent with the constraints observed in this study.

      (1) For reproducibility, details are necessary in the method section.

      - Details about adding YRI samples in ATAC-seq: For example, how many samples are there, and which public data are used? There are LCL-derived iPSC and differentiated iPSC (cardiomyocyte) data, not LCL itself. How does this differ from LCL, and what is the rationale for including these data despite the differences?

      - caQTL is described as having better power than eQTL despite having fewer samples. How does the number of ATAC peaks used in caQTL compare to the number of gene expressions used in eQTL?

      - Details about RNA expression data: In the method section, it states that raw data (ERP001942) was accessed, and in data availability, processed data (E-GEUV-1) was used. These need to be consistent.

      How many samples were used? The text states 373, but how was this reduced from the original 465? The total genotype set is said to comprise 493 samples while ATAC has n=100; what are the 20 others? The text also mentions European samples; does this exclude YRI?

      (2) Experimental results determining which TFs might bind to the representative signals of caQTL are required.

      (3) It is stated that caQTL is less tissue-specific compared to eQTL; would caQTL mapping performed with ATAC-seq data from different cell types yield similar results?

    1. eLife assessment

      This valuable work presents elegant experimental data from the Drosophila embryo supporting the notion that interactions among specific loci, called boundary elements, contribute to topologically associated domain (TAD) formation and gene regulation. The evidence supporting boundary:boundary pairing as a determinant of 3D structures is compelling; however, an inability to deplete loop extruders formally leaves open a possible contribution of loop extrusion. This study will be of interest to the nuclear structure community, particularly those using Drosophila as a model.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors addressed how long-range interactions between boundary elements are established and influence their function in enhancer specificity. Briefly, the authors placed two different reporters separated by a boundary element. They inserted this construct ectopically ~140 kb away from an endogenous locus that contains the same boundary element. The authors used expression patterns driven by nearby enhancers as an output to determine which enhancers the reporters interact with. They complemented this analysis with 3D DNA contact mapping. The authors found that the orientation of the boundary element determined which enhancers each reporter interacted with. They proposed that the 3D interaction topology, whether being circular or stem configuration, distinguished whether the interaction was cohesin mediated or through an independent mechanism termed pairing.

      Strengths:

      The transgene expression assays are built upon prior knowledge of the enhancer activities. The 3D DNA contacts confirm that transgene expression correlates with the contacts. Using 4 different orientations covers all combinations of the reporter genes and the boundary placement.

      Weaknesses:

      The interpretation of the data as a refutation of a role for loop extrusion in TAD formation is not warranted, as the authors did not deplete the loop extruders to show that what they measure is independent of them. As the authors show, the single long DNA loop mediated by cohesin loop extrusion connecting the ectopic and endogenous boundary is clearly inconsistent with the results; therefore, the main conclusion of the paper, that the 3D topology of the boundary elements is a consequence of pairing, is strong. However, loop extrusion and pairing are not mutually exclusive models for the formation of TADs. Loop-extruding cohesin complexes need not make a 140 kb loop; multiple smaller loops could bring together the two boundary elements, which are then held together by pairing proteins that can make circular topologies.

    3. Reviewer #2 (Public Review):

      In Bing et al, the authors analyze micro-C data from NC14 fly embryos, focusing on the eve locus, to assess different models of chromatin looping. They conclude that fly TADs are less consistent with conventional cohesin-based loop extrusion models and instead rely more heavily on boundary-boundary pairings in an orientation-dependent manner.

      Overall, I found the manuscript to be interesting and thought-provoking. However, this paper reads much more like a perspective than a research article. Considering the journal is aimed at the general audience, I strongly suggest the authors spend some time editing their introduction to the most salient points as well as organizing their results section in a more conventional way with conclusion-based titles. It was very difficult to follow the authors' logic throughout the manuscript as written. It was also not clear as written which experiments were performed as part of this study and which were reanalyzed but published elsewhere. This should be made clearer throughout.

      It has been shown several times that Drosophila Hi-C maps do not contain all of the features (frequent corner peaks, stripes, etc.) observed when compared to mammalian cells. Considering these features are thought to be products of extrusion events, it is not an entirely new concept that Drosophila domains form via mechanisms other than extrusion. That being said, the authors' analyses do not distinguish between the formation and the maintenance of domains. It is not clear to this reviewer why a single mechanism should explain the formation of the complex structures observed in static Hi-C heatmaps from a population of cells at a single developmental time point. For example, how can the authors rule out that extrusion initially provides the necessary proximity and possibly the cis preference of contacts required for boundary-boundary pairing whereas the latter may more reflect the structures observed at maintenance? Future work aimed at analyzing micro-C data in cohesin-depleted cells might shed additional light on this.

      Additional mechanisms at play include compartment-level interactions driven by chromatin states. Indeed, in mammalian cells, these interactions often manifest as a "plume" on Hi-C maps similar to what the authors attribute to boundary interactions in this manuscript. How do the chromatin states in the neighboring domains of the eve locus impact the model if at all?

      How does intrachromosomal homolog pairing impact the models proposed in this manuscript (Abed et al., 2019; Erceg et al., 2019)? Several recent papers have shown that somatic homolog pairing is not uniform and varies significantly across the genome, with evidence for both tightly and loosely paired regions. Might loose pairing interactions have the capacity to alter the cis configuration of the eve locus?

      In summary, the transgenic experiments are extensive and elegant and fully support the authors' models. However, in my opinion, they do not completely rule out additional models at play, including extrusion-based mechanisms. Indeed, my major issue is the limited conceptual advance in this manuscript. The authors essentially repeat much of their previous work and analyses. The authors make no attempt to dissect the mechanism of this process by modifying extrusion components directly. Some discussion of Rollins et al., 1999 on the discovery of Nipped-B and its role in enhancer-promoter communication should also be included to reconcile their conclusions with the proposed absence of extrusion events.

    4. Reviewer #3 (Public Review):

      Bing et al. attempt to address fundamental mechanisms of TAD formation in Drosophila by analyzing gene expression and 3D conformation within the vicinity of the eve TAD after insertion of a transgene harboring a Homie insulator sequence 142 kb away in different orientations. These transgenes along with spatial gene expression analysis were previously published in Fujioka et al. 2016, and the underlying interpretations regarding resulting DNA configuration in this genomic region were also previously published. This manuscript repeats the expression analysis using smFISH probes in order to achieve more quantitative analysis, but the main results are the same as previously published. The only new data are the Micro-C and an additional modeling/analysis of what they refer to as the 'Z3' orientation of the transgenes. The rest of the manuscript merely synthesizes further interpretation with the goal of addressing whether loop extrusion may be occurring or if boundary:boundary pairing without loop extrusion is responsible for TAD formation. The authors conclude that their results are more consistent with boundary:boundary pairing and not loop extrusion; however, most of this imaging data seems to support both loop extrusion and the boundary:boundary models. This manuscript lacks support, especially new data, for its conclusions. Furthermore, there are many parts of the manuscript that are difficult to follow. There are some minor errors in the labelling of the figures that if fixed would help elevate understanding. Lastly, there are several major points that if elaborated on, would potentially be helpful for the clarity of the manuscript.

      Major Points:

      (1) The authors suggest and attempt to visualize in the supplemental figures, that loop extrusion mechanisms would appear during crosslinking and show as vertical stripes in the micro-C data. In order to see stripes, a majority of the nuclei would need to undergo loop extrusion at the same rate, starting from exactly the same spots, and the loops would also have to be released and restarted at the same rate. If these patterns truly result from loop extrusion, the authors should provide experimental evidence from another organism undergoing loop extrusion.<br /> (2) On lines 311-314, the authors discuss that stem-loops generated by cohesin extrusion would possibly be expected to have more next-next-door neighbor contacts than next-door neighbor contacts and cite their models in Figure 1. Based on the boundary:boundary pairing models in the same figure, would the stem-loops created by head-to-tail pairing also have the same phenotype, making enrichment of next-next-door neighbor contacts possible in both situations? The concepts in the text are not clear, and the diagrams are not well-labeled relative to the two models.<br /> (3) The authors appear to cite Chen et al., 2018 as a reference for the location of these transgenes being 700 nm away in a majority of the nuclei. However, the exact transgenes in this manuscript do not appear to have been measured for distance. The authors could do this experiment and include expression measurements.<br /> (4) The authors discuss the possible importance of CTCF orientation in forming the roadblock to cohesin extrusion and discuss that Homie orientation in the transgene may impact Homie function as an effective roadblock. However, the Homie region inserted in the transgene does not contain the CTCF motif. Can the authors elaborate on why they feel the orientation of Homie is important in its ability to function as a roadblock if the CTCF motif is not present? Trans-acting factors responsible for Homie function have not been identified and this point is not discussed in the manuscript.<br /> (5) The imaging results seem to be consistent with both boundary:boundary interaction and loop extrusion stem looping.<br /> (6) The authors suggest that the eveMa TAD could only be formed by extrusion after the breakthrough of Nhomie and several other roadblocks. Additionally, the overall long-range interactions with Nhomie appear to be less than the interactions with endogenous Homie (Figures 7, 8, and supplemental 5). Is it possible that in some cases boundary:boundary pairing is occurring between only the transgenic Homie and endogenous Homie and not including Nhomie?<br /> (7) In Figure 4E, the GFP hebe expression shown in the LhomieG Z5 transgenic embryo does not appear in the same locations as the LlambdaG Z5 control. Is this actually hebe expression or just a background signal?<br /> (8) Figure 6: The LhomieG Z3 late-stage embryo appears to be showing the ventral orientation of the embryo rather than the lateral side of the embryo as was shown in the previous figure. Is this for a reason? Additionally, there are no statistics shown for the Z3 transgenic images. Were these images analyzed in the same way as the Z5 line images?<br /> (9) Do the Micro-C data align with the developmental time points used in the smFISH probe assays?

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors addressed how long-range interactions between boundary elements are established and influence their function in enhancer specificity. Briefly, the authors placed two different reporters separated by a boundary element. They inserted this construct ectopically ~140 kb away from an endogenous locus that contains the same boundary element. The authors used expression patterns driven by nearby enhancers as an output to determine which enhancers the reporters interact with. They complemented this analysis with 3D DNA contact mapping. The authors found that the orientation of the boundary element determined which enhancers each reporter interacted with. They proposed that the 3D interaction topology, whether being circular or stem configuration, distinguished whether the interaction was cohesin mediated or through an independent mechanism termed pairing.

      Strengths:

      The transgene expression assays are built upon prior knowledge of the enhancer activities. The 3D DNA contacts confirm that transgene expression correlates with the contacts. Using 4 different orientations covers all combinations of the reporter genes and the boundary placement.

      Weaknesses:

      The interpretation of the data as a refutation of a role for loop extrusion in TAD formation is not warranted, as the authors did not deplete the loop extruders to show that what they measure is independent of them.

      (1.1) To begin with, our findings do not exclude the possibility that cohesin loop extrusion has some sort of role in the formation or maintenance of TADs in flies or other aspects of chromosome structure.  On the other hand, it clearly is not determinative in defining the end-points of TADs or in generating the resulting topology (stem-loop or circle-loop).  Our main point, which we feel we have established unequivocally, is that it can’t explain many essential features of TADs or chromosome loops (see below) in Drosophila.  This reviewer agrees with this point in their next paragraph (below).  We also think that the loop extrusion model’s general acceptance as THE driving force behind TAD formation in mammals is unwarranted and not fully consistent with the available data, as explained below.

      As to the reviewer’s specific point regarding depletion of loop extruders, we first note that completely eliminating factors encoding cohesin subunits in fly embryos isn’t readily feasible.  As cohesin is essential starting at the beginning of embryonic development, and is maternally deposited, knockdowns/depletions would likely be incomplete and there would always be some remaining activity.  As long as there is some residual activity—and no disruption in TAD formation is observed—this experimental test would be a failure.  In addition, any defects that are observed might arise not from a failure in TAD formation via loop extrusion but rather because the rapid mitotic cycles would be disrupted.  A far better approach would be to deplete/knockdown cohesin subunits in tissue culture cells, as there is no requirement for the cells to undergo embryonic development.  Moreover, since cell division is relatively slow, the depletion would likely eliminate much if not all of the activity before a checkpoint is reached.

      While a drastic depletion of cohesin is not feasible in our model organism, we would draw the reviewer’s attention to an experiment of this type which has already been done in mammalian tissue culture cells by Goel et al. (Goel et al. 2023).  Unlike most Hi-C studies in mammals, the authors used region capture MicroC (RCMC).  In contrast to published genome-wide mammalian MicroC experiments (c.f., (Hsieh et al. 2020; Krietenstein et al. 2020)) which require large bin sizes to visualize mammalian “TADs,” the resolution of the experiments in Goel et al. (Goel et al. 2023) is similar to the resolution in our MicroC experiments (200-400 bp).  A MicroC contact map from Goel et al. shows the Ppm1g locus on chromosome 5 before and after Rad21 depletion.  The contact map visualizes a 250 kb DNA segment, which is only slightly larger than the ~230 kb DNA segment in Fig. 2C in our paper.

      In this experiment, there was a 97% reduction in the amount of Rad21.  However, as can be seen by comparing the contact profiles above and below the diagonal, there is little or no difference in TAD organization after cohesin depletion when individual TADs are visualized with a bin size of 250 bp.  These results would indicate that mammalian TADs do not require cohesin.

      Note also that the weak 45° stripes connecting different TADs (c.f. blue/green arrowheads) are still present after Rad21 depletion.  In the most popular version of the loop extrusion model, cohesin loads at a site(s) somewhere in the TAD-to-be, and then extrudes both strands until it bumps into CTCF roadblocks.  As illustrated in Figure Sup 2, this mechanism generates a vertical stripe originating at the cohesin loading site and extending until cohesin bumps into the left or right roadblock, at which point the stripe transitions into a 45° stripe that ends when cohesin bumps into the other roadblock.  While 45° stripes are visible, there is no hint of a vertical stripe.  This suggests that the mechanism for generating stripes, if it is an active mechanism (rather than passive diffusion), may be quite different.  The 45° stripes must be generated by a factor(s) that is anchored to one (blue arrowhead) or both (green arrowhead) boundaries.  In addition, this factor, whatever it is, is not cohesin.  The reason for this is that the 45° stripes are present both before and after Rad21 depletion.  Moreover, if one were to imagine that the stripes represent a process involved in TAD formation, this process does not require cohesin (see Goel et al 2023).

      It is worth noting another observation that is inconsistent with the cohesin loop extrusion/CTCF roadblock model for TAD formation/maintenance.  CTCF is not found at all of the TAD boundaries in this 250 kb DNA region.  This would suggest that there are other DNA binding proteins that have chromosomal architectural functions besides CTCF.  In flies, many of the chromosomal architectural proteins are, like CTCF, polydactyl zinc finger (PZF) proteins (Bonchuk et al. 2021; Bonchuk et al. 2022; Fedotova et al. 2017).  These include Su(Hw), CTCF, Pita, Zipic and CLAMP.  The PZF family in flies is quite large.  There are ~250 different PZF genes, and since only a handful of these have been characterized, it seems likely that additional members of this family will have architectural functions.  Thus far, only one boundary protein, CTCF, has received attention in studies on mammalian chromosome architecture.  As the mammalian genome is much larger and more complicated than the fly genome, it is difficult to believe that CTCF is the sole chromosomal architectural protein in mammals.  In this respect, it is worth noting that there are ~800 members of the PZF family in mammalian genomes (Fedotova et al. 2017).

      Goel et al. (Goel et al. 2023) did observe alterations in the contact profiles after Rad21 depletion when they visualized the Ppm1g region at much lower resolution (bin sizes of 5 kb and 1 kb). The 5 kb bin size visualizes a region of ~1.2 Mb, while the 1 kb bin size visualizes a region that spans ~800 kb.  These large triangular units do not correspond to the individual TADs seen when Goel et al. visualized the Ppm1g locus at 250 bp resolution. 

      Nor do they correspond to TADs in Fig. 2 of our paper.  Instead they represent TAD neighborhoods which likely consist of 20-30 or more individual TADs.  Consequently the alterations in contact patterns seen after Rad21 depletion are occurring at the level of TAD neighborhoods.  This can be seen by comparing pixel density inside the blue lines before (above the diagonal) and after Rad21 depletion (below the diagonal) (Goel et al 2023).  The more distant contacts between individual TADs within this neighborhood are preferentially reduced by Rad21 depletion (the region below and to the left of the double arrowhead).  By contrast, the TADs themselves are unaffected, as are contacts between individual TADs and their immediate neighbors (see purple and light green asterisk).  The other interesting feature is the loss of contacts between what appear to be partially overlapping neighborhoods.  This loss of neighborhood-to-neighborhood contacts can be seen in the region located between the green and blue lines.  The neighborhood that appears to partially overlap the Ppm1g neighborhood is outlined in purple.

      It is worth noting that, with the exception of the high resolution experiments in Goel et al., all of the other studies on cohesin (and CTCF) have examined the effects on contact maps within (and between) large neighborhoods (bin sizes >1 kb).  In most cases, these large neighborhoods are likely to be composed of many individual TADs like those seen in Goel et al. and in Fig. 2 of our paper.  We also observe larger neighborhoods in the fly genome, though they do not appear to be as large as those in mammals.  Our experiments do not address what role cohesin might have in facilitating contacts between more distant TADs located within the same neighborhoods, or between TADs in different neighborhoods, or whether loop extrusion is involved.

      We would also note that the Drosophila DNA segment in Fig. 2C contains 35 different genes, while the mammalian DNA segment shown in Fig. 1 has only 9.  Thus, in this part of the fly genome, Pol II genes are more densely packed than in the mammalian DNA segment.  Much of the fly genome is also densely packed, and the size of individual TADs will likely be smaller, on average, than in mammals.  Nevertheless, the MicroC profiles are not all that different.  As is also common in flies, each TAD in the Ppm1g region only encompasses one or two genes.  Note also that there are no volcano triangles with plumes as would be predicted for TADs that have a stem-loop topology.

      In fact, as shown in Author response image 1, the high-resolution contact profile for the Ppm1g region shows a strong resemblance to that observed for the fly Abd-B regulatory domains.  These regulatory domains are part of a larger neighborhood that encompasses the abd-A and Abd-B genes and their regulatory domains.

      Author response image 1.

      Abd-B regulatory domains

      As the authors show, the single long DNA loop mediated by cohesin loop extrusion connecting the ectopic and endogenous boundary is clearly inconsistent with the results; therefore, the main conclusion of the paper, that the 3D topology of the boundary elements is a consequence of pairing, is strong. However, loop extrusion and pairing are not mutually exclusive models for the formation of TADs. Loop-extruding cohesin complexes need not make a 140 kb loop; multiple smaller loops could bring together the two boundary elements, which are then held together by pairing proteins that can make circular topologies.

      (1.2) In the pairing model, distant boundaries bump into each other (by random walks or partially constrained walks), and if they are “compatible” they pair with each other, typically in an orientation-dependent manner.  As an alternative, the reviewer argues that cohesin need not make one large 140 kb loop.  Instead it could generate a series of smaller loops (presumably corresponding to the intervening TADs).  These smaller loops would bring homie in the transgene in close proximity to the eve locus so that it could interact with the endogenous homie and nhomie elements in the appropriate orientation, and in this way only one of the reporters would be ultimately activated.

      There are two problems with the idea that cohesin-dependent loop extrusion brings transgene homie into contact with homie/nhomie in the eve locus by generating a series of small loops (TADs).  The first is the very large distances over which specific boundary:boundary pairing interactions can occur.  The second is that boundary:boundary pairing interactions can take place not only in cis, but also in trans.

      We illustrate these points with several examples. 

      Fujioka et al. 2016, Fig 7 shows an experiment in which attP sites located ~2 Mb apart were used to insert two different transgenes, one containing a lacZ reporter and the other containing the eve anal plate enhancer (AP) (Fujioka et al. 2016).  If the lacZ reporter and the AP transgenes also contain homie, the AP enhancer can activate lacZ expression (panel A).  On the other hand, if one of the transgenes has lambda DNA instead of homie, no regulatory interactions are observed (panel A).  In addition, as is the case in our experiments using the -142 kb platform, orientation matters.  In the combination on the top left, the homie boundary is pointing away from both the lacZ reporter and the AP enhancer.  Since homie pairs with itself head-to-head, pairing brings the AP enhancer into contact with the lacZ reporter.  A different result is obtained for the transgene pair in panel A on the top right.  In this combination, homie is pointing away from the lacZ reporter, while it is pointing towards the AP enhancer.  As a consequence, the reporter and enhancer are located on opposite sides of the paired homie boundaries, and in this configuration they are unable to interact with each other.

      On the top left of panel B, the homie element in the AP enhancer transgene was replaced by a nhomie boundary oriented so that it is pointing towards the enhancer.  Pairing of homie and nhomie head-to-tail brings the AP enhancer in the nhomie transgene into contact with the lacZ reporter in the homie transgene, and it activates reporter expression.  Finally, like homie, nhomie pairs with itself head-to-head, and when the nhomie boundaries are pointing towards both the AP reporter and the lacZ reporter, reporter expression is turned on.

      Long distance boundary-dependent pairing interactions by the bithorax complex Mcp boundary have also been reported in several papers.  Fig. 6 from Muller et al. (Muller et al. 1999) shows the pattern of regulatory interactions (in this case PRE-dependent “pairing-sensitive silencing”) between transgenes that have a mini-white reporter, the Mcp and scs’ boundaries and a PRE that is located close to Mcp.  In this experiment flies carrying transgenes inserted at the indicated sites on the left and right arms of the 3rd chromosome were mated in pairwise combinations, and their trans-heterozygous progeny examined for pairing-sensitive silencing of the mini-white reporter.

      Two examples of long-distance pairing-sensitive silencing mediated by Mcp/scs’ are shown in Fig. 5b from Muller et al. 1999.  The transgene inserts in panel A are w#12.43 and ff#10.5.  w#12.43 is inserted close to the telomere of 3R at 99B.  ff#10.5 is inserted closer to the middle of 3R at 91A.  The estimated distance between them is 11.3 Mb.  The transgene inserts in panel B are ff#10.5 and ff#11.102.  ff#11.102 is inserted at 84D, and the distance between them is 11 Mb.  Normally, the eye color phenotype of the mini-white reporter is additive: homozygous inserts have twice as dark an eye color as hemizygous inserts, while in trans-heterozygous flies the eye color would be the sum of the two different transgenes.  However, when a PRE is present and the transgenes can pair, silencing is observed.  In panel A, the trans-heterozygous combination has a lighter eye color than either of the parents.  In panel B, the trans-heterozygous combination is darker than one of the parents (ff#10.5) but much lighter than the other (ff#11.102).

      All ten of the transgenes tested were able to engage in long-distance (>Mbs) trans regulatory interactions; however, likely because of how the chromosome folds on the Mb scale (e.g., the location of meta-loops: see #2.1 and Author response image 3), not all of the possible pairwise silencing interactions are observed.  The silencing interactions shown in Muller et al. are between transgenes inserted on different homologs.  Mcp/scs'-dependent silencing interactions can also occur in cis.  Moreover, just like the homie and nhomie experiments described above, Muller et al. (Muller et al. 1999) found that Mcp could mediate long-distance activation of mini-white and yellow by their respective enhancers.

      The pairing-sensitive activity of the PRE associated with the Mcp boundary is further enhanced when the mini-white transgene has the scs boundary in addition to Mcp and scs’.  In the experiment shown in Fig. 8 from Muller et al. 1999, the pairing-sensitive silencing interactions of the Mcp/scs’/scs transgene are between transgenes inserted on different chromosomes.  Panel A shows pairing-sensitive silencing between w#15.60, which is on the X chromosome, and w#15.102, which is on the 2nd chromosome.  Panel B shows pairing-sensitive silencing between the 2nd chromosome insert w#15.60 and a transgene, w#15.48, which is inserted on the 3rd chromosome.

      The long-distance trans and cis interactions described here are not unique to homie, nhomie, Mcp, scs’, or scs.  Precisely analogous results have been reported by Sigrist and Pirrotta (Sigrist and Pirrotta 1997) for the gypsy boundary when the bxd PRE was included in the mini-white transgene.  Also like the Mcp-containing transgenes in Muller et al. (Muller et al. 1999), Sigrist and Pirrotta observed pairing-sensitive silencing between gypsy bxd PRE mini-white transgenes inserted on different chromosomes.  Similar long-distance (Mb) interactions have been reported for Fab-7 (Bantignies et al. 2003; Li et al. 2011).  In addition, there are examples of “naturally occurring” long-distance regulatory and/or physical interactions.  One would be the regulatory/physical interactions between the p53 enhancer upstream of reaper and Xrp1, which were described by Link et al. (Link et al. 2013).  Another would be the nearly 60 meta-loops identified by Mohana et al. (Mohana et al. 2023).

      Like homie at -142 kb, the regulatory interactions (pairing-sensitive silencing and enhancer activation of reporters) reported in Muller et al. (Muller et al. 1999) involve direct physical interactions between the transgenes.  Vazquez et al. (Vazquez et al. 2006) used the lacI/lacO system to visualize contacts between distant scs/Mcp/scs’-containing transgenes in imaginal discs.  As indicated in Vazquez et al. 2006, Table 3 lines #4-7, when both transgenes had Mcp and were inserted on the same chromosome, they colocalized in trans-heterozygotes (single dot) in 94% to 97% of the disc nuclei in the four pairwise combinations they tested.  When the transgenes both lacked Mcp (Vazquez et al. 2006, Table 3 #1), co-localization was observed in 4% of the nuclei.  When scs/Mcp/scs’-containing transgenes on the 2nd and 3rd chromosome were combined (Vazquez et al. 2006, Table 3 #8), colocalization was observed in 96% of the nuclei.  They also showed that four different scs/Mcp/scs’ transgenes (two at the same insertion site but on different homologs, and two at different sites on different homologs) co-localized in 94% of the eye imaginal disc nuclei (Vazquez et al. 2006, Table 3 #9).  These pairing interactions were also found to be stable over several hours.  Similar co-localization experiments together with 3C were reported by Li et al. (Li et al. 2011).

      The de novo establishment of trans interactions between compatible boundary elements has been studied by Lim et al. (Lim et al. 2018).  These authors visualized transvection (enhancer activation of an MS2 loop reporter in trans) mediated by the gypsy insulator, homie and Fab-8 in NC14 embryos.  When both transgenes shared the same boundary element, transvection/physical pairing was observed in a small subset of embryos.  The interactions took place after a delay and increased in frequency as the embryo progressed into NC14.  As expected, transvection was specific: it was not observed when the transgenes had different boundaries.  For homie it was also orientation-dependent.  It was observed when homie was orientated in the same direction in both transgenes, but not when homie was orientated in opposite directions in the two transgenes.

      While one could imagine that loop extrusion-dependent compaction of the chromatin located between eve and the transgene at -142 kb into a series of small loops (the intervening TADs) might be able to bring homie in the transgene close to homie/nhomie in the eve locus, there is no cohesin-based loop extrusion scenario that would bring transgenes inserted at sites 6 Mb, 11 Mb, on different sides of the centromere, or at opposite ends of the 3rd chromosome together so that the distant boundaries recognize their partners and physically pair with each other.  Nor is there a plausible cohesin-based loop extrusion mechanism that could account for the fact that most of the documented long-distance interactions involve transgenes inserted on different homologs.  This is not to mention the fact that long-distance interactions are also observed between boundary-containing transgenes inserted on different chromosomes.

      In fact, given these results, one would logically come to precisely the opposite conclusion.  If boundary elements inserted Mbs apart, on different homologs and on different chromosomes can find each other and physically pair, it would be reasonable to think that the same mechanism (likely random collisions) is entirely sufficient when they are only 142 kb apart.

      Yet another reason to doubt the involvement or need for cohesin-dependent loop extrusion in bringing the transgene homie in contact with the eve locus comes from the studies of Goel et al. (Goel et al. 2023).  They show that cohesin has no role in the formation of TADs in mammalian tissue culture cells.  So if TADs in mammals aren’t dependent on cohesin, there would not be a good reason to think at this point that the loops (TADs) that are located between eve and the transgene are generated by, or even strongly dependent on, cohesin-dependent loop extrusion.

      It is also important to note that even if loop-extrusion were to contribute to chromatin compaction in this context and make the looping interactions that lead to orientation-specific pairing more efficient, the role of loop extrusion in this model is not determinative of the outcome, it is merely a general compaction mechanism.  This is a far cry from the popular concept of loop extrusion as being THE driving force determining chromosome topology at the TAD level.

      Reviewer #2 (Public Review):

      In Bing et al., the authors analyze micro-C data from NC14 fly embryos, focusing on the eve locus, to assess different models of chromatin looping. They conclude that fly TADs are less consistent with conventional cohesin-based loop extrusion models and instead rely more heavily on boundary-boundary pairings in an orientation-dependent manner.

      Overall, I found the manuscript to be interesting and thought-provoking. However, this paper reads much more like a perspective than a research article. Considering eLife is aimed at a general audience, I strongly suggest the authors spend some time editing their introduction down to the most salient points as well as organizing their results section in a more conventional way with conclusion-based titles. It was very difficult to follow the authors' logic throughout the manuscript as written. It was also not clear as written which experiments were performed as part of this study and which were reanalyzed but published elsewhere. This should be made clearer throughout.

      It has been shown several times that Drosophila Hi-C maps do not contain all of the features (frequent corner peaks, stripes, etc.) observed when compared to mammalian cells. Considering these features are thought to be products of extrusion events, it is not an entirely new concept that Drosophila domains form via mechanisms other than extrusion.

      (2.1) While there are differences between the Hi-C contact profiles in flies and mammals, these differences likely reflect in large part the bin sizes used to visualize contact profiles.  With the exception of Goel et al. (Goel et al. 2023), most of the mammalian Hi-C studies have been low resolution restriction enzyme-based experiments, and required bin sizes of 1 kb or greater to visualize what are labeled as “TADs.”  In fact, as shown by experiments in Goel et al., these are not actually TADs, but rather a conglomeration of multiple TADs into a series of TAD neighborhoods.  The same is true for the MicroC experiments of Krietenstein et al. and Hsieh et al. on human and mouse tissue culture cells (Hsieh et al. 2020; Krietenstein et al. 2020).  This is shown in Author response image 2.  In this image, we have compared the MicroC profiles generated from human and mouse tissue culture cells with fly MicroC profiles at different levels of resolution.

      For panels A-D, the genomic DNA segments shown are approximately 2.8 Mb, 760 kb, 340 kb, and 190 kb.  For panels E-H, the genomic DNA segments shown are approximately 4.7 Mb, 870 kb, 340 kb and 225 kb.  For panels I-L, the genomic DNA segments shown are approximately 3 Mb, 550 kb, 290 kb and 175 kb.

      As reported for restriction enzyme-based Hi-C experiments, a series of stripes and dots are evident in mammalian MicroC profiles.  In the data from Krietenstein et al., two large TAD “neighborhoods” are evident with a bin size of 5 kb, and these are bracketed by 45° stripes (A: black arrows).  At 1 kb (panel B), the 45° stripe bordering the neighborhood on the left no longer defines the edge of the neighborhood (blue arrow: panel B), and both stripes become discontinuous (fuzzy dots).  At 500 bp (panel C) and 200 bp (panel D) bin sizes, the stripes largely disappear (black arrows) even though they were the most prominent feature in the TAD landscape with large bin sizes.  At 200 bp, the actual TADs (as opposed to the forest) are visible, but weakly populated.  There are no stripes, and only one of the TADs has an obvious “dot” (green asterisk: panel C).
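
      The general effect of bin size described above can be reproduced with a toy contact map.  The sketch below is only an illustration under stated assumptions: it does not use any of the MicroC data sets discussed here, the block-diagonal map and its contact values are invented, and the helper names toy_map and rebin are hypothetical.  It shows that when adjacent small TADs fall within a single large bin, the boundaries between them can no longer be resolved and they appear as one "neighborhood".

```python
import numpy as np


def toy_map(tad_sizes, within=10.0, between=1.0):
    """Block-diagonal toy map: high contact within TADs, low between them."""
    n = sum(tad_sizes)
    contacts = np.full((n, n), between)
    start = 0
    for size in tad_sizes:
        contacts[start:start + size, start:start + size] = within
        start += size
    return contacts


def rebin(contacts, factor):
    """Sum a square contact matrix into bins that are `factor` times larger."""
    n = contacts.shape[0] - contacts.shape[0] % factor  # drop any ragged edge
    m = n // factor
    trimmed = contacts[:n, :n]
    # Group rows and columns into blocks of `factor` and sum each block.
    return trimmed.reshape(m, factor, m, factor).sum(axis=(1, 3))


fine = toy_map([4, 4, 4, 4])  # four small TADs, each spanning 4 fine bins
per_tad = rebin(fine, 4)      # one coarse bin per TAD: 4 x 4 matrix
merged = rebin(fine, 8)       # two TADs per coarse bin: 2 x 2 matrix

print(per_tad.shape, merged.shape)
```

      In the 8-fold rebinned matrix each diagonal entry contains two TADs plus the contacts between them, so the internal boundary is invisible at that bin size even though it is present in the underlying data.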

      Author response image 2.

      Mammalian MicroC profiles at different bin sizes.

      Large TAD neighborhoods bordered by stripes are also evident in the Hsieh et al. data set in Author response image 2 panels E and F (black arrows in E and F and green arrow in F).  At 400 bp resolution (panel G), the narrow stripe in panel F (black arrows) becomes much broader, indicating that it is likely generated by interactions across one or two small TADs that can be discerned at 200 bp resolution.  The same is true for the broad stripe indicated by the green arrows in panels F, G and H.  This stripe arises from contacts between the TADs indicated by the red bar in panels G and H and the TADs to the other side of the volcano triangle with a plume (blue arrow in panel H).  As in flies, we would expect that this volcano triangle topped by a plume corresponds to a stem-loop.  However, the resolution is poor at 200 bp, and the profiles of the neighboring TADs are not very distinct.

      For the fly data set, stripes can be discerned when analyzed at 800 bp resolution (see arrows in Author response image 3);  however, these stripes are flanked by regions of lower contact, and represent TAD-TAD interactions.  At 400 bp, smaller neighborhoods can be discerned, and these neighborhoods exhibit a complex pattern of interaction with adjacent neighborhoods.  With bin sizes of 200 bp, individual TADs are observed, as are TAD-TAD interactions like those seen near eve.  Some of the TADs have dots at their apex, while others do not—much like what is seen in the mammalian MicroC studies.

      Author response image 3.

      Mammalian MicroC profiles at different bin sizes.

      Stripes: As illustrated in Author response image 2 A-D and E-H, the continuous stripes seen in low resolution mammalian studies (>1 kb bins) would appear to arise from binning artefacts.  At high resolution where single TADs are visible, the stripes seem to be generated by TAD-TAD interactions, and not by some type of “extrusion” mechanism.  This is most clearly seen for the volcano with plume TAD in Author response image 2 G and H.  While stripes in Author response image 2 disappear at high resolution, this is not always true.  There are stripes that appear to be “real” in Goel et al. 2023 for the TADs in the Ppm1g region, and in Author response image 1 for the Abd-B regulatory domain TADs.  Since the stripes in the Ppm1g region are unaffected by Rad21 depletion, some other mechanism must be involved (c.f. (Shidlovskii et al. 2021)).

      Dots: The high resolution images of mammalian MicroC experiments in Author response image 2D and H show that, like Drosophila (Author response image 3L), mammalian TADs don’t always have a “dot” at the apex of the triangle.  This is not surprising.  In the MicroC procedure, fixed chromatin is digested to mononucleosomes with MNase.  Since most TAD boundaries in flies, and presumably also in mammals, are relatively large (150-400 bp) nuclease hypersensitive regions, extensive MNase digestion will typically reduce the boundary element sequences to oligonucleotides.

      In flies, the only known sequences (at least to date) that end up giving dots (like those seen in Author response image 1) are bound by a large (>1,000 kd) GAF-containing multiprotein complex called LBC.  In the Abd-B region of BX-C, LBC binds to two ~180 bp sequences in Fab-7 (dHS1 and HS3: Kyrchanova et al. 2018; Wolle et al. 2015), and to the centromere proximal (CP) side of Fab-8.  The LBC elements in Fab-7 (dHS1) and Fab-8 (CP) have both blocking and boundary bypass activity (Kyrchanova et al. 2023; Kyrchanova et al. 2019a; Kyrchanova et al. 2019b; Postika et al. 2018).  Elsewhere, LBC binds to the bx and bxd PREs in the Ubx regulatory domains, to two PREs upstream of engrailed, to the hsp70 promoter, the histone H3-H4 promoters, and the eve promoter (unpublished data).  Based on ChIP signatures, it likely binds to most PREs/tethering elements in the fly genome (Batut et al. 2022; Li et al. 2023).  Indirect end-labeling experiments (Galloni et al. 1993; Samal et al. 1981; Udvardy and Schedl 1984) indicate that LBC protects an ~150-180 bp DNA segment from MNase digestion, which would explain why LBC-bound sequences are able to generate dots in MicroC experiments.  Also unlike typical boundary elements, the pairing interactions of the LBC elements we’ve tested appear to be orientation-independent (unpublished data).

      The difference in MNase sensitivity between typical TAD boundaries and LBC-bound elements is illustrated in the MicroC of the Leukocyte-antigen-related-like (Lar) meta-loop in Author response image 4 panels A and B.  Direct physical pairing of two TAD boundaries (blue and purple) brings two TADs encompassing the 125 kb Lar gene into contact with two TADs in a gene poor region 620 kb away.  This interaction generates two regions of greatly enhanced contact: the two boxes on either side of the paired boundaries (panel A).  Note that like transgene homie pairing with the eve boundaries, the boundary pairing interaction that forms the Lar meta-loop is orientation-dependent.  In this case the TAD boundary in the Lar locus pairs with the TAD boundary in the gene poor region head-to-head (arrow tip to arrow tip), generating a circle-loop.  This circle-loop configuration brings the TAD upstream of the blue boundary into contact with the TAD upstream of the purple boundary.  Likewise, the TAD downstream of the blue boundary is brought into contact with the TAD downstream of the purple boundary.

      In the MicroC procedure, the sequences that correspond to the paired boundaries are not recovered (red arrow in Author response image 4 panel B).  This is why there are vertical and horizontal blank stripes (red arrowheads) emanating from the missing point of contact.  Using a different HiC procedure (dHS-C) that allows us to recover sequences from typical boundary elements (Author response image 4 panels C and D), there is a strong “dot” at the point of contact which corresponds to the pairing of the blue and purple boundaries.

      There is a second dot (green arrow) within the box that represents physical contacts between sequences in the TADs downstream of the blue and purple boundaries.  This dot is resistant to MNase digestion and is visible both in the MicroC and dHS-C profiles.  Based on the ChIP signature of the corresponding elements in the two TADs downstream of the blue and purple boundaries, this dot represents paired LBC elements.

      Author response image 4.

      Lar meta-loop. Panels A & B: MicroC. Panels C & D: dHS-C.

      That being said, the authors' analyses do not distinguish between the formation and the maintenance of domains. It is not clear to this reviewer why a single mechanism should explain the formation of the complex structures observed in static Hi-C heatmaps from a population of cells at a single developmental time point. For example, how can the authors rule out that extrusion initially provides the necessary proximity and possibly the cis preference of contacts required for boundary-boundary pairing whereas the latter may more reflect the structures observed at maintenance?

      (2.2) The MicroC profiles shown in Fig. 2 of our paper were generated from nuclear cycle (NC) 14 embryos.  NC14 is the last nuclear cycle before cellularization (Foe 1989).  After the nuclei exit mitosis, S phase begins, and because satellite sequences are late replicating in this nuclear cycle, S phase lasts 50 min instead of only 4-6 min during earlier cycles (Shermoen et al. 2010).  So unlike MicroC studies in mammals, our analysis of chromatin architecture in NC14 embryos likely offers the best opportunity to detect any intermediates that are generated during TAD formation.  In particular, we should be able to observe evidence of cohesin linking the sequences from the two extruding strands together (the stripes) as it generates TADs de novo.  However, there are no vertical stripes in the eve TAD as would be expected if cohesin entered at a few specific sites somewhere within the TAD and extruded loops in opposite directions synchronously, nor are there stripes at 45° as would be expected if it started at nhomie or homie (see Figure Supplemental 1).  We also do not detect cohesin-generated stripes in any of the TADs in between eve and the attP site at -142 kb.  Note that in some models, cohesin is thought to be continuously extruding loops.  After hitting the CTCF roadblocks, cohesin either falls off after a short period and starts again or it breaks through one or more TAD boundaries generating the LDC domains.  In this dynamic model, stripes of crosslinked DNA generated by the passing cohesin complex should be observed throughout the cell cycle.  They are not.
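
      The geometry behind the two expected stripe signatures can be sketched in a few lines.  This is a deliberately simplified illustration, not a simulation of cohesin: it assumes that the two sides of the loop are reeled in at equal, constant speed in the symmetric case, and that one side stays fixed at a boundary in the anchored case.  The function names and the bin coordinates are hypothetical.

```python
def symmetric_extrusion_contacts(load_site, steps):
    """Bin pairs held together when both sides are reeled in at equal speed."""
    return [(load_site - t, load_site + t) for t in range(1, steps + 1)]


def anchored_extrusion_contacts(anchor, steps):
    """Bin pairs held together when one side stays fixed at a boundary."""
    return [(anchor, anchor + t) for t in range(1, steps + 1)]


def to_triangle_view(i, j):
    """Map matrix entry (i, j) to the rotated 'triangle' display.

    x is the midpoint along the chromosome, y is half the separation.
    """
    return ((i + j) / 2, (j - i) / 2)


# Loading inside a TAD at bin 50: every contact has the same midpoint,
# so the points stack into a vertical stripe in the triangle view.
vertical = [to_triangle_view(i, j) for i, j in symmetric_extrusion_contacts(50, 5)]

# Starting from a boundary at bin 10: x and y grow together,
# so the points lie on a 45 degree line rising from the boundary.
diagonal = [to_triangle_view(i, j) for i, j in anchored_extrusion_contacts(10, 5)]

print(vertical)
print(diagonal)
```

      Under these assumptions, the symmetric case yields points that share a single x value, and the anchored case yields points in which y increases by the same amount as x, which are the vertical and 45° stripe signatures referred to above.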

      As for formation versus maintenance, and the possible involvement of cohesin loop extrusion in the former, but not the latter:  This question was indirectly addressed in point #1.2 above.  In this point we described multiple examples of specific boundary:boundary pairing interactions that take place over Mbs, in cis and in trans and even between different chromosomes.  These long-distance interactions don’t preexist;  instead they must be established de novo and then maintained.  This process was actually visualized in the studies of Lim et al. (Lim et al. 2018) on the establishment of trans boundary pairing interactions in NC14 embryos.  There is no conceivable mechanism by which cohesin-based loop extrusion could establish the long or short distance trans interactions that have been documented in many studies on fly boundary elements.  Also as noted above, it seems unlikely that it is necessary for long-range interactions in cis.

      A more plausible scenario is that cohesin entrapment helps to stabilize these long-distance interactions after they are formed.  If this were true, then one could argue that cohesin might also function to maintain TADs after boundaries have physically paired with their neighbors in cis.  However, the Rad21 depletion experiments of Goel et al. (Goel et al. 2023) would rule out an essential role for cohesin in maintaining TADs after boundary:boundary pairing.  In short, while we cannot formally rule out that loop extrusion might help bring sequences closer together to increase their chance of pairing, neither the specificity of that pairing, nor its orientation can be explained by loop extrusion.  Furthermore, since pairing in trans cannot be facilitated by loop extrusion, invoking it as potentially important for boundary-boundary pairing in cis can only be described as a potential mechanism in search of a function, without clear evidence in its favor.

      On the other hand, the apparent loss of contacts between TADs within large multi-TAD neighborhoods (Goel et al. 2023) would suggest that there is some sort of decompaction of neighborhoods after Rad21 depletion.  It is possible that this might stress interactions that span multiple TADs as is the case for homie at -142 kb, or for the other examples described in #1.2 above.  This kind of involvement of cohesin might or might not be associated with a loop extrusion mechanism.

      Future work aimed at analyzing micro-C data in cohesin-depleted cells might shed additional light on this.

      (2.3) This experiment has been done by Goel et al. (Goel et al. 2023) in mammalian tissue culture cells.  They found that TADs, as well as local TAD neighborhoods, are not disrupted/altered by Rad21 depletion (see Goel et al. 2023 and our response to point #1.1 of reviewer #1).

      Additional mechanisms at play include compartment-level interactions driven by chromatin states. Indeed, in mammalian cells, these interactions often manifest as a "plume" on Hi-C maps similar to what the authors attribute to boundary interactions in this manuscript. How do the chromatin states in the neighboring domains of the eve locus impact the model if at all?

      (2.4) Chromatin states have been implicated in driving compartment level interactions. 

      Compartments as initially described were large, often Mb sized, chromosomal segments that “share” similar chromatin marks/states, and are thought to merge via co-polymer segregation.  They were visualized using large multi-kb bin sizes.  In the studies reported here, we use bin sizes of 200 bp to examine a DNA segment of less than 200 kb which is subdivided into a dozen or so small TADs.  Several of the TADs contain more than one transcription unit, and they are expressed in quite different patterns, and thus might be expected to have different “chromatin states” at different points in development and in different cells in the organism. However, as can be seen by comparing the MicroC patterns in our paper that are shown in Fig. 2 with Fig. 7, Figure Supplemental 5 and Figure Supplemental 6, the TAD organization in NC14 and 12-16 hr embryos is for the most part quite similar.  There is no indication that these small TADs are participating in liquid phase compartmentalization that depends upon shared chromatin/transcriptional states in NC14 and then again in 12-16 hr embryos. 

      In NC14 embryos, eve is expressed in 7 stripes, while it is potentially active throughout much of the embryo.  In fact, the initial pattern in early cycles is quite broad and is then refined during NC14.  In 12-16 hr embryos, the eve gene is silenced by the PcG system in all but a few cells in the embryo.  However, here again the basic structure of the TAD, including the volcano plume, looks quite similar at these different developmental stages.  

      As for the suggestion that the plume topping the eve volcano triangle is generated because the TADs flanking the eve TAD share chromatin states and coalesce via some sort of phase separation:

      This model has been tested directly in Ke et al. (Ke et al. 2024).  In Ke et al., we deleted the nhomie boundary and replaced it with either nhomie in the reverse orientation or homie in the forward orientation.  According to the compartment model, changing the orientation of the boundaries so that the topology of the eve TAD changes from a stem-loop to a circle-loop should have absolutely no effect on the plume topping the eve volcano triangle.  The TADs flanking the eve TAD would still be expected to share the same chromatin states and would still be able to coalesce via phase transition.  However, this is not what is observed.  The plume disappears and is replaced by “clouds” on both sides of the eve TAD. The clouds arise because the eve TAD bumps into the neighboring TADs when the topology is a circle-loop.  

      We would also note that “compartment-level” interactions would not explain the findings presented in Muller et al. 1999, in Table 1 or in Author response image 4.  It is clear that the long-distance (Mb) interactions observed for Mcp, gypsy, Fab-7, homie, nhomie and the blue and purple boundaries in Author response image 4 arise by the physical pairing of TAD boundary elements.  This fact is demonstrated directly by the MicroC experiments in Fig. 7 and Fig Supplemental 4 and 5, and by the MicroC and dHS-C experiments in Author response image 4.  There is no evidence for any type of “compartment/phase separation” driving these specific boundary pairing interactions.

      In fact, given the involvement of TAD boundaries in meta-loop formation, one might begin to wonder whether some of the “compartment level interactions” are generated by the specific pairing of TAD boundary elements rather than by “shared chromatin” states.  For example, the head-to-head pairing of the blue and purple boundaries generates a Lar meta-loop that has a circle-loop topology.  As a consequence, sequences upstream of the blue and purple boundaries come into contact, generating the small dark rectangular box on the upper left side of the contact map.  Sequences downstream of the blue and purple boundaries also come into contact, and this generates the larger rectangular box in the lower right side of the contact map.  A new figure, Fig. 9, shows that the interaction pattern flips (lower left and top right) when the meta-loop has a stem-loop topology.  If these meta-loops are visualized using larger bin sizes, the classic “compartment” patchwork pattern of interactions emerges.  Would the precise patchwork pattern of “compartmental” interactions involving the four distant TADs that are linked in the two meta-loops shown in Fig. 9 persist as is if we deleted one of the TAD boundaries that forms the meta-loop?  Would the precise patchwork pattern persist if we inverted one of the meta-loop boundaries so that we converted the topology of the loop from a circle-loop to a stem-loop or vice versa?  We haven’t used MicroC to compare the compartment organization after deleting or inverting a meta-loop TAD boundary; however, a comparison of the MicroC pattern in WT in Fig. 1C with that for the homie transgenes in Fig. 7 and Figs. Supplemental 5, 6 and 7 indicates a) that novel patterns of TAD:TAD interactions are generated by this homie-dependent mini-meta-loop and b) that the patterns of TAD:TAD interactions depend upon loop topology.  Were these novel TAD:TAD interactions generated instead by compartment level interactions/shared chromatin states, they should be evident in WT as well (Fig. 1).  They are not.

      How does intrachromosomal homolog pairing impact the models proposed in this manuscript (Abed et al. 2019; Erceg et al. 2019)? Several papers recently have shown that somatic homolog pairing is not uniform and shows significant variation across the genome with evidence for both tight pairing regions and loose pairing regions. Might loose pairing interactions have the capacity to alter the cis configuration of the eve locus?

      (2.5) At this point it is not entirely clear how homolog pairing impacts the cis configuration/MicroC contact maps.  We expect that homolog pairing is incomplete in the NC14 embryos we analyzed;  however, since replication of eve and the local neighborhood is likely complete, sister chromosomes should be paired.  So we are likely visualizing the 3D organization of paired TADs.

      In summary, the transgenic experiments are extensive and elegant and fully support the authors' models. However, in my opinion, they do not completely rule out additional models at play, including extrusion-based mechanisms. Indeed, my major issue is the limited conceptual advance in this manuscript. The authors essentially repeat many of their previous work and analyses.

      (2.6) In our view, the current paper makes a number of significant contributions that go well beyond those described in our 2016 publication.  These are summarized below.

      A) While our 2016 paper used transgenes inserted in the -142 kb attP site to study pairing interactions of homie and nhomie, we neither considered nor discussed how our findings might bear on the loop extrusion model.  However, since the loop extrusion model is currently accepted as established fact by many labs working on chromosome structure, it is critically important to devise experimental approaches which test the predictions of this particular model.  One approach would be to deplete cohesin components; however, as discussed in #1.1, our experimental system is not ideal for this type of approach.  On the other hand, there are other ways to test the extrusion model.  Given the mechanism proposed for TAD formation (extruding a loop until cohesin bumps into CTCF/boundary road blocks), it follows that only two types of loop topologies are possible: stem-loop and unanchored loop.  The loop extrusion model, as currently conceived, can’t account for the two cases in this study in which the reporter on the wrong side of the homie boundary from the eve locus is activated by the eve enhancers.  In contrast, our findings are completely consistent with orientation-specific boundary:boundary pairing.

      B) In the loop extrusion model, cohesin embraces both of the extruded chromatin fibers, transiently bringing them into close proximity.  As far as we know, there have been no (high resolution) experiments that have actually detected these extruding cohesin complexes during TAD formation.  In order to have a chance of observing the expected signatures of extruding cohesin complexes, one would need a system in which TADs are being formed.  As described in the text, this is why we used MicroC to analyze TADs in NC14 embryos.  We do not detect the signature stripes that would be predicted (see Figure Supp 2) by the current version of the loop extrusion model.

      C) Reporter expression in the different -142 kb transgenes provides only an indirect test of the loop extrusion and boundary:boundary pairing models for TAD formation.  The reporter expression results need to be confirmed by directly analyzing the pattern of physical interactions in each instance.  While we were able to detect contacts between the transgenes and eve in our 2016 paper, the 3C experiments provided no information beyond that.  By contrast, the MicroC experiments in the current paper give high resolution maps of the physical contacts between the transgene and the eve TAD.  The physical contacts track completely with reporter activity.  Moreover, just as is the case for reporter activity, the observed physical interactions are inconsistent with the loop extrusion model.

      D) Genetic studies in Muller et al. (Muller et al. 1999) and imaging in Vazquez et al. (Vazquez et al. 2006) suggested that more than two boundaries can participate in pairing interactions.  Consistent with these earlier observations, viewpoint analysis indicates the transgene homie interacts with both eve boundaries.  While this could be explained by transgene homie alternating between nhomie and homie in the eve locus, this would require the remodeling of the eve TAD each time the pairing interaction switched between the three boundary elements.  Moreover, two out of the three possible pairing combinations would disrupt the eve TAD, generating an unanchored loop (cf. the lambda DNA TAD in Ke et al. 2024).  However, the MicroC profile of the eve TAD is unaffected by transgenes carrying the homie boundary.  This would suggest that, like Mcp, the pairing interactions of homie and nhomie might not be exclusively pairwise.  In this context it is interesting to compare the contact profiles of the lar meta-loop shown in Author response image 4 with the different -142 kb homie inserts.  Unlike the homie element at -142 kb, there is clearly only a single point of contact between the blue and purple boundaries.

      E) Chen et al. (Chen et al. 2018) used live imaging to link physical interactions between a homie-containing transgene inserted at -142 kb and the eve locus to reporter activation by the eve enhancers.  They found that the reporter was activated by the eve enhancers only when it was in “close proximity” to the eve gene.  “Close proximity” in this case was 331 nm.  This distance is equivalent to ~1.1 kb of linear duplex B form DNA, or ~30 nucleosome core particles lined up in a row.  It would not be possible to ligate two DNAs wrapped around nucleosome core particles that are located 330 nm apart in a fixed matrix.  Since our MicroC experiments were done on embryos in which the gene is silent in the vast majority of cells, it is possible that the homie transgene only comes into close enough proximity for transgene nucleosome:eve nucleosome ligation events when the eve gene is off.  Alternatively, and clearly more likely, distance measurements using imaging procedures that require dozens of fluorescent probes may artificially inflate the distance between sequences that are actually close enough for enzymatic ligation.

      F) The findings reported in Goel et al. (Goel et al. 2023) indicate that mammalian TADs don’t require cohesin activity; however, the authors do not provide an alternative mechanism for TAD formation/stability.  Here we have suggested a plausible mechanism.
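
      The length equivalences quoted in point E can be checked with simple arithmetic.  The sketch below is only a back-of-the-envelope illustration: the rise per base pair of B-form DNA (0.34 nm) and the nucleosome core particle diameter (about 11 nm) are assumed textbook values, not numbers taken from the manuscript, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the length equivalences quoted in point E.
# Both constants are assumed textbook values, not measurements from the manuscript.
RISE_NM_PER_BP = 0.34    # axial rise per base pair of B-form DNA (assumed)
NCP_DIAMETER_NM = 11.0   # approximate nucleosome core particle diameter (assumed)


def linear_dna_bp(distance_nm: float) -> float:
    """Base pairs of straight B-form duplex needed to span distance_nm."""
    return distance_nm / RISE_NM_PER_BP


def nucleosomes_in_row(distance_nm: float) -> float:
    """Number of core particles laid side by side that span distance_nm."""
    return distance_nm / NCP_DIAMETER_NM


if __name__ == "__main__":
    d = 331.0  # "close proximity" distance from Chen et al. (2018), in nanometres
    print(f"{linear_dna_bp(d) / 1000:.2f} kb of B-form DNA")
    print(f"{nucleosomes_in_row(d):.0f} core particles in a row")
```

      With these assumed constants the distance corresponds to roughly 1 kb of straight duplex and roughly 30 core particles, the same order of magnitude as the figures given in point E.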

      The authors make no attempt to dissect the mechanism of this process by modifying extrusion components directly.

      (2.7) See point #1.1

      Some discussion of Rollins et al. on the discovery of Nipped-B and its role in enhancer-promoter communication should also be made to reconcile their conclusions in the proposed absence of extrusion events.

      (2.8) The reason why reducing Nipped-B activity enhances the phenotypic effects of gypsy-induced mutations is not known at this point; however, the findings reported in Rollins et al. (Rollins et al. 1999) would appear to argue against an extrusion mechanism for TAD formation.

      Given what we know about enhancer blocking and TADs, there are two plausible mechanisms for how the Su(Hw) element in the gypsy transposon blocks enhancer-promoter interactions in the gypsy-induced mutants studied by Rollins et al.  First, the Su(Hw) element could generate two new TADs through pairing interactions with boundaries in the immediate neighborhood.  This would place the enhancers in one TAD and the target gene in another TAD.  Alternatively, the studies of Sigrist and Pirrotta (Sigrist and Pirrotta 1997) as well as several publications from Victor Corces’ lab raise the possibility that the Su(Hw) element in gypsy-induced mutations is pairing with gypsy transposons inserted elsewhere in the genome.  This would also isolate enhancers from their target genes.  In either case, the loss of Nipped-B activity increases the mutagenic effects of the Su(Hw) element, presumably by strengthening its boundary function.  If this is due to a failure to load cohesin onto chromatin, this would suggest that cohesin normally functions to weaken the boundary activity of the Su(Hw) element, i.e., disrupting the ability of Su(Hw) elements to interact with either other boundaries in the neighborhood or with themselves.  Were this a general activity of cohesin (to weaken boundary activity), one would imagine that cohesin normally functions to disrupt TADs rather than generate/stabilize TADs.

      An alternative model is that Nipped-B (and thus cohesin) functions to stabilize enhancer-promoter interactions within TADs.  In this case, loss of Nipped-B would result in a destabilization of the weak enhancer:promoter interactions that can still be formed when gypsy is located between the enhancer and promoter.  In this model the loss of these weak interactions in Nipped-B mutants would appear to increase the “blocking” activity of the gypsy element.  However, this alternative model would also provide no support for the notion that Nipped-B and cohesin function to promote TAD formation.

      Reviewer #3 (Public Review):

      Bing et al. attempt to address fundamental mechanisms of TAD formation in Drosophila by analyzing gene expression and 3D conformation within the vicinity of the eve TAD after insertion of a transgene harboring a Homie insulator sequence 142 kb away in different orientations. These transgenes along with spatial gene expression analysis were previously published in Fujioka et al. 2016, and the underlying interpretations regarding resulting DNA configuration in this genomic region were also previously published. This manuscript repeats the expression analysis using smFISH probes in order to achieve more quantitative analysis, but the main results are the same as previously published. The only new data are the Micro-C and an additional modeling/analysis of what they refer to as the 'Z3' orientation of the transgenes. The rest of the manuscript merely synthesizes further interpretation with the goal of addressing whether loop extrusion may be occurring or if boundary:boundary pairing without loop extrusion is responsible for TAD formation. The authors conclude that their results are more consistent with boundary:boundary pairing and not loop extrusion; however, most of this imaging data seems to support both loop extrusion and the boundary:boundary models. This manuscript lacks support, especially new data, for its conclusions.

      (3.1) The new results/contributions of our paper are described in #2.6 above. 

      Although there are (two) homie transgene configurations that give expression patterns that would be consistent with the loop extrusion model, that is not quite the same as strong evidence supporting loop extrusion.  On the contrary, key aspects of the expression data are entirely inconsistent with loop extrusion, and they thus rule out the possibility that loop extrusion is sufficient to explain the results.  Moreover, the conclusions drawn from the expression patterns of the four transgenes are backed up by the MicroC contact profiles, profiles that are also not consistent with the loop extrusion model.  Further, as documented above, loop extrusion is unable to explain not only the findings reported in this manuscript, but also the results from a large collection of published studies on fly boundaries.  Since all of these boundaries function in TAD formation, there is little reason to think that loop extrusion makes a significant contribution at the TAD level in flies.  Given the results reported by Goel et al. (Goel et al. 2023), one might also have doubts about the role of loop extrusion in the formation/maintenance of mammalian TADs.

      To further document these points, we’ve included a new figure (Fig. 9) that shows two meta-loops.  Like the loops seen for homie-containing transgenes inserted at -142 kb, meta-loops are formed by the pairing of distant fly boundaries.  As only two boundaries are involved, the resulting loop topologies are simpler than those generated when transgene homie pairs with nhomie and homie in the eve locus.  The meta-loop in panel B is a stem-loop.  While a loop with this topology could be formed by loop extrusion, cohesin would have to break through dozens of intervening TAD boundaries and then somehow know to come to a halt at the blue boundary on the left and the purple boundary on the right.  However, none of the mechanistic studies on either cohesin or the mammalian CTCF roadblocks have uncovered activities of either the cohesin complex or the CTCF roadblocks that could explain how cohesin would be able to extrude hundreds of kb and ignore dozens of intervening roadblocks, and then stop only when it encounters the two boundaries that form the beat-IV meta-loop.  The meta-loop in panel A is even more problematic in that it is a circle-loop, a topology that can’t be generated by cohesin extruding a loop until it comes into contact with CTCF roadblocks on the extruded strands.

      Furthermore, there are many parts of the manuscript that are difficult to follow. There are some minor errors in the labelling of the figures that if fixed would help elevate understanding. Lastly, there are several major points that if elaborated on, would potentially be helpful for the clarity of the manuscript.

      Major Points:

      (1) The authors suggest and attempt to visualize in the supplemental figures, that loop extrusion mechanisms would appear during crosslinking and show as vertical stripes in the micro-C data. In order to see stripes, a majority of the nuclei would need to undergo loop extrusion at the same rate, starting from exactly the same spots, and the loops would also have to be released and restarted at the same rate. If these patterns truly result from loop extrusion, the authors should provide experimental evidence from another organism undergoing loop extrusion.

      (3.2) We don’t know of any reports that actually document cohesin extrusion events that are forming TADs (TADs as defined in our paper, in the RCMC experiments of Goel et al. (Goel et al. 2023), in response #1.1, or in the high-resolution images from the MicroC data of Krietenstein et al. (Krietenstein et al. 2020) and Hsieh et al. (Hsieh et al. 2020)).  However, an extruding cohesin complex would be expected to generate stripes because it transiently brings together the two chromatin strands, as illustrated by the broken zipper in Figure Supplemental 2 of our paper.  While stripes generated by cohesin forming a TAD have not to our knowledge ever been observed, Fig. 4 in Goel et al. (Goel et al. 2023) shows 45° stripes outlining TADs and connecting neighboring TADs.  These stripes are visible with or without Rad21.

      In some versions of the loop extrusion model, cohesin extrudes a loop until it comes to a halt at both boundaries, where it then remains, holding the loop together.  In this model, the extrusion event would occur only once per cell cycle.  This is the reason we selected NC14 embryos, as this point in development should provide by far the best opportunity to visualize cohesin-dependent TAD formation.  However, the expected stripes generated by cohesin embrace of both strands of the extruding loop were not evident.  Other, newer versions of the loop extrusion model are much more dynamic: cohesin extrudes the loop, coming to a halt at the two boundaries, but either doesn’t remain stably bound or breaks through one or both boundaries.  In the former case, the TAD needs to be reestablished by another extrusion event, while in the latter case LDC domains are generated.  In this dynamic model, we should also be able to observe vertical and 45° stripes (or stripes leaning to one side or another of the loading site if the extrusion rates aren’t equal on both fibers) in NC14 embryos corresponding to the formation of TADs and LDC domains.  However, we don’t.

      (2) On lines 311-314, the authors discuss that stem-loops generated by cohesin extrusion would possibly be expected to have more next-next-door neighbor contacts than next-door neighbor contacts and cite their models in Figure 1. Based on the boundary:boundary pairing models in the same figure, would the stem-loops created by head-to-tail pairing also have the same phenotype, making enrichment of next-next-door neighbor contacts possible in both situations? The concepts in the text are not clear, and the diagrams are not well-labeled relative to the two models.

      (3.3) Yes, we expect that stem-loops formed by cohesin extrusion or head-to-tail pairing would behave in a similar manner.  They could be stem-loops separated by unanchored loops as shown in Fig. 1B and E.  Alternatively, adjacent loops could be anchored to each other (by cohesin/CTCF roadblocks or by pairing interactions) as indicated in Fig. 1C and F.  In stem-loops generated either by cohesin extrusion or by head-to-tail pairing, next-next-door neighbors should interact with each other, generating a plume above the volcano triangle.  In the case of circle-loops, the volcano triangle should be flanked by clouds that are generated when the TAD bumps into both next-door neighbors.  In the accompanying paper, we test this idea by deleting the nhomie boundary and then a) inserting nhomie back in the reverse orientation, or b) inserting homie in the forward orientation.  The MicroC patterns fit with the predictions that were made in this paper.

      (3) The authors appear to cite Chen et al., 2018 as a reference for the location of these transgenes being 700 nm away in a majority of the nuclei. However, the exact transgenes in this manuscript do not appear to have been measured for distance. The authors could do this experiment and include expression measurements.

      (3.4) The transgenes used in Chen et al. are modified versions of a transgene used in Fujioka et al. (2016) inserted into the same attP site.  When we visualize reporter transcription in NC14 embryos driven by the eve enhancers using smFISH, HCR-FISH or DIG, only a subset of the nuclei at this stage are active.  The number of active nuclei we detect is similar to that observed in the live imaging experiments of Chen et al.  The reason we cited Chen et al. (Chen et al. 2018) was that they found that proximity was a critical factor in determining whether the reporter was activated or not in a given nucleus.  The actual distance they measured wasn’t important.  Moreover, as we discussed in response #2.6 above, there are good reasons to think that the “precise” distances measured in live imaging experiments like those used in Chen et al. are incorrect.  However, their statements are certainly correct if one considers that a distance of ~700 nm or so is “more distant” relative to a distance of ~300 nm or so, which is “closer.”

      (4) The authors discuss the possible importance of CTCF orientation in forming the roadblock to cohesin extrusion and discuss that Homie orientation in the transgene may impact Homie function as an effective roadblock. However, the Homie region inserted in the transgene does not contain the CTCF motif. Can the authors elaborate on why they feel the orientation of Homie is important in its ability to function as a roadblock if the CTCF motif is not present? Trans-acting factors responsible for Homie function have not been identified and this point is not discussed in the manuscript.

      We discussed the “importance” of CTCF orientation in forming roadblocks because one popular version of the cohesin loop extrusion/CTCF roadblock model postulates that CTCF must be oriented so that the N-terminus of the protein is facing towards the oncoming cohesin complex, otherwise it won’t be able to halt extrusion on that strand.  When homie in the transgene is pointing towards the eve locus, the reporter on the other side (farther from eve) is activated by the eve enhancers.  One possible way to explain this finding (if one believes the loop extrusion model) is that when homie is inverted, it can’t stop the oncoming cohesin complex, and it runs past the homie boundary until it comes to a stop at a properly oriented boundary farther away.  In this case, the newly formed loop would extend from the boundary that stopped cohesin to the homie boundary in the eve locus, and would include not only the distal reporter, but also the proximal reporter.  If both reporters are in the same loop with the eve enhancers (which they would have to be given the mechanism of TAD formation by loop extrusion), both reporters should be activated.  They are not.

      For the boundary pairing model, the reporter that will be activated will depend upon the orientation of the pairing interaction, which can be either head-to-head or head-to-tail (or both: see discussion of LBC elements in #2.1).  For an easy visualization of how the orientation of pairing interactions is connected to the patterns of interactions between sequences neighboring the boundary, please look at Fig. 9.  This figure shows two different meta-loops.  In panel A, head-to-head pairing of the blue and purple boundaries brings together, on the one hand, sequences upstream of the blue and purple boundaries, and on the other hand, sequences downstream of the blue and purple boundaries.  In the circle-loop configuration, the resulting rectangular boxes of enhanced contact are located in the upper left and lower right of the contact map.  In panel B, the head-to-tail pairing of the blue and purple boundaries changes how sequences upstream and downstream of the blue and purple boundaries interact with each other.  Sequences upstream of the blue boundary interact with sequences downstream of the purple boundary, and this gives the rectangular box of enhanced interactions on the top right.  Sequences downstream of the blue boundary interact with sequences upstream of the purple boundary, and this gives the rectangular box of enhanced contact on the lower left.

      CTCF: Our analysis of the homie boundary suggests that CTCF contributes little to its activity.  It has an Su(Hw) recognition sequence and a CP190 “associated” sequence.  Mutations in both compromise boundary activity (blocking and -142 kb pairing).  Gel shift experiments and ChIP data indicate there are half a dozen or more additional proteins that associate with the 300 bp homie fragment used in our experiments.

      Orientation of CTCF or other protein binding sites:  The available evidence suggests that orientation of the individual binding sites is not important (Kyrchanova et al. 2016; Lim et al. 2018).  Instead, it is likely that the order of binding sites affects function.

      (5) The imaging results seem to be consistent with both boundary:boundary interaction and loop extrusion stem looping.

      It is not clear whether the reviewer is referring to the different patterns of reporter expression— which clearly don’t fit with the loop extrusion model in the key cases that distinguish the two models—or the live imaging experiments in Chen et al. (Chen et al. 2018).

      (6) The authors suggest that the eveMa TAD could only be formed by extrusion after the breakthrough of Nhomie and several other roadblocks. Additionally, the overall long-range interactions with Nhomie appear to be less than the interactions with endogenous Homie (Figures 7, 8, and supplemental 5). Is it possible that in some cases boundary:boundary pairing is occurring between only the transgenic Homie and endogenous Homie and not including Nhomie?

      Yes, it is possible.  On the other hand, the data that are currently available support the idea that transgene homie usually interacts with endogenous homie and nhomie at the same time.  This is discussed in #2.6D above.  The viewpoints indicate that crosslinking occurs more frequently to homie than to nhomie.  This could indicate that when there are only pairwise interactions, these tend to be between homie and homie.  Alternatively, this could also be explained by a difference in relative crosslinking efficiency.

      (7) In Figure 4E, the GFP hebe expression shown in the LhomieG Z5 transgenic embryo does not appear in the same locations as the LlambdaG Z5 control. Is this actually hebe expression or just a background signal?

      The late-stage embryos shown in E are oriented differently.  For GlambdaL, the embryo is oriented so that hebe-like reporter expression on the ventral midline is readily evident.  However, this orientation is not suitable for visualizing eve enhancer-dependent expression of the reporters in muscle progenitor cells.  For this reason, the 12-16 hr GeimohL embryo in E is turned so that the ventral midline isn’t readily visible in most of the embryo.  As is the case in NC14 embryos, the eve enhancers drive lacZ but not gfp expression in the muscle progenitor cells.

      (8) Figure 6- The LhomieG Z3 (LeimohG) late-stage embryo appears to be showing the ventral orientation of the embryo rather than the lateral side of the embryo as was shown in the previous figure. Is this for a reason? Additionally, there are no statistics shown for the Z3 transgenic images.

      Were these images analyzed in the same way as the Z5 line images?

      The LeimohG embryo was turned so that the hebe enhancer-dependent expression of lacZ is visible.  While the eve enhancer-dependent expression of lacZ in the muscle progenitor cells isn’t visible with this orientation, eve enhancer-dependent expression in the anal plate is.

      (9) Do the Micro-C data align with the developmental time points used in the smFISH probe assays?

      The MicroC data aligns with the smFISH images of older embryos: 12-14 hour embryos or stages 14-16.  

      Recommendations for the authors:   

      Reviewer #1 (Recommendations For The Authors):

      This was a difficult paper to review. It took me several hours to understand the terminology and back and forth between different figures to put it together. It might be useful to put the loop models next to the MicroC results and have a cartoon way of incorporating which enhancers are turning on which reporters.

      I also found the supercoiled TAD models in Figure 1 not useful. These plectoneme-type of structures likely do not exist, based on the single-cell chromosome tracing studies, and the HiC structures not showing perpendicular to diagonal interactions between the arms of the plectonemes.

      We wanted to represent the TAD as a coiled 30 nm fiber, as it is not likely to resemble the large loops shown in Fig. 1 A, D, and G.

      There are no stripes emerging from homies, which is consistent with the pairing model, but there seem to be stripes from the eve promoter. I think these structures may be a result of both the underlying loop extruders + pairing elements.

      There are internal structures in the eve TAD that link the upstream region of the eve promoter to the eve PRE and sequences in nhomie.  All three of these sequences are bound by LBC.  Each of the regulatory domains in BX-C also has LBC elements and, as shown in Author response image 1, you can see stripes connecting some of these LBC elements to each other.  Since the stripes that Goel et al. (Goel et al. 2023) observed in their RCMC analysis of Ppm1g didn’t require cohesin, it isn’t clear how these stripes are generated (active: e.g., a chromatin remodeler; or passive: e.g., the LBC complex has non-specific DNA binding activity that can be readily crosslinked as the chromatin fiber slides past).

      The authors say there are no TADs that have "volcano plumes" but the leftmost TAD TA appears to have one. What are the criteria for calling the plumes? I am also not clear why there is a stripe off the eve volcano. It looks like homie is making a "stripe" loop extrusion type of interaction with the next TAD up. Is this maybe cohesin sliding off the left boundary?

      The reviewer is correct, the left-most TAD TA appears to have a plume.  We mentioned TA seems to have a plume in the original text, but it was inadvertently edited out.

      Two different types of TAD↔TAD interactions are observed.  In the case of eve, the TADs to either side of eve interact more frequently with each other than they do with eve.  This generates a “plume” above the eve volcano triangle.  The TADs that comprise the Abd-B regulatory domains (see Author response image 1) are surrounded by clouds of diminishing intensity.  Clouds at the first level represent interactions with both next-door neighbors; clouds at the second level represent interactions with both next-next-door neighbors; clouds at the third level represent interactions with next-next-next-door neighbors.  The Abd-B TADs are close to the same size, so that interactions with neighbors are relatively simple.  However, this is not always the case.  When there are smaller TADs near larger TADs the pattern of interaction can be quite complicated.  An example is indicated by the red bar in Author response image 2.

      The authors state "In the loop-extrusion model, a cohesin complex initiating loop extrusion in the eve TAD must break through the nhomie roadblock at the upstream end of the eve TAD. It must then make its way past the boundaries that separate eve from the attP site in the hebe gene, and come to a halt at the homie boundary associated with the lacZ reporter." Having multiple loops formed by cohesin would also bring in the 142kb apart reporter and homie. Does cohesin make 140 kb long loops in flies?

      A mechanism in which cohesin brings the reporter close to the eve TAD by generating many smaller loops (which would be the intervening TADs) was discussed in #1.2.

      Figure 5 title mistakes the transgene used?

      Fixed.

      In figure 6, the orientation of the embryos does not look the same for the late-stage panels. So it was difficult to tell if the eve enhancer was turning the reporter on.

      Here we were focusing mainly on the AP enhancer activation of the reporter, as this is most easily visualized.  It should be clear from the images that the appropriate reporter is activated by the AP enhancer for each of the transgene inserts.

      It is not clear to me why the GFP makes upstream interactions (from the 4C viewpoint) in GhomieLZ5 but not in LhomieGZ5? Corresponding interactions for Fig Supp 5 & 6 are not the same. That is, LacZ in the same place and with the same homie orientation does not show a similar upstream enrichment as the GFP reporter does.

      We are uncertain as to whether we understand this question/comment.  In GhomieLZ5 (now GhomieL), the lacZ reporter is on the eve side of the homie boundary while gfp is on the hebe enhancer side of the homie boundary.  Since homie is pointing away from gfp, pairing interactions with homie and nhomie in the eve locus bring the eve enhancers in close proximity with the gfp reporter.  This is what is seen in Fig. 7 panel D, lower trace.  In LhomieGZ5 (now GeimohL) the lacZ reporter is again on the eve side of the homie boundary while gfp is on the hebe enhancer side of the homie boundary.  However, in this case homie is inverted so that it points away from lacZ (towards gfp).  In this orientation, pairing brings the lacZ reporter into contact with the eve enhancers.  This is what is seen in the upper trace in Fig. 7 panel D.

      The orientation of the transgene is switched in Fig. Supp 5 and 6.  For these “Z3” transgenes (now called LeimohG and LhomieG), the gfp reporter is on the eve side of homie while the lacZ reporter is on the hebe enhancer side of homie.  The interactions between the reporters and eve are determined by the orientation of homie in the transgene.  When homie is pointing away from gfp (as in LeimohG), gfp is activated and that is reflected in the trace in Supp Fig. 5.  When homie is pointing away from lacZ, lacZ is activated and this is reflected (though not as cleanly as in other cases) in the trace in Supp Fig. 6.
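
      The orientation rule described in this response can be summarized as a small lookup.  The sketch below is purely illustrative bookkeeping: the dictionary layout and function name are hypothetical, and it assumes that "homie points away from one reporter" is equivalent to "homie points toward the other reporter" in a two-reporter transgene.

```python
# Hypothetical bookkeeping for the four transgenes described in the response.
# For each transgene: the reporter on the eve side of homie, the reporter on
# the hebe-enhancer side, and the reporter that homie points toward
# (assumed to be the one it does not point away from).
TRANSGENES = {
    "GhomieL": {"eve_side": "lacZ", "hebe_side": "gfp", "points_toward": "lacZ"},
    "GeimohL": {"eve_side": "lacZ", "hebe_side": "gfp", "points_toward": "gfp"},
    "LeimohG": {"eve_side": "gfp", "hebe_side": "lacZ", "points_toward": "lacZ"},
    "LhomieG": {"eve_side": "gfp", "hebe_side": "lacZ", "points_toward": "gfp"},
}


def reporter_near_eve_enhancers(name: str) -> str:
    """Return the reporter that homie points away from in the named transgene.

    Per the response text, this is the reporter that pairing with the eve
    boundaries brings into contact with the eve enhancers.
    """
    t = TRANSGENES[name]
    reporters = {t["eve_side"], t["hebe_side"]}
    (activated,) = reporters - {t["points_toward"]}
    return activated


if __name__ == "__main__":
    for name in TRANSGENES:
        print(name, "->", reporter_near_eve_enhancers(name))
```

      In this bookkeeping the predicted reporter depends only on the orientation of homie, not on which side of the boundary the reporter sits, which is the point made for both the Z5 and Z3 inserts.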

      I did not see a data availability statement. Is the data publicly available? The authors also should consider providing the sequences of the insertions, or provide the edited genomes, in case other researchers would like to analyze the data.

      Data have been deposited.

      Reviewer #3 (Recommendations For The Authors):

      Minor Points:

      (1) There is an inconsistency in the way that some of the citations are formatted. Some citations have 'et al' italicized while others do not. It seems to be the same ones throughout the manuscript. Some examples: Chetverina et al 2017, Chetverina et al 2014, Cavalheiro et al 2021, Kyrchanova et al 2008a, Muravyova et al 2001.

      Fixed

      (2) Pita is listed twice in line 48.

      Fixed

      (3) Line 49, mod(mdg4)67.2 is written just as mod(mdg4). The isoform should be indicated.

      This refers to all Mod isoforms.

      (4) Homie and Nhomie are italicized throughout the manuscript and do not need to be.

      This is the convention used previously.  

      (5) The supplemental figure captions 1 and 2 in the main document are ordered differently than in the supplemental figures file. This caused it to look like the figures are being incorrectly cited in lines 212-214 and 231-232.

      Fixed

      (6) Is the correct figure being cited in line 388-389? The line cites Figure 6E when mentioning LlambdaG Z5; however, LlambdaG Z5 is not shown in Figure 6.

      Fixed

      (7) Section heading 'LhomieG Z5 and GhomieL Z5' could be renamed for clarity. GhomieL Z5 results are not mentioned until the next section, named 'GhomieL Z5'.

      Fixed

      (8) Can the authors provide better labeling for control hebe expression? This would help to determine what is hebe expression and what is background noise in some of the embryos in Figures 4-6.

      Author response image 5 shows expression of the lacZ reporter in GeimohL and GlambdaL.  For the GlambdaL transgene, the hebe enhancers drive lacZ expression in 12-16 hr embryos.  Note that lacZ expression is restricted to a small set of quite distinctive cells along the ventral midline.  lacZ is also expressed on the ventral side of the GeimohL embryo (top panel).  However, the locations of these lacZ-positive cells are quite different from those in the GlambdaL transgene embryo.  These cells are displaced from the midline, and are arranged as pairs of cells in each hemisegment, locations that correspond to eve-expressing cells in the ventral nerve cord.  The eve enhancers also drive lacZ expression elsewhere in the GeimohL embryo, including the anal plate and dorsal muscle progenitor cells (seen most clearly in the lower left panel).

      Author response image 5.

      lacZ expression in GeimohL and GlambdaL embryos

      (9) The Figure 5 title is labeled with the wrong transgene.

      Fixed

      (10) Heat map scales are missing for Figures 7, supplemental 5, and supplemental 6.

      Fixed

      (11) Did the authors check if there was a significant difference in the expression of GFP and lacZ from lambda control lines to the Homie transgenic lines?

      Yes.  Statistical analysis added in Table Supplemental #1

      (12) The Figure 7 title references that these are Z3 orientations, however, it is Z5 orientations being shown.

      Fixed

      (13) The virtual 4C data should include an axis along the bottom of the graphs for better clarity. An axis is missing in all 4C figures.

      References:

      Bantignies F, Grimaud C, Lavrov S, Gabut M, Cavalli G. 2003. Inheritance of polycomb-dependent chromosomal interactions in drosophila. Genes Dev. 17(19):2406-2420.

      Batut PJ, Bing XY, Sisco Z, Raimundo J, Levo M, Levine MS. 2022. Genome organization controls transcriptional dynamics during development. Science. 375(6580):566-570.

      Bonchuk A, Boyko K, Fedotova A, Nikolaeva A, Lushchekina S, Khrustaleva A, Popov V, Georgiev P. 2021. Structural basis of diversity and homodimerization specificity of zinc-finger-associated domains in drosophila. Nucleic Acids Res. 49(4):2375-2389.

      Bonchuk AN, Boyko KM, Nikolaeva AY, Burtseva AD, Popov VO, Georgiev PG. 2022. Structural insights into highly similar spatial organization of zinc-finger associated domains with a very low sequence similarity. Structure. 30(7):1004-1015.e1004.

      Chen H, Levo M, Barinov L, Fujioka M, Jaynes JB, Gregor T. 2018. Dynamic interplay between enhancer–promoter topology and gene activity. Nat Genet. 50(9):1296.

      Fedotova AA, Bonchuk AN, Mogila VA, Georgiev PG. 2017. C2h2 zinc finger proteins: The largest but poorly explored family of higher eukaryotic transcription factors. Acta Naturae. 9(2):47-58.

      Foe VE. 1989. Mitotic domains reveal early commitment of cells in drosophila embryos. Development. 107(1):1-22.

      Fujioka M, Mistry H, Schedl P, Jaynes JB. 2016. Determinants of chromosome architecture: Insulator pairing in cis and in trans. PLoS Genet. 12(2):e1005889.

      Galloni M, Gyurkovics H, Schedl P, Karch F. 1993. The bluetail transposon: Evidence for independent cis-regulatory domains and domain boundaries in the bithorax complex. The EMBO Journal. 12(3):1087-1097.

      Goel VY, Huseyin MK, Hansen AS. 2023. Region capture micro-c reveals coalescence of enhancers and promoters into nested microcompartments. Nat Genet. 55(6):1048-1056.

      Hsieh TS, Cattoglio C, Slobodyanyuk E, Hansen AS, Rando OJ, Tjian R, Darzacq X. 2020. Resolving the 3d landscape of transcription-linked mammalian chromatin folding. Mol Cell. 78(3):539-553.e538.

      Ke W, Fujioka M, Schedl P, Jaynes JB. 2024. Chromosome structure ii: Stem-loops and circle-loops. eLife.

      Krietenstein N, Abraham S, Venev SV, Abdennur N, Gibcus J, Hsieh TS, Parsi KM, Yang L, Maehr R, Mirny LA et al. 2020. Ultrastructural details of mammalian chromosome architecture. Mol Cell. 78(3):554-565.e557.

      Kyrchanova O, Ibragimov A, Postika N, Georgiev P, Schedl P. 2023. Boundary bypass activity in the abdominal-b region of the drosophila bithorax complex is position dependent and regulated. Open Biol. 13(8):230035.

      Kyrchanova O, Kurbidaeva A, Sabirov M, Postika N, Wolle D, Aoki T, Maksimenko O, Mogila V, Schedl P, Georgiev P. 2018. The bithorax complex iab-7 polycomb response element has a novel role in the functioning of the fab-7 chromatin boundary. PLoS Genet. 14(8):e1007442.

      Kyrchanova O, Mogila V, Wolle D, Deshpande G, Parshikov A, Cleard F, Karch F, Schedl P, Georgiev P. 2016. Functional dissection of the blocking and bypass activities of the fab-8 boundary in the drosophila bithorax complex. PLoS Genet. 12(7):e1006188.

      Kyrchanova O, Sabirov M, Mogila V, Kurbidaeva A, Postika N, Maksimenko O, Schedl P, Georgiev P. 2019a. Complete reconstitution of bypass and blocking functions in a minimal artificial fab7 insulator from drosophila bithorax complex. Proceedings of the National Academy of Sciences.201907190.

      Kyrchanova O, Wolle D, Sabirov M, Kurbidaeva A, Aoki T, Maksimenko O, Kyrchanova M, Georgiev P, Schedl P. 2019b. Distinct elements confer the blocking and bypass functions of the bithorax fab-8 boundary. Genetics.genetics. 302694.302019.

      Li H-B, Muller M, Bahechar IA, Kyrchanova O, Ohno K, Georgiev P, Pirrotta V. 2011. Insulators, not polycomb response elements, are required for long-range interactions between polycomb targets in drosophila melanogaster. Mol Cell Biol. 31(4):616-625.

      Li X, Tang X, Bing X, Catalano C, Li T, Dolsten G, Wu C, Levine M. 2023. Gaga-associated factor fosters loop formation in the drosophila genome. Mol Cell. 83(9):1519-1526.e1514.

      Lim B, Heist T, Levine M, Fukaya T. 2018. Visualization of transvection in living drosophila embryos. Mol Cell. 70(2):287-296.e286.

      Link N, Kurtz P, O'Neal M, Garcia-Hughes G, Abrams JM. 2013. A p53 enhancer region regulates target genes through chromatin conformations in cis and in trans. Genes Dev. 27(22):2433-2438.

      Mohana G, Dorier J, Li X, Mouginot M, Smith RC, Malek H, Leleu M, Rodriguez D, Khadka J, Rosa P et al. 2023. Chromosome-level organization of the regulatory genome in the drosophila nervous system. Cell. 186(18):3826-3844.e3826.

      Muller M, Hagstrom K, Gyurkovics H, Pirrotta V, Schedl P. 1999. The mcp element from the drosophila melanogaster bithorax complex mediates long-distance regulatory interactions. Genetics. 153(3):1333-1356.

      Postika N, Metzler M, Affolter M, Müller M, Schedl P, Georgiev P, Kyrchanova O. 2018. Boundaries mediate long-distance interactions between enhancers and promoters in the drosophila bithorax complex. PLoS Genet. 14(12):e1007702.

      Rollins RA, Morcillo P, Dorsett D. 1999. Nipped-b, a drosophila homologue of chromosomal adherins, participates in activation by remote enhancers in the cut and ultrabithorax genes. Genetics. 152(2):577-593.

      Samal B, Worcel A, Louis C, Schedl P. 1981. Chromatin structure of the histone genes of d. Melanogaster. Cell. 23(2):401-409.

      Shermoen AW, McCleland ML, O'Farrell PH. 2010. Developmental control of late replication and s phase length. Curr Biol. 20(23):2067-2077.

      Shidlovskii YV, Bylino OV, Shaposhnikov AV, Kachaev ZM, Lebedeva LA, Kolesnik VV, Amendola D, De Simone G, Formicola N, Schedl P et al. 2021. Subunits of the pbap chromatin remodeler are capable of mediating enhancer-driven transcription in drosophila. Int J Mol Sci. 22(6).

      Sigrist CJ, Pirrotta V. 1997. Chromatin insulator elements block the silencing of a target gene by the drosophila polycomb response element (pre) but allow trans interactions between pres on different chromosomes. Genetics. 147(1):209-221.

      Udvardy A, Schedl P. 1984. Chromatin organization of the 87a7 heat shock locus of drosophila melanogaster. J Mol Biol. 172(4):385-403.

      Vazquez J, Muller M, Pirrotta V, Sedat JW. 2006. The mcp element mediates stable long-range chromosome-chromosome interactions in drosophila. Molecular Biology of the Cell. 17(5):2158-2165.

      Wolle D, Cleard F, Aoki T, Deshpande G, Schedl P, Karch F. 2015. Functional requirements for fab-7 boundary activity in the bithorax complex. Mol Cell Biol. 35(21):3739-3752.

    1. Author Response:

      We thank the reviewers for careful reading, acknowledging the strength of our manuscript, and pointing out its weakness, which we will address in the revised version as described below.

      (1) We will supplement our analysis with finer statistical testing, such as cross-validation and a more detailed analysis of the relation between the inferred model and the intrinsic timescales of the system. For the effect of the drug TIMP-1 on the animals, we will first explore the possibility of assessing the results using a multifactor ANOVA test, with the caveat that the distribution of interactions is not Gaussian. We will further test the effect of different group sizes on the significance of our results by considering subgroups of animals in the drug group, and compare the statistics between the (subsampled) drug group and the control group.

      (2) Our manuscript is similar to that of Shemesh et al. in that we both analyze socially interacting mice by constructing maximum entropy models (MEM) of the co-localization patterns of mice. The difference is in the setup and the number of mice (4 mice in Shemesh et al., 10-15 in our work), as we outlined in the manuscript. To further support the argument in our Discussion section about why our results differ, we will fit a MEM with up to triplet interactions to our Eco-HAB mice data, and compare it to our current MEM with up to pairwise interactions using test-set validation or the Bayesian information criterion (BIC).
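
      The BIC comparison named in point (2) can be sketched as follows. This is only a minimal illustration of the standard formula BIC = k ln(n) - 2 ln(L), not the analysis pipeline used for the Eco-HAB data: the log-likelihoods, parameter counts, and sample size below are hypothetical placeholders chosen for the example, and the helper names `bic` and `prefer_lower_bic` are ours.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def prefer_lower_bic(models: dict, n_samples: int):
    """Return the name of the model with the lowest BIC, plus all scores.

    `models` maps a model name to a (log_likelihood, n_params) pair.
    """
    scores = {name: bic(ll, k, n_samples) for name, (ll, k) in models.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical numbers, for illustration only (not taken from the manuscript):
# the triplet model fits slightly better but carries many more parameters.
candidates = {
    "pairwise": (-52000.0, 75),
    "triplet": (-51900.0, 195),
}
best, scores = prefer_lower_bic(candidates, n_samples=20000)
```

      With these placeholder values the extra triplet parameters are not justified by the gain in likelihood, so the pairwise model is preferred; with real data the outcome depends entirely on the fitted likelihoods.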

    2. eLife assessment

      This useful work investigates the social interactions of mice living together in a system of multiple connected cages. The approach is interesting as it uses some of the tools developed in physics to investigate animal behaviour. However, some of the analyses require further scrutiny, leaving the evidence supporting the main claim currently incomplete.

    3. Public Review:

      Summary:

      In this manuscript, Chen et al. investigate the statistical structure of social interactions among mice living together in the ECO-Hab. They use maximum entropy models (MEM) from statistical physics that include individual preferences and pair-wise interactions among mice to describe their collective behavior. They also use this model to track the evolution of these preferences and interactions across time and in one group of mice injected with TIMP-1, an enzyme regulating synaptic plasticity. The main result is that they can explain group behavior (the probability of being together in one compartment) by a MEM that only includes pair-wise interactions. Moreover, the impact of TIMP-1 is to increase the variance of the couplings J_ij, the preference for the compartment containing food, as well as the dissatisfaction triplet index (DTI).

      Strengths:

      The ECO-Hab is a really nice system to ask questions about the sociability of mice and to tease apart sociability from individual preference. Moreover, combining the ECO-Hab with the use of MEM is a powerful and elegant approach that can help statistically characterize complex interactions between groups of mice -- an important question that requires fine quantitative analysis.

      Weaknesses:

      However, there is a risk in interpreting these models. In my view, several of the comparisons established in the current study would require finer and more in-depth analysis to be able to establish firmer conclusions (see below). Also, the current study, which closely resembles previous work by Shemesh et al., finds a different result but does not provide the same quantitative model comparison included there, nor a conclusive explanation of why their results are different. In total, I felt that some of the results required more solid statistical testing and that some of the conclusions of the paper were not entirely justified. In particular, the results from TIMP-1 require proper interaction tests (group x drug) which I couldn't find. This is particularly important when the control group has a smaller N than the drug groups.

    1. Reviewer #2 (Public Review):

      Summary:

      The tubulin subunits that make up microtubules can be posttranslationally modified and these PTMs are proposed to regulate microtubule dynamics and the proteins that can interact with microtubules in many contexts. However, most studies investigating the roles of tubulin PTMs have been conducted in vitro either with purified components or in cultured cells. Lu et al. use CRISPR/Cas9 genome editing to mutate tubulin genes in C. elegans, testing the role of specific tubulin residues on neuronal development. This study is a real tour de force, tackling multiple proposed tubulin modifications and following the resulting phenotypes with respect to neurite outgrowth in vivo. There is a ton of data that experts in the field will likely reference for years to come as this is one of the most comprehensive in vivo analyses of tubulin PTMs in vivo.

      This paper will be very important to the field; however, it would be strengthened if: 1) the authors demonstrated that the mutations they introduced had the intended consequences on microtubule PTMs, 2) the authors explored how the various tubulin mutations directly affect microtubules, and 3) the findings were made generally more accessible to non-C. elegans neurobiologists.

      (1) The authors introduce several mutations to perturb tubulin PTMs. However, it is unclear to what extent the engineered mutations affect tubulin in the intended way, i.e., are the authors sure that the PTMs they want to perturb are actually present in C. elegans? Many of the antibodies used did not appear to be specific and antibody staining was not always impacted in the mutant cases as expected. For example, is there any evidence that S172 is phosphorylated in C. elegans, e.g. from available phosphoproteomic data? Given the significant amount of staining left in the S172A mutant, the antibody seems non-specific in this context and therefore not a reliable readout of whether MTs are actually phosphorylated at this residue. As another example, there is no evidence presented that K252 is acetylated in C. elegans. At the very least, the authors should consider demonstrating the conservation of these residues and the surrounding residues with other organisms where studies have demonstrated PTMs exist.

      (2) Given that the authors have the mutants in hand, it would be incredibly valuable to assess the impact of these mutations on microtubules directly in all cases. MT phenotypes are inferred from neurite outgrowth phenotypes in several cases, the authors should look directly at microtubules and/or microtubule dynamics via EBP-2 when possible OR show evidence that the only way to derive the neurite phenotypes shown is through the inferred microtubule phenotypes. For example, the effect of the acetylation or detyrosination mutants on MTs was not assessed.

      (3) There is a ton of data here that will be important for experts working in this field to dig into, however, for the more general cell biologist, some of the data are quite inaccessible. More cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment. A good example of this issue is demonstrated in Figure 2 and Figure 4:

      - Fig. 2: Please label images with what is being probed in each panel.
      - Fig. 2G is very hard to interpret; a cartoon diagramming what is being observed would be helpful.
      - Lines 182-185: is this referring to your data or to Wu et al.? It is not clear in this paragraph when the authors are describing published work versus their own data presented here.
      - Fig. 2I-2K is not well described. What experiment is being done here? What is dlk-1 and why did you look at this mutant?
      - Figure 4C: this phenotype is hard to interpret. Where is the wt control? Where is the quantification?
      - There are no WT comparison images in Figure 4I, making the quantification difficult to interpret.

      (4) In addition, I am left unconvinced of the negative data demonstrating that MBK does not phosphorylate tubulin. First, the data described in lines 207-211 does not appear to be presented anywhere. Second, RNAi is notoriously finicky in neurons, thus necessitating tissue specific degradation using either the ZF/ZIF-1 or AID/TIR1 systems which both work extremely well in C. elegans. Third, there appears to be increasing S172 phosphorylation in Figure 3 supplement 2 with added MBK-2, but there is no anti-tubulin blot to show equal loading, so this experiment is hard to interpret.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      (1) The manuscript by Lu et al. aims to study the effects of tubulin post-translational modification in C. elegans touch receptor neurons. Authors use gene editing to engineer various predicted PTM mutations in α-tubulin MEC-12 and β-tubulin MEC-7. Authors generate and analyze an impressive battery of mutants in predicted phosphorylation site and acetylation site of β-tubulin MEC-7, K40 acetylation site in α-tubulin MEC-12, enzymatic site of the α-tubulin acetyltransferase MEC-17, and PTM sites in the MEC-12 and MEC-7 C-tails (glutamylation, detyrosination, delta-tubulin). This represents a lot of work, and will appeal to a readership interested in C. elegans touch receptor neurons. The major concern/criticism of this manuscript is whether the introduced mutation(s) directly affects a specific PTM or whether the mutation affects gene expression, protein expression/stability/localization, etc. As such, this work does convincingly demonstrate, as stated in the title, that "Editing of endogenous tubulins reveals varying effects of tubulin posttranslational modifications on axonal growth and regeneration."

      We thank the reviewer for the constructive comments. With regard to the major concern or criticism, we would like to point out that we have previously characterized ~100 missense mutations in mec-7 and mec-12 (Zheng et al., 2017, PMID: 28835377; Lee et al., 2021, PMID: 33378215). So, we are familiar with the phenotypes associated with mutations that affect gene expression or protein stability, which mostly result in a null phenotype. When analyzing the PTM site mutants, we compared their phenotypes with the previously categorized phenotypes of null alleles, neomorphic mutations that increase microtubule stability, and antimorphic mutations that prevent polymerization or disrupt microtubule stability. For example, in the case of mec-7 S172 mutations, we found that S172P mutants had the same phenotype as the mec-7 knockout (mild neurite growth defects), suggesting that S172P likely affects protein folding or stability, resulting in the loss of MEC-7. In contrast, S172A and S172E mutations showed phenotypes similar to neomorphic alleles (the emergence of ectopic ALM posterior neurite) and antimorphic alleles (the severe shortening of all neurites in the TRNs), respectively. These phenotypic differences suggested to us that the effects of S172A and S172E mutations cannot be simply attributed to the loss of protein expression and stability. Similar logic was applied to the studies of other PTM-inactivating or -mimicking mutations.

      (2) For example, the authors manipulate the C-terminal tail of MEC-12 and MEC-7, to test the idea that polyglutamylation may be an important PTM. These mutants displayed subtle phenotypes. The authors show that branch point GT335 and polyglutamylation polyE recognizing antibodies stain cultured embryonic touch receptor neurons (TRNs), but did not examine staining in C. elegans TRNs in situ. To my knowledge, these antibodies have not been shown to stain the TRNs in any published papers, raising the question of how these "glutamylation" mutations are affecting mec-12 and -7. The rationale for using cultured embryonic TRNs and the relevance of the data and its interpretation are not clear.

      The GT335 and polyE antibodies were used by previous studies (O’Hagan et al., 2011, PMID: 21982591; and O’Hagan et al., 2017, PMID: 29129530) to detect the polyglutamylation signals in the sensory cilia of C. elegans. We initially tried to stain the whole animals using these antibodies but could not get clear and distinct signals in the TRNs. We reason that the tubulin polyglutamylation signals in the TRNs may be weak, and the in situ staining method, which requires the antibodies to penetrate multiple layers of tissues (e.g., cuticles and epidermis) to reach the TRN axons, may not be sensitive enough to detect the signal. In fact, the TRN axons are located deeper in the worm body compared to the sensory cilia that are mostly exposed to the environment. Another reason could be that the tissues (mostly epidermis) surrounding the TRN axons also have polyglutamylation staining, which makes it difficult to recognize TRN axons. This is a situation different from the anti-K40 acetylation staining, which only occurs in the TRNs because MEC-12 is the only α-tubulin isotype that carries K40. Due to these technical difficulties, we decided to use the in vitro cultured TRNs for the staining experiment, which allows both easy access of the antibodies (thus higher sensitivity) and the dissociation of the TRNs from other tissues. The fact that we were able to observe reduced staining in the ttll mutants and the tubulin mutants that lost the glutamate residues suggests that these antibodies indeed detected glutamylation signals in the cells.

      (3) The final paragraph of the discussion is factually incorrect. The C. elegans homologs of the CCP carboxypeptidases are called CCPP-1 and CCPP-6. There are several publications on their functions in C. elegans.

      We thank the reviewer for pointing out the mistake in the text. We intended to say that “there is no C. elegans homolog of the known tubulin carboxypeptidases that catalyze detyrosination”, which is true given that the detyrosinase vasohibins (VASH1/VASH2) homologs cannot be found in C. elegans. We are aware of the publications on CCPP-1 and CCPP-6; CCPP-1 is known to regulate tubulin deglutamylation in the cilia of C. elegans (O’Hagan et al., 2011 and 2017), while CCPP-6 may function in the PLM to regulate axonal regeneration (Ghosh-Roy et al., 2012). In the revised manuscript, we have corrected the error.

      Reviewer #2 (Public Review):

      Summary:

      The tubulin subunits that make up microtubules can be posttranslationally modified and these PTMs are proposed to regulate microtubule dynamics and the proteins that can interact with microtubules in many contexts. However, most studies investigating the roles of tubulin PTMs have been conducted in vitro either with purified components or in cultured cells. Lu et al. use CRISPR/Cas9 genome editing to mutate tubulin genes in C. elegans, testing the role of specific tubulin residues on neuronal development. This study is a real tour de force, tackling multiple proposed tubulin modifications and following the resulting phenotypes with respect to neurite outgrowth in vivo. There is a ton of data that experts in the field will likely reference for years to come as this is one of the most comprehensive in vivo analyses of tubulin PTMs in vivo.

      This paper will be very important to the field; however, it would be strengthened if: 1) the authors demonstrated that the mutations they introduced had the intended consequences on microtubule PTMs, 2) the authors explored how the various tubulin mutations directly affect microtubules, and 3) the findings were made generally more accessible to non-C. elegans neurobiologists.

      (1) The authors introduce several mutations to perturb tubulin PTMs. However, it is unclear to what extent the engineered mutations affect tubulin in the intended way, i.e., are the authors sure that the PTMs they want to perturb are actually present in C. elegans? Many of the antibodies used did not appear to be specific and antibody staining was not always impacted in the mutant cases as expected. For example, is there any evidence that S172 is phosphorylated in C. elegans, e.g. from available phosphoproteomic data? Given the significant amount of staining left in the S172A mutant, the antibody seems non-specific in this context and therefore not a reliable readout of whether MTs are actually phosphorylated at this residue. As another example, there is no evidence presented that K252 is acetylated in C. elegans. At the very least, the authors should consider demonstrating the conservation of these residues and the surrounding residues with other organisms where studies have demonstrated PTMs exist.

      We thank the reviewer for the comments. To our knowledge, there are very few phosphoproteomic data available for C. elegans. We searched a previously published dataset (Zielinska et al., 2009; PMID: 19530675) and did not find the S172 phosphorylation signal in MEC-7. This is not surprising, given that only six touch receptor neurons express MEC-7 and the abundance of MEC-7 in the whole animal lysate may be below the detection limit. However, this phosphorylation site S172 is highly conserved across species and tubulin isotypes (Figure 1-figure supplement 1 in the revised manuscript), suggesting that this site is likely phosphorylated in MEC-7.

      In the case of K252, the potential acetylation site and the flanking sequences are extremely conserved across species and isotypes. In fact, the 20 amino acids from 241-260 a.a. are identical among the tubulin genes of C. elegans, fruit flies, Xenopus, and humans (Figure 4-figure supplement 1B). Thus, although K252 acetylation was found in HeLa cells, this site could plausibly also be acetylated in C. elegans.

      In the case of K40, we observed sequence divergence at the PTM site and adjacent sequences among the tubulin isotypes in C. elegans. MEC-12 is the only C. elegans α-tubulin isotype that has the K40 residue, and the 40-50 a.a. region of MEC-12 appears to be more conserved than other isotypes when compared to Drosophila, frog, and human α-tubulins (Figure 4-figure supplement 1A).

      (2) Given that the authors have the mutants in hand, it would be incredibly valuable to assess the impact of these mutations on microtubules directly in all cases. MT phenotypes are inferred from neurite outgrowth phenotypes in several cases, the authors should look directly at microtubules and/or microtubule dynamics via EBP-2 when possible OR show evidence that the only way to derive the neurite phenotypes shown is through the inferred microtubule phenotypes. For example, the effect of the acetylation or detyrosination mutants on MTs was not assessed. 

      We thank the reviewer for the suggestions. In this study, we created >20 tubulin mutants. Due to limited time and resources, we were not able to examine microtubule dynamics in every mutant strain using EBP-2 kymographs. We assessed the effects of the tubulin mutations mostly based on the changes in neurite growth patterns. From our previous experience of analyzing ~100 mec-7 and mec-12 missense mutations (Zheng et al., 2017, MBoC; Lee et al., 2021, MBoC), we found that the changes in microtubule dynamics are correlated with the changes in neuronal morphologies. For example, the growth of ectopic ALM-PN is correlated with fewer EBP-2 comets and potentially reduced microtubule dynamics; this correlation holds true for several mec-7 neomorphic missense alleles we examined before (Lee et al., 2021, MBoC) and the PTM site mutants [e.g., mec-7(S172A) and mec-12(4Es-A)] analyzed in this study. Similarly, the shortening of TRN neurites is correlated with more EBP-2 comets and increased microtubule dynamics. For the mutants that don’t show neurite growth defects, our previous experience is that they are not likely to show altered microtubule dynamics in EBP-2 tracking experiments. So, we did not analyze the acetylation mutants (which had no defects in neurite growth) and the detyrosination mutants (which had a weak ALM-PN phenotype). Nevertheless, we agree with the reviewer that we could not rule out the possibility that there may be some slight changes to microtubule dynamics in these mutants.

      Using tannic acid staining and electron microscopy (EM), we previously examined the microtubule structure in several tubulin missense mutants (Zheng et al., 2017, MBoC) and found that the loss-of-function and antimorphic mutations significantly reduced the number of microtubules and altered microtubule organization by reducing protofilament numbers. These structural changes are consistent with highly unstable microtubules and defects in neurite growth. On the other hand, neomorphic mutants had only a slight decrease in microtubule abundance, maintained the 15-protofilament structure, and had more tightly packed microtubule bundles that filled up most of the space in the TRN neurite (Zheng et al., 2017, MBoC). These structural features are consistent with increased microtubule stability and ectopic neurite growth. Although we did not directly examine the microtubule abundance and structure using EM in this study, we would expect similar changes that are correlated with the neurite growth phenotypes in the PTM mutants. We agree with the reviewer that it will be informative to conduct a more comprehensive analysis of these mutants using EM and other structural biology methods.

      (3) There is a ton of data here that will be important for experts working in this field to dig into, however, for the more general cell biologist, some of the data are quite inaccessible. More cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment.

      Response: We thank the reviewer for the comment. In the revised manuscript, we added some cartoons to Figure 2G to show the location of the synaptic vesicles. The neurite growth phenotype should be quite straightforward. Nevertheless, we added one more Figure (Figure 8) to summarize all the results in the study with cartoons that depicted the changes to neuronal morphologies.

      (4) In addition, I am left unconvinced of the negative data demonstrating that MBK does not phosphorylate tubulin. First, the data described in lines 207-211 does not appear to be presented anywhere. Second, RNAi is notoriously finicky in neurons, thus necessitating tissue-specific degradation using either the ZF/ZIF-1 or AID/TIR1 systems which both work extremely well in C. elegans. Third, there appears to be increasing S172 phosphorylation in Figure 3 Supplement 2 with added MBK-2, but there is no anti-tubulin blot to show equal loading, so this experiment is hard to interpret.

      We added the results of mbk-1, mbk-2, and hpk-1 mutants and cell-specific knockdown of MBK-2 into Figure 3-figure supplement 1D. Considering the reviewer’s suggestion, we attempted to use a ZIF-1 system to remove the MBK-2 proteins specifically in the TRNs using a previously published method (PMID: 28619826). We fused endogenous MBK-2 with GFP by gene editing and then expressed an anti-GFP nanobody fused with ZIF-1 in the TRNs to induce the degradation of MBK-2::GFP. To our surprise, unlike the mbk-2p::GFP transcriptional reporter, the MBK-2::GFP did not show detectable expression in the TRNs, although expression can be seen in early embryos, which is consistent with the “embryonic lethal” phenotype of the mbk-2(-) mutants (Figure 3-figure supplement 2A-B in the revised manuscript). We reason that endogenous MBK-2 is either not expressed in the TRNs or is expressed at a very low level. We then crossed mbk-2::GFP with ItSi953 [mec-18p::vhhGFP4::Zif-1] to trigger the degradation of any potential MBK-2 proteins and did not observe the ectopic growth of ALM-PN (Figure 3-figure supplement 2C). These results suggest that MBK-2 is not likely to regulate tubulin phosphorylation in the TRNs, which is consistent with the results of other genetic mutants and the RNAi experiments.

      For Figure 3 Supplement 2 (Figure 3-figure supplement 3 in the revised manuscript), because we added the same amount of purified MEC-12/MEC-7 to all reactions and had established equal loading in Figure 3E, we did not do the anti-tubulin staining in this experiment. Since a higher concentration (1742 nM) of MBK-2 did not produce a stronger signal than the condition with 1268 nM, we don’t think the 1268 nM band represents true phosphorylation. Moreover, the signal is not significantly stronger than the control without MBK-2 and is much lower than the signal generated by CDK1 in Figure 3E. Based on these results, we concluded that MBK-2 is not likely to phosphorylate MEC-7.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      General:

      A summary table would help the reader digest the vast amount of phenotypic data.

      Cartoons to help a non-C. elegans reader understand the figures. 

      We added Figure 8 to summarize and illustrate the effects of the various mutants analyzed in this study.

      Specific:

      The authors engineered mutations into the predicted phosphorylation site of b-tubulin mec-7. These CRISPR alleles phenocopied loss-of-function, gain-of-function, and neomorphic mec-7 alleles previously identified in genetic screens by the Chalfie lab. Next, the authors sought to identify the responsible kinase, taking a candidate gene approach. The most likely family - minibrain - had no effect when knocked down/out. The authors showed that cdk-1 mutants displayed ectopic ALM-PN outgrowth. Whether cdk-1 specifically acts in the TRNs was not demonstrated, calling into question whether CDK-1 phosphorylates S172 in vivo. In their introduction (lines 45-59), the authors built a case for engineering PTM mutations directly into tubulins, because the PTM enzymes may have multiple substrates. This logic applies to the cdk-1 experiment and its interpretation.

      The reviewer is right. Since CDK1 and minibrain kinase are the only known kinases that catalyze S172 phosphorylation, our results suggest that CDK-1 is more likely to catalyze S172 phosphorylation in the TRNs compared to MBK-1/2. Genetic studies found that cdk-1(-); mec-7(S172A) double mutants did not show a stronger phenotype than either single mutant, suggesting that they function in the same pathway. Nevertheless, we could not rule out the possibility that other kinases may also control S172 phosphorylation and that the effect of CDK-1 is indirect. We mentioned this possibility in the revised manuscript.

      For a-tubulin MEC-12, acetyl-mimicking K40Q and unmodifiable K40R mutants failed to stain with the anti-acetyl-a-tubulin (K40) antibody and displayed subtle TRN phenotypes. The enzymatically dead MEC-17 had phenotypes similar to those described by Topalidou (2012), confirming the Chalfie lab finding that MEC-17 has functions in addition and independent of its acetyltransferase activity. The authors moved onto a predicted acetylation site in MEC-7 and observed TRN developmental defects, and acknowledged that this may be due to tubulin instability and not a PTM. This is a concern for all mutants, as there is no way to measure whether the protein is expressed, stable, or localized properly. 

      We acknowledge that this is a caveat of mutational studies. An amino acid substitution at the PTM site may have multiple effects, including a change of the PTM state and a potential alteration of protein conformation. Without direct evidence for enzymatic modification of the PTM site in the neurons, we could not rule out the possibility that the phenotype we observed is not related to the PTM and instead results from abnormal protein conformation and function caused by the mutation.

      Nevertheless, as stated in our above response to the first point in the public review, we can phenotypically differentiate loss-of-function and gain-of-function mutants. If the mutation reduces expression or general protein stability, it is more likely to cause a loss-of-function phenotype. For most PTM site mutants, this is not the case. We observed mostly gain-of-function phenotypes, suggesting that the missense mutations did not simply inactivate the tubulin protein and instead affected the functional properties of the protein.

      From here, the authors manipulate the C-terminal tail of MEC-12 and MEC-7, testing the idea that polyglutamylation may be an important PTM. These mutants displayed subtle phenotypes. The authors show that branch point GT335 and polyglutamylation polyE recognizing antibodies stain cultured embryonic TRNs, but did not examine staining in TRNs. To my knowledge, these antibodies have not been shown to stain the TRNs in any published papers (see next point). The rationale for using cultured embryonic TRNs is not clear.

      See our response to the second point in the public review.

      Lines 548-553 There are several publications on CCPP-1 and CCPP-6 functions in TRNs and ciliated sensory neurons. See

      PMID: 20519502

      PMID: 21982591

      PMID: 21943602

      PMID: 23000142

      PMID: 29129530

      PMID: 33064774

      PMID: 36285326

      PMID: 37287505 

      We thank the reviewer for pointing out these references, some of which were cited in the revised manuscript. We made a mistake in the Discussion by saying that there are no C. elegans homologs of tubulin carboxypeptidases while we intended to state that there is no homolog of tubulin detyrosinase in C. elegans. We are aware of the studies of CCPP-1 and CCPP-6 and have corrected the mistake in revised manuscript (also see our response to the third point in the public review).

      Reviewer #2 (Recommendations For The Authors):

      Figures: 

      As stated in the public review, more cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment. A good example of this issue is demonstrated in Figure 2 and Figure 4: 

      (1) Figure 2: Please label images with what is being probed in each panel. 

      We added labels to the panels.

      (2) Figure 2G is very hard to interpret - cartoon diagramming what is being observed would be helpful. 

      We added cartoons to help illustrate the images.

      (3) Line 182-185: is this referring to your data or to Wu et al? It is not clear in this paragraph when the authors are describing published work versus their own data presented here. 

      It is from our data. We have made it clear in the revised manuscript.

      (4) Figure 2 - 2K is not well described. What experiment is being done here? What is dlk-1 and why did you look at this mutant? 

      Figure 2K showed that both wild-type animals and S172A mutants could reconnect the severed axons after laser axotomy. Previous studies have found that dlk-1(-) mutants were not able to regenerate axons due to altered microtubule dynamics (PMID: 19737525; PMID: 23000142). We used dlk-1(-) mutants as a negative control, because DLK-1 promotes microtubule growth following axotomy, and the DLK-1 pathway is essential for regeneration (PMID: 23000142). We want to highlight the phenotypic difference between dlk-1(-) mutants and the S172E mutants. Although both mutants showed similar regrowth length, dlk-1(-) mutants showed unbranched regrowth probably due to the lack of microtubule polymerization, whereas the S172E mutants showed a mesh-like regrowth pattern likely due to highly dynamic and unstable microtubules. We explained the different phenotypes in the revised manuscript.

      (5) Figure 4C: this phenotype is hard to interpret. Where is the wt control? Where is the quantification? 

      In the Figure legend, we have referred the readers to Figure 1G for the wild-type image. Quantification is provided in the text (~20% of the animals showed the branching defects).

      (6) There are no WT comparison images in Figure 4I, making the quantification difficult to interpret 

      In the Figure legend, we have referred the readers to Figure 1A for the wild-type control. Moreover, we included a new Figure 8 to summarize the phenotypes of all mutants.

      Experimental:

      (1) Is it clear that only MEC-7/MEC-12 are the only a- and b-tubulin present in the TRNs? The presence of other tubulins not mutated would complicate the interpretation of the results. 

      According to the mRNA levels, the expression of MEC-7 and MEC-12 is >100-fold higher than that of other tubulin isotypes. For example, single-cell transcriptomic data (Taylor et al., 2021) showed that mec-7 mRNA is at 135,940 TPM in ALM neurons, whereas two other tubulin isotypes, tbb-1 and tbb-2, have expression values of 54 and 554 TPM, respectively, in the ALM. So, even if some other tubulin isotypes are present, their abundance is much lower than that of mec-7 and mec-12, and they are not likely to interfere with the effects of the mec-7 and mec-12 mutants.
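
      The fold differences implied by the TPM values quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the three numbers come from the response, while the dictionary and the helper name `fold_over` are made up for the example, and no mec-12 value is included because none is quoted.

```python
# TPM values quoted in the response above (ALM neurons; Taylor et al., 2021).
tpm = {"mec-7": 135_940, "tbb-1": 54, "tbb-2": 554}


def fold_over(reference, other, table=tpm):
    """Return how many times higher `reference` is expressed than `other`."""
    return table[reference] / table[other]


# Ratio of mec-7 to each of the two other isotypes mentioned in the text.
folds = {gene: fold_over("mec-7", gene) for gene in ("tbb-1", "tbb-2")}
for gene, fold in folds.items():
    print(f"mec-7 / {gene}: {fold:.0f}-fold")
```

      With these inputs, the ratios come out at roughly 2,500-fold for tbb-1 and roughly 245-fold for tbb-2, both above the 100-fold threshold stated in the response.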

      (2) The in vitro kinase assays should be quantified. 

      We have added the quantification.

      (3) The idea that Cdk1 phosphorylates tubulin in interphase is surprising and I am left wondering how the authors propose that Cdk1 is activated in interphase. Is cyclin B (or another cyclin) present in interphase in this cell type? Expression but not activation of Cdk1 is not discussed. 

      CDK1 can work with cyclin A and cyclin B. C. elegans has one cyclin A gene (cya-1) and four cyclin B genes (cyb-1, cyb-2.1, cyb-2.2, and cyb-3). According to single-cell transcriptomic data of L4 animals, cya-1 and cyb-1 showed weak expression in many postmitotic neurons (including the ALM neurons), while cyb-2.1, cyb-2.2, and cyb-3 had no expression in neurons. So, it is possible that cya-1/cyclin A and cyb-1/cyclin B have a low level of expression in the TRNs. A previous study also found the expression of cell cycle regulators (including cyclins) in postmitotic neurons in the mouse brain (Akagawa et al., 2021; PMID: 34746147).

      (4) What is the significance of neurite swelling and looping in Figure 4H? The underlying cause of this phenotype is not described. 

      The neurite swelling and looping phenotype of mec-17(-) mutants were described by Topalidou et al., (2012; PMID: 22658602) and were caused by the bending of the microtubules. It appears that the loss of the a-tubulin acetyltransferase altered the organization of microtubules in the TRNs. These defects were partially rescued by the enzymatically dead MEC-17, suggesting that MEC-17 may play a non-enzymatic (and likely structural) role in regulating microtubule organization. We added more explanation in the revised manuscript.

      (5) It is quite surprising that polyglutamylation is not affected in the quintuple ttll mutant. Since the authors made the sextuple ttll mutant, could they demonstrate whether polyglutamylation is further reduced in this mutant via GT335 staining? 

      We did not make the comparison of the quintuple and sextuple ttll mutants because they were crossed with TRN markers with different colors for technical reasons. The quintuple mutant strain CGZ1475 carried uIs115 [mec-17p::TagRFP] IV, whereas the sextuple mutant strain CGZ1474 carried zdIs5 [mec-4p::GFP] I. As a result, we needed to use different secondary antibodies for the antibody staining, which makes the results not directly comparable.

      The polyglutamylation signal in the cell body was strongly affected by the ttll mutations. In fact, in the ttll-4(-); ttll-5(-); ttll-12(-) triple mutants, the signal is significantly reduced in the cell body of the TRNs, as well as the cell body of other cells. What’s surprising is that the signal in the axons persisted in the ttll triple and quintuple mutants. As the reviewers suggested, we also stained the sextuple mutants and found a similar pattern to that of the triple and quintuple mutants (new Figure 6-figure supplement 1C in the revised manuscript), although the results are not quantitatively comparable due to the use of secondary antibodies with different fluorophores.

      Writing:

      (1) The beginning of the results section is quite jarring. The information in lines 96-104 should be in the Introduction. 

      Due to the nature of this paper, each section deals with a particular PTM. We think it is helpful to discuss some background information before describing our results on each PTM rather than giving all in the introduction. Nevertheless, we modified the beginning of the results to make it more coherent and more connected with the preceding paragraphs.

      (2) Line 122-126: conclusions are not supported by the data: it is suggested from previous experiments, but authors do not look at MTs directly. 

      We have rephrased the statement to acknowledge that we drew this conclusion based on phenotypic similarity with mutants we previously examined.

      (3) I am confused by the usage of both mec-12(4EtoA) and mec-12(4Es-A). Are these the same mutations? If so, there needs to be consistency. If not, each case needs to be defined. 

      They are the same. We have corrected the mistake and are now using mec-12(4Es-A) to refer to the mutants.

      Line 105: phosphor --> phospho 

      Line 187: were --> was 

      Line 298: is --> are

      The above typos are corrected.

    1. eLife assessment

      This paper provides a useful analysis of the variation of the burden of strokes across geographic regions, finding differences in the relationship between strokes and their comorbidities. This dataset and the correlations found within will be a resource for directing the focus of future investigations. The statistical analyses are incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      The paper measures the prevalence and mortality of stroke and its comorbidities across geographic regions in order to find differences in risks that may lead to more effective guidance for these subpopulations. It also does a genetic analysis to look for variants that may drive these phenotypic variations.

      Strengths:

      The data provided here will provide a foundation for a lot of future research into the causes of the observed correlations as well as whether the observed differences in comorbidities across regions have clinically relevant effects on risk management.

      The use of data from before COVID-19 is both a strength and a weakness. Because COVID had effects on vascular health and had higher death rates for groups with the comorbidities of interest here, it has likely shifted the demographics in ways that would shift the results in unpredictable ways if the analysis were repeated with current data. This can be a strength in providing a reference point for studying those changes as well as allowing researchers to study differences between regions without the complication of different public health responses adding extra variation to the data. On the other hand, it limits the usefulness of the data in research concerned with the current status of the various populations.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have analyzed ethnogeographic differences in the comorbidity factors, such as diabetes and heart disease, for the incidences of stroke and whether it leads to mortality.

      Strengths:

      The idea is interesting and data are compelling. The results are technically solid.

      The authors identify specific genetic loci that increase the risk of a stroke and how they differ by region.

      Weaknesses:

      The presentation is not focused. It would be better to include p-values and focus presentation on the main effects from the dataset analysis.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Recommendations For The Authors):

      I still find it really impressive that the Purkinje cell stimulation so closely mimics the pathogenic phenotypes - in my opinion, the strongest part of the paper. I would like just a little clarification on some of my previous questions.

      Major points:

      (1) Can the authors clarify where the new units came from? Are these units that were recorded before the initial submission and excluded, but are now included? If so, why were they excluded before? Or are these units that were recorded since the original submission?

      The number of units increased in Figure 1 for three reasons: 1) We have now plotted the classifier results in Figure 1 instead of the validation results, which have been moved to Figure 1 Supplement 3. 2) In response to reviewer comments, we no longer include units that had >60 s of recording in both our model creation and validation. We had previously used 30 s for creating the model and a different 30 s for validating the model, if an additional 30 s were available. 3) We changed our model creation and validation strategy based on previous reviewer comments. The new units in Figures 2-4 were taken from our pool of previously collected but unanalyzed data (we collect neural data on a rolling basis and thus these data were not initially available). We were fortunate to have these data to analyze in order to address the concerns about the number of cells included in the manuscript. The number of units increased in Figure 5 because new units were recorded in response to reviewer comments.

      (2) Why did some of the neuron counts go down? For example, in Pdx1Cre;Vglut2fl/fl mice, the fraction of units with the control signature went from 11/21 to 7/23. Is this because the classifier changed between the original submission and the revision?

      Yes, the proportion of cells matching each classification changed due to the different parameters and thresholds used in the updated classifier model.

      Minor points:

      In the Discussion: "We find some overlap and shared spike features between the different disease phenotypes and show that healthy cerebellar neurons can adapt multiple disease-associated spike train signatures." I think "adapt" should be "adopt"

      In the Discussion: "compare" is misspelled as "compared"

      Thank you for bringing these typos to our attention. We will upload a new version of the text with the typos corrected.


      The following is the authors’ response to the original reviews.

      We would like to thank the Reviewers for providing excellent and constructive suggestions that have enabled us to strengthen our overall presentation of our data. We have addressed each of the comments by altering the text, providing additional data, and revising the figures, as requested.

      Below are our explanations for how we have altered the manuscript in this revised version.

      Recommendations for the authors:

      I think you will have seen from the comments that there was great enthusiasm for the importance of this study. There were also shared concerns about how the classifier may be inadequate in its current format, as well as specific suggestions to consider to improve. I hope that you will consider a revision to really amplify the impact of the importance of this study.

      Reviewer #1 (Recommendations For The Authors):

      Distinct motor phenotypes are reflected in different neuronal firing patterns at different loci in motor circuits. However, it is difficult to determine if these altered firing patterns: 1) reflect the underlying neuropathology or phenotype, 2) whether these changes are intrinsic to the local cell population or caused by larger network changes, and 3) whether abnormal firing patterns cause or reflect abnormal movement patterns. This manuscript attempts to address these questions by recording neural firing patterns in deep cerebellar nucleus neurons in several models of cerebellar dysfunction with distinct phenotypes. They develop a classifier based on parameters of single unit spike trains that seems to do an inconsistent job of predicting phenotype (though it does fairly well for tremor). The major limitation of the recording/classifier experiments is the low number of single units recorded in each model, greatly limiting statistical power. However, the authors go on to show that specific patterns of Purkinje cell stimulation cause consistent changes in interposed nucleus activity that map remarkably well onto behavioral phenotypes. Overall, I did not find the recording/classifier results to be very convincing, while the stimulation results strongly indicate that interposed nucleus firing patterns are sufficient to drive distinct behavioral phenotypes.

      We thank the reviewer for their comments. We describe below how we have addressed the major concerns.

      Major concerns:

      (1) I don't think it's legitimate to use two 30-second samples from the same recording to train and validate the classifier. I would expect recordings from the same mouse, let alone the same unit, to be highly correlated with each other and therefore overestimate the accuracy of the classifier. How many of the recordings in the training and validation sets were the same unit recorded at two different times?

      We previously published a paper wherein we measured the correlation (or variability) between units recorded from the same mouse versus units recorded from different mice (see: Van der Heijden et al., 2022 – iScience, PMID: 36388953). In this paper we did not find that nuclei neuron recordings from the same mouse were more correlated or similar to each other than recordings from different mice. 

      Following this reviewer comment, however, we did observe strong correlations between the two 30-second samples from the same recorded units. We therefore decided to no longer validate our classifier using training and validation sets that had overlapping units. Instead, we generated 12 training sets and 12 non-overlapping validation sets based on our entire database. We then trained 12 classifier models and ranked these based on their classification ability on the validation sets (Figure 1 – supplemental Figure 3). We found that the top two performing classifier models were the same, and used this model for the remainder of the paper.
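
      One way to guarantee that no unit contributes to both training and validation is to split by unit identity rather than by recording segment. The sketch below is a generic illustration under assumed inputs, not the authors' procedure: the unit IDs, the 30% validation fraction, and the function name `split_by_unit` are all hypothetical, and the response does not say how the 12 sets were drawn.

```python
import random


def split_by_unit(unit_ids, validation_fraction=0.3, seed=0):
    """Split segment indices so that every unit falls entirely on one side.

    `unit_ids[i]` names the unit that segment i was recorded from.
    Returns (train_indices, validation_indices).
    """
    rng = random.Random(seed)
    units = sorted(set(unit_ids))  # sort first so the shuffle is reproducible
    rng.shuffle(units)
    n_val = max(1, round(len(units) * validation_fraction))
    val_units = set(units[:n_val])
    train_idx = [i for i, u in enumerate(unit_ids) if u not in val_units]
    val_idx = [i for i, u in enumerate(unit_ids) if u in val_units]
    return train_idx, val_idx


# Hypothetical database: 10 units, each contributing two 30-s segments.
unit_ids = [f"unit{k:02d}" for k in range(10) for _ in range(2)]

# Twelve independent train/validation splits, one per seed.
splits = [split_by_unit(unit_ids, seed=s) for s in range(12)]
```

      Each of the twelve splits could then be used to train and score one model, with the models ranked by validation performance as described above.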

      (2) The n's are not convincing for the spike signature analyses in different phenotypic models. For example, the claim is that Pdx1Cre;Vglut2fl/fl mice have more "control" neurons than ouabain infusion mice (more severe phenotype). However, the numbers are 11/21 and 7/20, respectively. The next claim is that 9/21 dystonic neurons are less than 11/20 dystonic neurons. A z-test for proportions gives a p-value of 0.26 for the first comparison and a p-value of 0.44 for the second. I do not think any conclusions can be drawn based on these data.

      We included more cells in our analyses and found that the z-test for the proportion of cells with the “control” and “dystonia” signature is indeed statistically significant.
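
      The reviewer's calculation on the original counts can be reproduced with a pooled two-proportion z-test. The sketch below assumes a two-sided test with a pooled standard error, which matches the quoted p-values but is not stated explicitly in the review; the function and variable names are illustrative.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions.

    x1/n1 and x2/n2 are the observed counts. Returns (z, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts from the original submission, as quoted by the reviewer.
z_control, p_control = two_proportion_z_test(11, 21, 7, 20)
z_dystonia, p_dystonia = two_proportion_z_test(9, 21, 11, 20)
print(f"control signature:  z = {z_control:.2f}, p = {p_control:.2f}")
print(f"dystonia signature: z = {z_dystonia:.2f}, p = {p_dystonia:.2f}")
```

      With samples of about 20 units per group, differences of this size fall well short of significance, which is why adding units changes the outcome of the test.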

      (3) Since the spiking pattern does not appear to predict an ataxic phenotype and the n's are too small to draw a conclusion for the dystonic mice, I think the title is very misleading - it does not appear to be true that "Neural spiking patterns predict behavioral phenotypes...", at least in these models.

      We have changed the title to: “Cerebellar nuclei cells produce distinct pathogenic spike signatures in mouse models of ataxia, dystonia, and tremor.” We feel that this new title captures the idea that we find differences between spike signatures associated with ataxia, dystonia, and tremor and that these signatures induce pathological movements.

      (4) I don't think it can be concluded from the optogenetic experiments that the spike train signatures do not depend on "developmental changes, ...the effect of transgene expression, ... or drug effects outside the cerebellum." The optogenetic experiments demonstrate that modulating Purkinje cell activity is sufficient to cause changes in DCN firing patterns and phenotypes (i.e., proof-of-principle). However, they do not prove that this is why DCN firing is abnormal in each model individually.

      Thank you for highlighting this section of the text. We agree that the optogenetic experiments cannot explain why the DCN is firing abnormally in each model. We have edited this section of the text to prevent this conclusion from being drawn by the readers.

      Minor points:

      (1) It would be nice to see neural recordings in the interposed nucleus during Purkinje terminal stimulation to verify that the firing patterns observed during direct Purkinje neuron illumination are reproduced with terminal activation. This should be the case, but I'm not 100% certain it is.

      We have edited the text to clarify that representative traces and analysis of interposed nucleus neurons in response to Purkinje terminal stimulation are the data in Figure 5.

      (2) How does the classifier validation (Fig. 1E) compare to chance? If I understand correctly, 24/30 neurons recorded in control mice are predicted to have come from control mice (for example). This seems fairly high, but it is hard to know how impressive this is. One approach would be to repeat the analysis many (1000s) of times with each recording randomly assigned to one of the four groups and see what the distribution of "correct" predictions is for each category, which can be compared against the actual outcome.

      We have now also included the proportion of spike signatures in the entire population of neurons and show that the spike signatures are enriched in each of the four groups (control, ataxia, dystonia, tremor) relative to the presence of these signatures in the population (Figure 1E). 
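
      The shuffle-based comparison to chance that the reviewer describes could look roughly like the sketch below. All labels here are invented toy data (only the 24/30 figure for control recordings echoes the review), and the function names are hypothetical; it is meant to show the shape of the analysis, not to reproduce the paper's result.

```python
import random
from collections import Counter


def correct_per_group(true_labels, predicted_labels):
    """Count, per group, the recordings predicted as their own group."""
    counts = Counter()
    for t, p in zip(true_labels, predicted_labels):
        if t == p:
            counts[t] += 1
    return counts


def shuffle_null(true_labels, predicted_labels, n_shuffles=2000, seed=0):
    """Null distribution of correct counts after randomly reassigning groups."""
    rng = random.Random(seed)
    shuffled = list(true_labels)
    null = {g: [] for g in set(true_labels)}
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        counts = correct_per_group(shuffled, predicted_labels)
        for g in null:
            null[g].append(counts[g])  # Counter returns 0 for absent groups
    return null


def empirical_p(observed, null_counts):
    """One-sided empirical p-value with the usual add-one correction."""
    hits = sum(c >= observed for c in null_counts)
    return (1 + hits) / (1 + len(null_counts))


# Toy data: (true group, predicted signature, number of recordings).
toy = [
    ("control", "control", 24), ("control", "dystonia", 6),
    ("ataxia", "ataxia", 12), ("ataxia", "control", 5), ("ataxia", "dystonia", 3),
    ("dystonia", "dystonia", 13), ("dystonia", "control", 4), ("dystonia", "ataxia", 3),
    ("tremor", "tremor", 14), ("tremor", "control", 3), ("tremor", "dystonia", 3),
]
true_labels = [t for t, _, n in toy for _ in range(n)]
predicted_labels = [p for _, p, n in toy for _ in range(n)]

observed = correct_per_group(true_labels, predicted_labels)
null = shuffle_null(true_labels, predicted_labels)
p_values = {g: empirical_p(observed[g], null[g]) for g in null}
```

      Each entry of `p_values` estimates how often a random assignment of recordings to groups would match the predicted signature at least as well as the real assignment did.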

      (3) I don't think this is absolutely necessary, but do the authors have ideas about how their identified firing patterns might lead to each of these phenotypes? Are there testable hypotheses for how different phenotypes caused by their stimulation paradigms arise at a network level?

      We have added some ideas about how these spike signatures might lead to their associated phenotypes to the discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) As mentioned earlier, my main concern pertains to the overall architecture and training of the classifier. Based on my reading of the methods and the documentation for the classifier model, I believe that the classifier boundaries may be biased by the unequal distribution of neurons across cerebellar disease groups (e.g., n=29 neurons in control versus n=19 in ataxics). As the classifier is trained to minimize the classification error across the entire sample, the actual thresholds on the parameters of interest may be influenced by the overrepresentation of neurons from control mice. To address this issue, one possible solution would be to reweight each group so that the overall weight across classes is equal. However, I suggest a better strategy might be to revise the classifier architecture altogether (as detailed below).

      We have retrained the classifier model based on equal numbers of ataxic, dystonic, and tremor cells (n=20) but we intentionally included more control cells (n=25). We included more control cells because we assume this is the baseline status for all cerebellar neurons and wanted to avoid assigning disease signatures to healthy neurons too easily. 
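
      For reference, the reweighting option raised by the reviewer can be sketched as inverse-frequency class weights, so that every group carries the same total weight during training. The group sizes below are the ones stated in the response (25 control cells, 20 cells for each disease group); the function name is illustrative, and the authors chose to keep the extra control cells rather than reweight.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


# Group sizes taken from the response above.
labels = ["control"] * 25 + ["ataxia"] * 20 + ["dystonia"] * 20 + ["tremor"] * 20
weights = balanced_class_weights(labels)

# Total weight carried by each class; equal across classes by construction.
totals = {c: weights[c] * n for c, n in Counter(labels).items()}
print(weights)
print(totals)
```

      Under this scheme a control cell would count slightly less than a disease-model cell. The design choice in the response does the opposite and deliberately leaves the classifier biased toward the control signature.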

      (2) As the authors make abundantly clear, one mouse model of disease could potentially exhibit multiple phenotypes (e.g., a mouse with both ataxia and tremor). To address this complexity, it might be more valuable to predict the probability of a certain CN recording producing specific behavioral phenotypes. In this revised approach, the output of the classifier wouldn't be a single classification (e.g., "this is an ataxic mouse") but rather the probability of a certain neural recording corresponding to ataxia-like symptoms (e.g., "the classifier suggests that this mouse has a 76% likelihood of exhibiting ataxic symptoms given this CN recording"). This modification wouldn't require additional data collection, and the exemplar disease models could still be used to train such a revised network/classifier, with each mouse model corresponding to 0% probability of observing all other behavioral phenotypes except for the specific output corresponding to the disease state (e.g., L7CreVgat-fl/fl would be 0% for all categories except ataxia, which would be trained to produce a score of 100%). This approach could enhance the validation results across other mouse models by allowing flexibility in a particular spike train parameter to produce a diverse set of phenotypes.

      This is a great comment. Unfortunately, our current dataset is too limited to fully address this comment for the following reasons:

      - We have a limited number of neurons on which we can train our classifier. Further dividing up the groups of neurons or complicating the model limited the power of our analyses and resulted in overfitting of the model on too few neurons.

      - The recording durations (30 seconds) used to train our model are likely too short to find multiple disease signatures within a single recording. We feel that the complex phenotypes are likely resulting from cells within one mouse exhibiting a mix of disease signatures (as in the Car8wdl/wdl mice).

      We think this question would be great for a follow-up study that uses a large number of recordings from single mice to fully predict the mouse phenotype based on the population spike signatures. 

      To limit confusion about our classifier model, we have also altered the language of our manuscript and refer to the cells exhibiting a spike signature instead of predicting a phenotype. 

      However, the paper falls short in terms of the classifier model itself. The current implementation of this classifier appears to be rather weak. For instance, the cross-validated performance on the same disease line mouse model for tremor is only 56%. While I understand that the classifier aims to simplify a high-dimensional dataset into a more manageable decision tree, its rather poor performance undermines the authors' main objectives. In a similar vein, although focusing on three primary features of spiking statistics identified by the decision tree model (CV, CV2, and median ISI) is useful for understanding the primary differences between the firing statistics of different mouse models, it results in an overly simplistic view of this complex data. The classifier and its reliance on the reduced feature set are the weakest points of the paper and could benefit from further analysis and a different classification architecture. Nevertheless, it is commendable that the authors have collected high-quality data to validate their classifier. Particularly impressive is their inclusion of data from multiple mouse models of ataxia, dystonia, and tremor, enabling a true test of the classifier's generalizability.

      We intentionally simplified our parameter space from a high-dimensional dataset into a more manageable decision tree. We did this for the following reasons:

      - The parameters, even though all measuring different features, are highly correlated (see Figure 1 – supplemental Figure 2). Further, we were training our dataset on a limited number of recordings. We found that including all parameters (for example using a linear model) caused overfitting of the data and poor model performance.

      - Describing the spike signatures using a lower number of parameters allowed us to design optogenetic parameters that would mimic this parameter space. This would be infinitely more complex with a bigger parameter space. 
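
      The three spike-train features named in this exchange (CV, CV2, and median ISI) can be computed from a list of spike times as sketched below. The definitions used are the common textbook ones and are an assumption here: CV is the sample standard deviation of the interspike intervals (ISIs) divided by their mean, and CV2 is the mean of 2|ISI(n+1) - ISI(n)| / (ISI(n+1) + ISI(n)) over adjacent intervals. The manuscript's exact formulas and units may differ, and the function name is illustrative.

```python
import statistics


def spike_train_features(spike_times):
    """Return CV, CV2, and median ISI for a sorted list of spike times."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes")
    # CV: global irregularity of the interspike intervals.
    cv = statistics.stdev(isis) / statistics.mean(isis)
    # CV2: local irregularity, comparing each ISI with the next one.
    cv2 = statistics.mean(
        [2 * abs(b - a) / (a + b) for a, b in zip(isis, isis[1:])]
    )
    return {"cv": cv, "cv2": cv2, "median_isi": statistics.median(isis)}


# Spike times in milliseconds: a perfectly regular train and an irregular one.
regular = spike_train_features([0, 10, 20, 30, 40])
irregular = spike_train_features([0, 10, 40])
print(regular)
print(irregular)
```

      A perfectly regular train gives zero for both CV and CV2, while a slow drift in firing rate raises CV much more than CV2, which is one reason the two measures are often reported together.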

      We agree with the reviewer that the inclusion of multiple mouse models, in addition to the optogenetics experiments, supports the classifier’s generalizability.

      Minor Comments:

      (1) The blown-up CN voltage traces in Figures 5C and Supplementary Figure 2B appear more like bar plots than voltage traces on my machine.

      Thank you for bringing this to our attention. We have improved the rendering of the traces.

      (2) The logic in lines 224-228 is somewhat confusing. The spike train signatures are undoubtedly affected by all the factors mentioned by the authors. What, I believe, the authors intend to convey is that because changes in CN firing rates can be driven by multiple factors, it is the CN firing properties themselves that likely drive disease-specific phenotypes.

      We agree that our discussion of the CN firing needs clarification. We have made the appropriate edits in the text.

      Reviewer #3 (Recommendations For The Authors):

      It's quite astounding that this can be done from single spike trains from what are almost certainly mixed populations of neurons. Could you add something to the discussion about this? Some questions that could be addressed would be would multiple simultaneous recordings additionally help classify these diseases, or would non-simultaneous recordings from the same animal be useful? Also more discussion about which cells you are likely recording from would be useful.

      Thank you for this suggestion. We have added discussion about multiple recordings, simultaneous vs non-simultaneous recordings, and our thoughts on the cell population recorded in this work.

      Data in figure 2 is difficult to understand - it appears that the majority of dysregulated cells in 2 ataxic models are classified as dystonia cells, not ataxic cells. This appears surprising as it seems to be at odds with earlier data from Fig 1. In my opinion, it is not discussed adequately in the Results or Discussion section.

      We have added further discussion of the ataxia models represented in Figures 1 and 2.

      Minor comment:

      The colours of the subdivisions of the bars in 2C and 3C, and the rest of the paper appear to be related to the groups in the middle (under "predicted"), but the colours are much paler in the figure than in the legend, although the colours in the bars and the legends match in the first figure (1E). Does this signify something?

      These figures were remade with the same colors across the board.

    1. eLife assessment

      This important and novel study addresses the challenge of antimicrobial resistance by targeting plasmid proteins that interfere with plasmid transfer as a strategy to limit the spread of antibiotic-resistance genes. The evidence presented and the integration of two approaches to tackle antimicrobial resistance is convincing. This work will interest those working on plasmid transfer and antimicrobial resistance.

    2. Reviewer #1 (Public Review):

      The study by Prieto et al. faces the increasingly serious problem of bacterial resistance to antimicrobial agents. This work has an important element of novelty, proposing a new approach to control antibiotic resistance spread by plasmids. Instead of targeting the resistance determinant, plasmid-borne proteins are used as antigens to be bound by specific nanobodies (Nbs). Once bound, plasmid transfer was inhibited and Salmonella infection blocked. This in-depth study is quite detailed and complex, with many experiments (9 figures with multiple panels), rigorously carried out. Results fully support the authors' conclusions. Specifically, the authors investigated the role of two large molecular weight proteins (RSP and RSP2) encoded by the IncHI1 derivative-plasmid R27 of Salmonella. These proteins have bacterial Ig-like (Big) domains and are expressed on the cell surface, creating the opportunity for them to serve as immunostimulatory antigens. Using a mouse infection model, the authors showed that RSP proteins can properly function as antigens in Salmonella strains harboring the IncHI1 plasmid. The authors clearly showed increased levels of specific IgG and IgA antibodies against these RSP proteins in different tissues of immunized animals. In addition, non-immunized mice exhibited Salmonella colonization in the spleen and much more severe disease than immunized ones.

      However, the strength of this work is the selection and production of nanobodies (Nbs) that specifically interact with the extracellular domain of RSP proteins. The procedure to obtain Nbs is lengthy and complicated and includes the immunization of dromedaries with purified RSP and the construction of a VHH (H-chain antibody variable region) library in E. coli. As RSP is expressed on the surface of E. coli, specific Nbs were able to agglutinate Salmonella strains harboring the p27 plasmid encoding the RSP proteins.

      The authors demonstrated that Nbs-RSP reduced the conjugation frequency of p27, thus limiting the diffusion of the amp resistance harbored by the plasmid. This represents an innovative and promising strategy to fight antibiotic resistance, as it is not blocked by the mechanism that determines, in the specific case, the amp resistance of p27 but instead targets an antigen associated with IncHI-derivative plasmids. Thus, RSP vaccination could be effective not only against Salmonella but also against other enteric bacteria. A possible criticism could be that Nbs against RSP proteins reduce the severity of the disease but do not completely prevent the infection by Salmonella.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript aims to tackle antimicrobial resistance through the development of vaccines. Specifically, the authors test the potential of the RSP protein as a vaccine candidate. The RSP protein contains bacterial Ig-like domains that are typically carried in IncHI1 plasmids like R27. The extracellular location of the RSP protein and its role in the conjugation process make it a good candidate for a vaccine. The authors then use Salmonella carrying an IncHI plasmid to test the efficacy of the RSP protein as a vaccine antigen in providing protection against infection of antibiotic-resistant bacteria carrying the IncHI plasmid. The authors found no differences in total IgG or IgA levels, nor in pro-inflammatory cytokines between immunized and non-immunized mice. They however found differences in specific IgG and IgA, attenuated disease symptoms, and restricted systemic infection.

      The manuscript also evaluates the potential use of nanobodies specifically targeting the RSP protein by expressing them in E. coli and evaluating their interference in the conjugation of IncHI plasmids. The authors found that E. coli strains expressing RSP-specific nanobodies bind to Salmonella cells carrying the R27 plasmid, thereby reducing the conjugation efficacy of Salmonella.

      Strengths:

      - The main strength of this manuscript is that it targets the mechanism of transmission of resistance genes carried by any bacterial species, thus making it broad.

      - The experimental setup is sound and with proper replication.

      Weaknesses:

      - The two main experiments, evaluating the potential of the RSP protein and the effects of nanobodies on conjugation, seem as parts of two different and unrelated strategies.

      - The survival rates shown in Figure 1A and Figure 3A for Salmonella pHCM1 and non-immunized mice challenged with Salmonella, respectively, are substantially different. In the same figures, the challenge of immunized mice and Salmonella pHCM1 and mice challenged with Salmonella pHCM1 with and without ampicillin are virtually the same. While this is not the only measure of the effect of immunization, the inconsistencies in the resulting survival curves should be addressed by the authors more thoroughly as they can confound the effects found in other parameters, including total and specific IgG and IgA, and pro-inflammatory cytokines.

      - Overall the results are inconsistent and provide only partial evidence of the effectiveness of the RSP protein as a vaccine target.

      - The conjugative experiments use very long conjugation times, making it harder to assess if the resulting transconjugants are the direct result of conjugation or just the growth of transconjugants obtained at earlier points in time. While this could be assessed from the obtained results, it is not a direct or precise measure.

      - While the potential outcomes of these experiments could be applied to any bacterial species carrying this type of plasmids, it is unclear why the authors use Salmonella strains to evaluate it. The introduction does a great job of explaining the importance of these plasmids but falls short in introducing their relevance in Salmonella.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Prieto et al. faces the increasingly serious problem of bacterial resistance to antimicrobial agents. This work has an important element of novelty, proposing a new approach to control antibiotic resistance spread by plasmids. Instead of targeting the resistance determinant, plasmid-borne proteins are used as antigens to be bound by specific nanobodies (Nbs). Once bound, plasmid transfer was inhibited and Salmonella infection blocked. This in-depth study is quite detailed and complex, with many experiments (9 figures with multiple panels), rigorously carried out. Results fully support the authors' conclusions. Specifically, the authors investigated the role of two large molecular weight proteins (RSP and RSP2) encoded by the IncHI1 derivative-plasmid R27 of Salmonella. These proteins have bacterial Ig-like (Big) domains and are expressed on the cell surface, creating the opportunity for them to serve as immunostimulatory antigens. Using a mouse infection model, the authors showed that RSP proteins can properly function as antigens in Salmonella strains harboring the IncHI1 plasmid. The authors clearly showed increased levels of specific IgG and IgA antibodies against these RSP proteins in different tissues of immunized animals. In addition, non-immunized mice exhibited Salmonella colonization in the spleen and much more severe disease than immunized ones.

      However, the strength of this work is the selection and production of nanobodies (Nbs) that specifically interact with the extracellular domain of RSP proteins. The procedure to obtain Nbs is lengthy and complicated and includes the immunization of dromedaries with purified RSP and the construction of a VHH (H-chain antibody variable region) library in E. coli. As RSP is expressed on the surface of E. coli, specific Nbs were able to agglutinate Salmonella strains harboring the p27 plasmid encoding the RSP proteins.

      The authors demonstrated that Nbs-RSP reduced the conjugation frequency of p27, thus limiting the diffusion of the amp resistance harbored by the plasmid. This represents an innovative and promising strategy to fight antibiotic resistance, as it is not blocked by the mechanism that determines, in the specific case, the amp resistance of p27 but instead targets an antigen associated with IncHI-derivative plasmids. Thus, RSP vaccination could be effective not only against Salmonella but also against other enteric bacteria. A possible criticism could be that Nbs against RSP proteins reduce the severity of the disease but do not completely prevent the infection by Salmonella.

      It is true that vaccination of mice with purified RSP protein did not provide complete protection against infection with a Salmonella strain harboring an IncHI plasmid. As this finding is based on an animal model, further investigation is required to evaluate its clinical efficacy. In any case, even partial protection provided by nanobodies or by a vaccine could potentially improve survival rates among critically ill patients infected with a pathogenic bacterium harboring an IncHI plasmid. An additional beneficial aspect of our approach is that it will reduce dissemination of IncHI plasmids among pathogenic bacteria, which would reduce the presence of antibiotic resistance plasmids in the environment and in the bacteria infecting patients.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript aims to tackle antimicrobial resistance through the development of vaccines. Specifically, the authors test the potential of the RSP protein as a vaccine candidate. The RSP protein contains bacterial Ig-like domains that are typically carried in IncHI1 plasmids like R27. The extracellular location of the RSP protein and its role in the conjugation process make it a good candidate for a vaccine. The authors then use Salmonella carrying an IncHI plasmid to test the efficacy of the RSP protein as a vaccine antigen in providing protection against infection of antibiotic-resistant bacteria carrying the IncHI plasmid. The authors found no differences in total IgG or IgA levels, nor in pro-inflammatory cytokines between immunized and non-immunized mice. They however found differences in specific IgG and IgA, attenuated disease symptoms, and restricted systemic infection.

      The manuscript also evaluates the potential use of nanobodies specifically targeting the RSP protein by expressing them in E. coli and evaluating their interference in the conjugation of IncHI plasmids. The authors found that E. coli strains expressing RSP-specific nanobodies bind to Salmonella cells carrying the R27 plasmid, thereby reducing the conjugation efficacy of Salmonella.

      Strengths:

      The main strength of this manuscript is that it targets the mechanism of transmission of resistance genes carried by any bacterial species, thus making it broad.

      The experimental setup is sound and with proper replication.

      Weaknesses:

      The two main experiments, evaluating the potential of the RSP protein and the effects of nanobodies on conjugation, seem as parts of two different and unrelated strategies.

      In preparing our manuscript, we were aware that we included two different strategies to combat antimicrobial resistance. However, we deemed it valuable to include both in the paper. The development of new vaccines and the inhibition of the transfer of antibiotic resistance determinants are currently considered relevant approaches to combat antimicrobial resistance. Our intention in the article is to integrate these two strategies.

      The survival rates shown in Figure 1A and Figure 3A for Salmonella pHCM1 and non-immunized mice challenged with Salmonella, respectively, are substantially different. In the same figures, the challenge of immunized mice and Salmonella pHCM1 and mice challenged with Salmonella pHCM1 with and without ampicillin are virtually the same. While this is not the only measure of the effect of immunization, the inconsistencies in the resulting survival curves should be addressed by the authors more thoroughly as they can confound the effects found in other parameters, including total and specific IgG and IgA, and pro-inflammatory cytokines.

      Overall the results are inconsistent and provide only partial evidence of the effectiveness of the RSP protein as a vaccine target.

      To address the concerns regarding the disparities in survival rates depicted in Figures 1A and 3A, it is important to refer to several factors that contribute to these variations. Firstly, it should be noted that the data depicted in these figures stem from distinct experimental sets conducted at different times employing different batches of mice. Despite the use of the same strain and supplier, individual animals and their batches can exhibit variability in susceptibility to infection due to inherent biological differences.

      Unlike in vitro cell culture experiments, which can achieve high replicability due to the homogeneity of cell lines, in vivo animal studies often exhibit greater variability. This variability is influenced not only by genetic variations within animal populations, even if originating from the same supplier, but also by environmental factors within the animal facility. These factors include temperature variations, the concentration of non-pathogenic microorganisms in the facility (which can modify immune responses), and the density of animals in the environment, which affects human traffic and can generate disturbances.

      When designing experiments with animals, it is desirable for the results to be consistent across different animal batches. If one bacterial strain exhibits higher mortality rates than another across multiple experimental series, this pattern should be reproducible despite the inherent variability in in vivo studies. It is more important to demonstrate consistency in trends than to focus on absolute figures when validating experimental results. 

      It is also important to clarify that when we refer to survival rates, it doesn’t necessarily mean that the animals were found deceased. The animal procedures, which include an animal monitoring protocol, were approved by the Ethics Committee of Animal Experimentation of the Universitat de Barcelona. Our protocol requires close daily monitoring of several health and behavioral parameters, each evaluated according to specific criteria. When an animal reaches a predetermined score threshold indicating severe distress or suffering, euthanasia is administered to alleviate further suffering. At this point, biological samples are collected for subsequent analysis.

      The conjugative experiments use very long conjugation times, making it harder to assess if the resulting transconjugants are the direct result of conjugation or just the growth of transconjugants obtained at earlier points in time. While this could be assessed from the obtained results, it is not a direct or precise measure.

      In the conjugation experiments we utilized a reduced number of donor cells expressing the RSP protein and of recipient cells, as well as long conjugation times, to reflect more accurately a situation that may occur naturally in the environment. Short conjugation times are efficient in controlled laboratory conditions using high densities of donor and recipient cells, but these conditions are not commonly found in the environment. For the interference of the conjugative transfer of the IncHI plasmid we used an E. coli strain displaying the nanobody binding RSP to simulate a process that could also be scaled up in a natural environment (i.e., a probiotic strain in a livestock farm) and that could be cost-effective. See discussion section, lines 326-328.

      While the potential outcomes of these experiments could be applied to any bacterial species carrying this type of plasmids, it is unclear why the authors use Salmonella strains to evaluate it. The introduction does a great job of explaining the importance of these plasmids but falls short in introducing their relevance in Salmonella.

      The prevalence of IncHI plasmids in Salmonella was indicated in the introduction section, lines 65-67. Nevertheless, we understand the reviewer’s criticisms and have modified both these sentences in the introduction section and also added comments in the results section (lines 118-128).

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      I understand working with mice can be challenging in terms of repeating experiments to further support the study's claims. For this reason, I think the authors need to discuss more thoroughly the following things:

      Can the authors comment on why the presence of Ampicillin leads to a lower upregulation of proinflammatory cytokines in the spleen despite harboring resistance against ampicillin?

      At the intestinal level, physiological inflammatory responses play a crucial role in enabling the host to identify foreign and commensal bacterial antigens and initiate a highly regulated and "controlled" immune response (Fiocchi, 2008. Inflamm Bowel Dis. 2008, 14 Suppl 2:S77-8). The administration of antibiotics such as ampicillin reduces the load of intestinal resident microbiota, thereby lowering the extent of intestinal immune activation. This decline in immune activation extends to systemic levels, potentially accounting for the reduced expression of proinflammatory cytokines observed in the spleen.

      There are inconsistent results in the survival rates in Figures 1A and 3A, please discuss how this could alter the observed differences in total and specific IgG and IgA, and pro-inflammatory cytokines.

      To address the reviewer's concerns regarding the discrepancies in survival rates shown in Figures 1A and 3A, and how these differences might influence the observed variations in total and specific IgG and IgA, as well as pro-inflammatory cytokines, it is important to clarify the terminology used in our study. In our context, "survival" does not solely refer to mortality per se, but encompasses the endpoints defined by our animal welfare protocols, which are rigorously supervised by the Animal Experimentation Ethics Committee of the University of Barcelona. Our protocol mandates close daily monitoring of several health and behavioral parameters, each scored according to specific criteria. When an animal reaches a predefined score threshold indicating severe distress or suffering, euthanasia is conducted to prevent further distress, at which point we collect biological samples for analysis.

      In contrast to in vitro cell culture experiments, which often achieve high replicability thanks to the homogeneity of cell lines, in vivo animal studies frequently display greater variability. This variability stems not only from genetic differences within animal populations, even if originating from the same supplier, but also from environmental factors within the animal facility. These factors encompass variations in temperature, the presence of non-pathogenic microorganisms in the facility (capable of altering immune responses) and the density of animals, which can impact human traffic and potentially lead to disturbances. 

      The experiments depicted in Figs. 1A and 3A were separated in time, and hence may be influenced by environmental factors within the animal facility. Nevertheless, in the comparative analysis performed between immunized and non-immunized animals, experiments were performed simultaneously and hence under similar environmental conditions in the animal facility. For several parameters (i.e., immunoglobulins and proinflammatory cytokines) statistically significant differences were observed. 

      Regarding the conjugation assays, it is not entirely clear to me why the conjugation times are so long. It would be beneficial to have more data about the conjugation efficacy between the donor and recipient without any E. coli expressing the nanobodies at different time intervals. This would help to differentiate between transconjugants and transconjugants obtained from early conjugation events.

      This comment is partially answered in a previous response, regarding the numbers of donor and recipient cells and duration of conjugation. We note here that in fig. 9, the requested experiment with donor and recipient cells without E. coli interferent cells is already present, corresponding to the label “none”. To avoid confusion, we have modified the legend in fig. 9.

    1. eLife assessment

      How the triplicate interaction between chemokines with both GAGs and G protein-coupled receptors (GPCR) works and how gradients are created and potentially maintained in vivo are poorly understood. The authors provide solid evidence to show phase separation can drive chemotactic gradient formation. The paper is a useful advance in the field of chemokine biology.

    2. Joint Public Review:

      Chemokines are known to create chemotactic gradients and it is generally recognized that in order to create these gradients they need to bind to glycosaminoglycans (GAGs) on cells and in tissues. However, how the triplicate interaction between chemokines with both GAGs and G protein-coupled receptors (GPCR) works and how gradients are created and potentially maintained in vivo is poorly understood. In their manuscript, Yu et al investigated and showed in detail the ability of soluble and cell-bound GAGs to create gradients of the chemokine CCL5. They show in vitro in a modified leukocyte migration assay that soluble GAGs and GAGs on the tumor cell line THP-1 affect leukocyte migration. This useful work contributes to our in-depth understanding of the role of GAGs in chemokine gradient creation which is important for site-directed leukocyte and potentially tumor cell migration and as such is of potential interest for scientists studying immune responses in infection, inflammation, autoimmunity and tumor biology. In their reply to the comments of both reviewers they indicate that liquid-liquid phase separation (LLPS) was not detected at lower CCL5 concentrations. This is important information since, together with the tendency of CCL5 to form oligomers, it may indicate that oligomerization is crucial for LLPS. This info should at least be added to the discussion of the manuscript.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      Although the study by Xiaolin Yu et al is largely limited to in vitro data, the results of this study convincingly improve our current understanding of leukocyte migration.

      (1) The conclusions of the paper are mostly supported by the data and in the revised manuscript clarification is provided concerning the exact CCL5 forms (without or with a fluorescent label or His-tag) and amounts/concentrations that were used in the individual experiments. This is important since it is known that modification of CCL5 at the N-terminus affects the interactions of CCL5 with the GPCRs CCR1, CCR3 and CCR5 and random labeling using monosuccinimidyl esters (as done by the authors with Cy-3) is targeting lysines. The revised manuscript more clearly indicates for each individual experiment which form is used. However, a discussion on the potential effects of the modifications on CCL5 in the results and discussion sections is still missing.

      Many thanks for the reviewer's suggestion. We fully agree it is important to clarify the potential issue of Cy3 labeling, and believe it is more suitable in the Materials and Methods section (line 312-314).

      (2) In general, authors used high concentrations of CCL5 in their experiments. In their reply to the comments they indicate that at lower CCL5 concentrations no LLPS is detected. This is important information since it may indicate the need for chemokine oligomerization for LLPS. This info should be added to the manuscript and comparison with for instance the obligate monomer CCL7 and another chemokine such as CXCL4 that easily forms oligomers may clarify whether LLPS is controlled by oligomerization.

      We are grateful for the reviewers' help and have accordingly inserted a brief discussion as suggested (line 240-246).

      (3) Statistical analyses have been improved in the revised manuscript.

      Thanks to the reviewer for his/her comment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study uses a novel experimental design to elegantly demonstrate how we exploit stimulus structure to overcome working memory capacity limits. While the behavioural evidence is convincing, the neural evidence is incomplete, as it only provides partial support for the proposed information compression mechanism. This study will be of interest to cognitive neuroscientists studying structure learning and memory.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Huang and Luo investigated whether regularities between stimulus features can be exploited to facilitate the encoding of each set of stimuli in visual working memory, improving performance. They recorded both behavioural and neural (EEG) data from human participants during a sequential delayed response task involving three items with two properties: location and colour. In the key condition ('aligned trajectory'), the distance between locations of successively presented stimuli was identical to their 'distance' in colour space, permitting a compression strategy of encoding only the location and colour of the first stimulus and the relative distance of the second and third stimulus (as opposed to remembering 3 locations and 3 colours, this would only require remembering 1 location, 1 colour, and 2 distances). Participants recalled the location and colour of each item after a delay.
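
      The compression strategy described above can be illustrated with a minimal sketch. It is not the authors' model or analysis code. It assumes, purely for illustration, that locations and colours are both angles in degrees on a circle; the function names (wrap, compress, decompress) and the example values are hypothetical.

```python
def wrap(angle):
    """Wrap an angle in degrees into the range [0, 360)."""
    return angle % 360.0


def same_step(a, b, tol=1e-9):
    """True if two angular steps are equal on the circle, within a tolerance."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff) <= tol


def compress(locations, colours):
    """Encode an aligned sequence as (first location, first colour, shared steps).

    Raises ValueError when the location and colour trajectories differ,
    i.e. in the 'misaligned' case where no shared code exists.
    """
    loc_steps = [wrap(b - a) for a, b in zip(locations, locations[1:])]
    col_steps = [wrap(b - a) for a, b in zip(colours, colours[1:])]
    if not all(same_step(l, c) for l, c in zip(loc_steps, col_steps)):
        raise ValueError("misaligned trajectories: no shared code")
    return locations[0], colours[0], loc_steps


def decompress(first_loc, first_col, steps):
    """Rebuild both item-by-item sequences from the compressed code."""
    locations, colours = [first_loc], [first_col]
    for step in steps:
        locations.append(wrap(locations[-1] + step))
        colours.append(wrap(colours[-1] + step))
    return locations, colours


# Hypothetical aligned trial: three items, identical steps in both feature spaces.
locs = [30, 90, 350]
cols = [200, 260, 160]
code = compress(locs, cols)          # 1 location + 1 colour + 2 steps = 4 values
rebuilt = decompress(*code)          # recovers the 3 locations and 3 colours
```

      In this sketch the aligned trial is held as four values rather than six, and recall requires unpacking the code back into two item-by-item sequences.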

      Consistent with the compression account, participants' location and colour recall errors were correlated and were overall lower compared to a non-compressible condition ('misaligned trajectory'). Multivariate analysis of the neural data permitted decoding of the locations and colours during encoding. Crucially, the relative distance could also be decoded - a necessary ingredient for the compression strategy.

      Strengths:

      The main strength of this study is a novel experimental design that elegantly demonstrates how we exploit stimulus structure to overcome working memory capacity limits. The behavioural results are robust and support the main hypothesis of compressed encoding across a number of analyses. The simple and well-controlled design is suited to neuroimaging studies and paves the way for investigating the neural basis of how environmental structure is detected and represented in memory. Prior studies on this topic have primarily studied behaviour only (e.g., Brady & Tenenbaum, 2013).

      Thanks for the positive comments and excellent summary.

      Weaknesses:

      The main weakness of the study is that the EEG results do not make a clear case for compression or demonstrate its neural basis. If the main aim of this strategy is to improve memory maintenance, it seems that it should be employed during the encoding phase. From then on, the neural representation in memory should be in the compressed format. The only positive evidence for this occurs in the late encoding phase (the re-activation of decoding of the distance between items 1 and 2, Fig. 5A), but the link to behaviour seems fairly weak (p=0.068).

      Thanks for raising this important concern. The reviewer is correct that in principle subjects should employ the compression strategy during the encoding phase when sequence stimuli are presented, yet our results show that the 1-2 trajectory could only be decoded during the late encoding phase.

      Meanwhile, subjects could not get enough information to form the compression strategy for the location and color sequences until the appearance of the 3rd item. Specifically, based on the first two items alone, they learn only whether the 1st-2nd trajectories are congruent between location and color features; they cannot predict whether this will also apply to the incoming 2nd-3rd trajectory. This is exactly what we found in the neural decoding results. The 1st-2nd trajectory could be decoded after the 2nd item presentation, and the 2nd-3rd trajectory appears after the 3rd item onset. Most critically, the 1st-2nd trajectory is reactivated after the 3rd item, but only for the aligned condition, implicating formation of the full-sequence compression strategy wherein the previously formed 1st-2nd trajectory is reactivated to be connected to the 2nd-3rd trajectory.

      Regarding the difference between higher- and lower-correlation groups, previously we used the time window based on the overall 2nd-3rd neural reactivations, which might not be sensitive to reactivation strength. We have now re-selected the time window based on the higher-correlation group (bootstrap test, p = 0.037, two-sided).

      Results have been updated (Figure 5; Results, Page 16). Interpretations about the formation of compression strategy during encoding phase have been added to Results (Page 15-16) and Discussion (Page 18).

      Stronger evidence would be showing decoding of the compressed code during memory maintenance or recall, but this is not presented. On the contrary, during location recall (after the majority of memory maintenance is already over), colour decoding re-emerges, but in the un-compressed item-by-item code (Fig. 4B). The authors suggest that compression is consolidated at this point, but its utility at this late stage is not obvious.

      Thank you for this important question about neural evidence for the compressive account, which we apologize for having omitted previously.

      The reason we did not perform neural decoding during maintenance is that previous EEG/MEG studies including our own failed to reveal robust and sustained time-resolved memory decoding during this period. This is posited to arise from “activity-silent” WM states, wherein memories are not necessarily retained in sustained firing but silently stored within connection weights of WM networks (Stokes, Trends Cogn. Sci., 2015; Rose, Curr Dir Psychol Sci, 2020). Our previous work showed that by transiently perturbing the 'activity-silent' WM using a retrocue or neutral impulse, memories could be reactivated and robustly decoded from neural activities (Huang et al., eLife, 2021). However, due to the lack of transient events during retention in the current design, we do not expect robust decoding results during maintenance. As shown below (AB), this is indeed what we have observed, i.e., no robust neural decoding of trajectories during retention.

      We further used alpha-band (8-11 Hz) neural activities, which have been shown to carry WM information (de Vries et al., Trends Cogn. Sci, 2020; Foster et al., Curr. Biol, 2016; Fukuda et al., J. Neurophysiol, 2016; Sutterer et al., PLOS Biol., 2019) to perform decoding analysis of compression trajectories during maintenance. As shown below, the alpha-band decoding results are indeed stronger than raw activities. Importantly, as shown below (CD), the aligned condition indeed showed significant and long-lasting decoding of compression trajectories (1st-2nd, 2nd-3rd) during retention, while the misaligned condition only showed decoding at the beginning (GH), which might be due to the non-specific offset response of the 3rd item. The results, although not as clear as those during encoding and recalling periods, support the reviewer’s hypothesis that the compressive strategy, if exploited, would be demonstrated during both encoding and maintenance periods. New results and related discussion have been added (Page 16, Supplementary Figure 4).

      With regards to the observed item-by-item color replay during location recall, the reviewer was concerned that this was not consistent with the compressive account, given the lack of trajectory decoding.

      First, item sequences stored in compressive formats need to be converted to sequences during serial recall. In other words, even though color and location sequences are retained in a compressive format (i.e., common 1st-2nd, 2nd-3rd trajectories) throughout the encoding and retention phases, they should be transferred to two sequences as outputs. This is exactly why we performed decoding analysis on individual color and location items rather than trajectories.

      Second and most importantly, we observed serial replay of color sequences when recalling locations. In our view, these results constitute strong evidence for common structure, since the spontaneous color replay during location recall for the aligned condition highlights the close binding between color and location sequences stored in WM. In fact, item-by-item serial replay has been well acknowledged as a critical neural index of cognitive maps, not only for spatial navigation but also for higher-order tasks (e.g., Liu et al., Cell, 2019; Liu et al., Science, 2021). Therefore, spontaneous color sequence replay during location sequence recall supports their shared underlying cognitive map.

      Finally, spontaneous serial replay is also correlated with the reactivation of compressive trajectories during encoding (Supplementary Figure 3). This further indicates that serial replay during recalling is associated with memory reorganization formed during encoding.

      Taken together, we posit that memories need to be converted to sequences as outputs, which leads to serial reactivations during recalling. Importantly, the observed spontaneous replay of color sequences for the aligned condition provides strong evidence supporting the associations between color and location sequences in WM.

      We have now added relevant interpretations and discussions (Page 11&13).

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors wanted to test if using a shared relational structure by a sequence of colors in locations can be leveraged to reorganize and compress information.

      Strength:

      They applied machine learning to EEG data to decode the neural mechanism of reinstatement of visual stimuli at recall. They were able to show that when the location of colors is congruent with the semantically expected location (for example, green is closer to blue-green than purple) the related color information is reinstated at the probed location. This reinstatement was not present when the location and color were not semantically congruent (meaning that x displacement in color ring location did not displace colors in the color space to the same extent) and semantic knowledge of color relationship could not be used for reducing the working memory load or to benefit encoding and retrieval in short term memory.

      Weakness:

      The experiment and results did not address any reorganization of information or neural mechanism of working memory (that would be during the gap between encoding and retrieval).

      We apologize for not presenting clear neural evidence for memory reorganization, particularly neural decoding during WM maintenance and retrieval, in the previous version. Below, we explain why the findings provide converging neural evidence for WM reorganization based on a shared cognitive map.

      First, during the encoding phase when location and color sequences are serially presented, our results reveal reactivation of the 1st-2nd trajectories upon the onset of the 3rd item when location and color sequences are aligned with each other. The reactivation of 1st-2nd trajectory right after the emergence of 2nd-3rd trajectory for aligned but not for misaligned sequences strongly supports WM reorganization, since only stimulus sequences that could be compressed based on shared trajectories (aligned condition) show the co-occurrence of 1st-2nd and 2nd-3rd trajectories. Moreover, the relevance of 1st-2nd reactivation to behavioral measurements of color-location reorganization (i.e., behavioral trajectory correlation, Figure 5D) further indicates its link to WM reorganization.

      Second, the reason we originally did not perform neural decoding during maintenance is that previous EEG/MEG studies including our own failed to reveal robust and sustained time-resolved memory decoding during this period. This is posited to arise from “activity-silent” WM states, wherein memories are not necessarily retained in sustained firing but silently stored within connection weights of WM networks (Stokes, Trends Cogn. Sci., 2015; Wolff et al., Nat. Neurosci, 2017; Rose et al., Curr Dir Psychol Sci, 2020). Our previous work showed that by transiently perturbing the 'activity-silent' WM using a retrocue or neutral impulse, memories could be reactivated and robustly decoded from neural activities (Huang et al., eLife, 2021). However, due to the lack of transient events during retention in the current design, we do not expect robust decoding results during maintenance. As shown in Supplementary Figure 4(AB), this is indeed what we have observed, i.e., no robust neural decoding of trajectories during retention.

      We then used alpha-band (8-11 Hz) neural activities, which have been found to carry WM information (de Vries et al., Trends Cogn. Sci, 2020; Foster et al., Curr. Biol, 2016; Fukuda et al., J. Neurophysiol, 2016; Sutterer et al., PLOS Biol., 2019) to perform decoding analysis of compression trajectories during maintenance. As shown below, the alpha-band decoding results are indeed stronger than raw activities. Importantly, as shown in Supplementary Figure 4(CD), the aligned condition indeed showed significant and long-lasting decoding of compression trajectories (1st-2nd, 2nd-3rd) during retention, while the misaligned condition only showed decoding at the beginning (GH), which might be due to the non-specific offset response of the 3rd item. The results, although not as clear as those during encoding and recalling periods, thus also support WM reorganization.

      Finally, during the recalling period, we observed automatic serial replay of color sequences when recalling locations. In our view, these results constitute strong evidence for common structure, since the spontaneous color replay during location recall for the aligned condition highlights the close binding between color and location sequences stored in WM. In fact, item-by-item serial replay has been well acknowledged as a critical neural index of cognitive maps, not only for spatial navigation but also for higher-order tasks (e.g., Liu et al., Cell, 2019; Liu et al., Science, 2021). Therefore, spontaneous replay of the color sequence during location recall supports their shared underlying cognitive map. Moreover, the spontaneous serial replay is correlated with the reactivation of compressive trajectories during encoding (Supplementary Figure 3). This further indicates that serial replay during recalling is associated with memory reorganization formed during encoding.

      Taken together, we have added updated results about the maintenance period (Page 16, Supplementary Figure 4) and included clarifications and interpretations about why the findings during the encoding and retrieval periods support the WM reorganization view (Page 15-16).

      There was also a lack of evidence to rule out that the current observation can be addressed by schematic abstraction instead of the utilization of a cognitive map.

      The likely impact of the initial submission of the study would be in the utility of the methods that would be helpful for studying a sequence of stimuli at recall. The paper was discussed in a narrow and focused context, referring to limited studies on cognitive maps and replay. The bigger picture and long history of studying encoding and retrieval of schema-congruent and schema-incongruent events is not discussed.

      We agree with the reviewer that the cognitive map referred to here could be understood as schematic abstraction. A cognitive map refers to the internal representation of spatial relations in a specific environment (Tolman 1948). Schematic abstraction denotes a broader range of circumstances, whereby the gist or structure of multiple environments or episodes can be integrated (Bartlett, 1932; Farzanfar et al., Nat. Rev. Neurosci, 2023).

      In other words, a schema refers to a highly abstract framework of prior knowledge that captures common patterns across related experiences, which does not necessarily occur in a spatial framework as cognitive maps do. Meanwhile, in the current design, we specifically manipulate the consistency of spatial trajectory distance between color and location sequences. Therefore, we would argue that cognitive map is a more conservative and appropriate term to frame our findings.

      Relevant discussions have been added (Page 3&19).

      We apologize for the lack of a more general discussion and have added schema-related literature. Thanks for the suggestion.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Do time-frequency-domain data (e.g., alpha-band power) in the delay provide evidence for delay-period decoding of trajectory lengths? This might strengthen the case for compression.

      Thanks for the suggestion. We now performed decoding analysis of the delay period based on alpha-band power. As shown in supplementary figure 4, both the 1st-2nd and 2nd-3rd trajectories could be decoded for the aligned condition.

      Added in supplementary figure 4 and Page 16.  

      (2) Do participants erroneously apply the compression strategy in the misaligned condition? This would not show up in the trajectory error correlation analysis, but might be visible when examining correlations between raw trajectory lengths.

      Thanks for raising this interesting suggestion. To test the hypothesis, we chose a typical misaligned condition where the 1st-2nd trajectory distances are the same between location and color sequences, while the 2nd-3rd trajectory distances differ between the two features.

      In this case, participants might exploit the compression strategy for the first two items and erroneously apply the strategy to the 3rd item. If so, we would expect better memory performance for the first two items but worse memory for the 3rd item, compared to the rest of the misaligned trials. As shown below, the 1st-2nd aligned trials showed marginally significantly higher performance than misaligned trials for the first two items (t(32) = 1.907, p = 0.066, Cohen’s d = 0.332). Unfortunately, we did not find significantly worse performance for the 3rd item between the two conditions (t(32) = -0.4847, p = 0.631, Cohen’s d = -0.084). We observed a significant interaction between the last two items and the alignment effect (t(32) = 2.082, p = 0.045, Cohen’s d = 0.362), indicating a trend toward applying the wrong compression strategy to the 3rd item.

      Author response image 1.

      (3a) Some more detail on some of the methods might help readers. For instance, did trajectories always move in a clockwise direction? Could the direction reverse on the third item? If not, did this induce a response bias? Could such a bias possibly account for the trajectory error correlations?

      Sorry for the unclear statement. For each individual trial, both the color and location features of the three items are randomly selected from nine possible values without any constraint on direction. That is to say, the trajectories can move in a clockwise or anticlockwise direction, and the direction can also reverse on the third item in some trials. Thus, we think the current design actually helps to reduce the influence of response bias. Taking a step back, if the trajectory error correlations were due to response bias, we would expect consistently significant correlations in all conditions, rather than significant correlations only for the 1st-2nd and 2nd-3rd trajectories but not for the 1st-3rd trajectory, and only in the aligned but not in the misaligned condition. Therefore, we think the trajectory error correlations cannot be simply explained by response bias.

      Details have been added (Page 23).

      (3b) Is the colour wheel always oriented the same way for a participant? If so, given there are only nine colors, it seems possible that colors are mapped to locations and remembered in a location code instead. This does not seem to be a problem in principle for the behavioural findings, but might change the interpretation of what is being decoded from the EEG. If this is a possibility then this might be acknowledged.

      The color wheel is always oriented the same way for each participant. We agree with the reviewer that it is possible that participants tend to map colors to locations and remember them in a location code. We don’t have sufficient evidence to rule out this possibility. One possible way to test it would be to run another experiment with a varied color wheel during the response period. Meanwhile, we would like to point out that the underlying logic of the current design is based on the facts that thinking spatially is intuitive and that spatial metaphors like “location” and “distance” are commonly used to describe the world, e.g., the well-known mental number line (Dehaene et al., JEP: General, 1993). Therefore, we expected participants to associate or integrate location and color maps based on trajectory distance.

      The reviewer is correct that the color decoding would reflect spatial location rather than the genuine color feature. This is actually the point of the experimental design, whereby two irrelevant features could possibly be combined within a common cognitive map. Without the realignment of the two feature maps defined in space, subjects could not form the strategy to compress the two sequences at all. In other words, decoding of color sequences could be understood as the neural representation of a series of corresponding locations along the ring that are independent of the physical locations of the items.

      Interpretations and clarifications have been added (Page 23&26).

      (4) Does the discretisation of the stimulus distribution (to only 9 possible locations) make the compression strategy easier to use? If the features had been continuously distributed across the location/colour circle, would participants still pick up on and use the shared trajectory structure?

      Thanks for the question. Without further data, it’s hard to say whether the discretization of the stimulus distribution would make the compression strategy easier to use or not, compared to continuous distribution. Both outcomes seem possible. On the one hand, discrete stimulus distribution would result in discrete trajectory distribution, which helps participants to realize the common trajectory strategy. On the other hand, discrete stimulus distribution would result in category or label representation, which may weaken the effectiveness of structure compression strategy. We postulate that our findings could be generalized to continuous trajectories in a cognitive map within certain resolution.

      (5a) Minor point: I disagree that avoiding the same points for location and colour for a given item allows them to be independently decoded. I would argue the contrary - this kind of constraint should create a small anti-correlation that in principle could lead to spurious decoding of one variable (although this seems unlikely here).

      We appreciate the concern. As mentioned above, with a discrete stimulus distribution (9 possible values for both color and location domains), it is quite possible that a fraction of trials would share the same values in location and color. Therefore, the neural decoding for one domain might be confounded by the other domain. To dissociate their neural representations, we imposed the constraint that color and location could not occupy the same value for a given item.

      We agree that this kind of constraint might create a small anti-correlation, even though it is not observed here. Future studies using continuous stimulus distribution would reduce the correlation or anti-correlation between stimuli.

      (5b) Very minor point: 1,000 permutations for significance testing seems on the low side. Since some of the p-values are close to 0.05 it may be worth running more permutations.

      Thanks for this suggestion. We got similar results using 1000 or 10000 permutations.
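
      The reviewer's point about permutation counts can be illustrated with a small sketch. The code below is not the authors' analysis pipeline; it assumes a generic two-sided sign-flip permutation test on per-participant effect values, and the function names (`sign_flip_p_value`, `mc_standard_error`) are hypothetical. It shows how the permutation count bounds the smallest attainable p-value and the Monte Carlo uncertainty of a p-value near 0.05.

```python
import math
import random


def sign_flip_p_value(values, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test of mean(values) against zero.

    Illustrative assumption: each value is one participant's effect estimate,
    and under the null hypothesis its sign is exchangeable.
    """
    rng = random.Random(seed)
    n = len(values)
    observed = abs(sum(values) / n)
    exceed = 0
    for _ in range(n_perm):
        # Randomly flip the sign of each participant's value.
        flipped = sum(v if rng.random() < 0.5 else -v for v in values) / n
        if abs(flipped) >= observed:
            exceed += 1
    # Add-one correction: the smallest attainable p-value is 1 / (n_perm + 1).
    return (exceed + 1) / (n_perm + 1)


def mc_standard_error(p, n_perm):
    """Approximate Monte Carlo standard error of an estimated p-value."""
    return math.sqrt(p * (1 - p) / n_perm)
```

      Under these assumptions, a p-value near 0.05 estimated from 1,000 permutations has a Monte Carlo standard error of roughly 0.007, versus roughly 0.002 with 10,000 permutations, which is why borderline results are worth re-checking with the larger count.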

      (6) Missing reference: H. H. Li et al., 2021 (line 213) seems not to be on the list of references.

      Sorry for the mistake. Added.

      Reviewer #2 (Recommendations For The Authors):

      The study aimed to discuss the working memory mechanism, instead, it seems to be focused on the encoding and recall strategies after a short while, I recommend updating the manuscript to refer to the relevant cognitive mechanism.

      There was a strong voice on the effect of using the cognitive map in working memory, without any tests on if indeed a cognitive map was used (for example the novel link between stimuli and how a cognitive map can be used to infer shortcuts). Was the participant required to have any mental map beyond the schema of the shown color ring?

      In the current experiment, to discuss if the effect is driven by utilizing a cognitive map or schematic abstraction of color-relatedness, further analysis is required to possibly assess the effects of schema on neural activity and behavior. Namely,

      (1) Was there any reinstatement of schematically congruent (expected) colors that were probed by location 1, at locations 2 and 3 in the MAT condition?

      Thanks for pointing out this possibility. However, we don’t think there will be stable color expectations given location information under the MAT condition. First, as the trajectory distance varied on a trial-by-trial basis, no prior common trajectory knowledge could be used to make inferences about the current stimuli in an individual trial. Second, the starting points for color and location (1st item) were randomly and independently selected, such that the color sequence could not be predicted from the location sequence for either aligned or misaligned conditions.

      (2) Given that response time can be a behavioral marker of schematic conflict, was the response time faster for congruent than incongruent conditions?

      Thanks for this question. Unfortunately, due to the experimental design, the response time could not be used as a behavioral marker to infer mental conflicts, since participants were not required to respond as fast as possible. Instead, they took their own pace to reproduce sequences without time limit. They could even take a short break before submitting their response to initiate the next trial.

      (3) In case you cannot rule out that utilizing schema is the cognitive mechanism that supports working memory performance (the behavior), please add the classical literature (on the memory of schematically congruent and incongruent events) to the discussion.

      Thanks for this suggestion. We have now added the relevant literature (Page 3&19).

      (4) On page 6, 'common structure in the cognitive map' is the schema, isn't it?

      Correct. Based on our understanding, ‘common structure in the cognitive map’ is a spatial schema.

      (5) In Figure 2 EFG, would you please use a mixed effect model or show evidence that all participants demonstrated a correlation between the location trajectory error and color trajectory error?

      Thanks for the suggestion. We have added the mixed-effects model results, which are consistent with Figure 2EFG (AT: 1st-2nd trajectory, β = 0.071, t = 4.215, p < 0.001; 2nd-3rd trajectory, β = 0.077, t = 3.570, p < 0.001; 1st-3rd trajectory, β = 0.019, t = 1.118, p = 0.264; MAT: 1st-2nd trajectory, β = 0.031, t = 1.572, p = 0.116; 2nd-3rd trajectory, β = 0.002, t = 0.128, p = 0.898; 1st-3rd trajectory, β = -0.017, t = -1.024, p = 0.306).

      In general, doesn't such correlation just show that good participants/trials were good (some did well in the study and some did poorly throughout?)

      We don’t think the trajectory error correlation results merely reveal that some participants did well and some did poorly. If that were the case, we would not observe the significant correlation in Figure 2D, where we first compute the correlation for each participant and then test its significance at the group level. Indeed, the trajectory error correlation between color and location domains characterizes consistent changes between the two domains.

      It is worth noting that the correlation was estimated with signed trajectory errors in the color and location domains, which means that we tested whether the errors in the two domains varied consistently in the same direction, i.e., whether a remembered trajectory longer than the actual trajectory in the location domain would predict a longer remembered trajectory in the color domain.

      Moreover, as shown in Figure 2EFG, by dividing trials into 4 bins according to the location trajectory error for each participant and pooling the data across participants, we observed 4 clusters along x-axis (location trajectory error). This suggests that participants’ memory performance is rather consistent instead of being extremely good or bad. Besides, if trajectory error correlation is due to different overall memory performance between participants, we should observe significant trajectory error correlations both in AT and MAT conditions, instead of only under AT condition and for 1st-2nd and 2nd-3rd trajectories but not for 1st-3rd trajectory.
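
      The logic of the two-level analysis described above (a correlation within each participant, then a test across participants) can be sketched as follows. This is a minimal illustration, not the authors' code: the Fisher z-transform and the one-sample t statistic are assumptions about how the group-level test could be run, and both function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_level_t(per_participant_r):
    """One-sample t statistic (df = n - 1) on Fisher z-transformed correlations.

    Assumes every |r| < 1, since atanh is undefined at +/-1.
    """
    z = [math.atanh(r) for r in per_participant_r]
    n = len(z)
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / (n - 1)
    return mean / math.sqrt(var / n)
```

      Because the Pearson correlation is unchanged when a constant is added to all of one participant's errors, a participant who is uniformly better or worse than others does not by itself produce a within-participant correlation; this is the property the response relies on.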

      In Figure 2 G, is the marginal error just too big to be sensitive? I am not sure what we are learning here, please clarify.

      Sorry for the confusion. To examine this possibility, we excluded errors which are beyond 2.5 * σ, and still observed non-significant 1st-3rd trajectory error correlation between color and location domains (r = 0.119, p = 0.167).
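
      For readers who want to reproduce this kind of check, the sketch below shows one way to exclude extreme trials before recomputing a correlation. It is an assumption-laden illustration rather than the authors' procedure: the response does not state whether the 2.5 σ criterion was applied around the mean or around zero, or in one domain or both, so the code assumes deviation from the mean in either domain. The function names are hypothetical.

```python
import statistics


def within_sigma_mask(values, k=2.5):
    """True for values within k sample standard deviations of the mean."""
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [abs(v - mu) <= k * sd for v in values]


def drop_outlier_trials(loc_err, col_err, k=2.5):
    """Keep only trials whose error is within k sigma in BOTH domains.

    Assumption: a trial is excluded if it is extreme in either the location
    or the color trajectory error.
    """
    keep_loc = within_sigma_mask(loc_err, k)
    keep_col = within_sigma_mask(col_err, k)
    kept = [
        (l, c)
        for l, c, ok_l, ok_c in zip(loc_err, col_err, keep_loc, keep_col)
        if ok_l and ok_c
    ]
    return [l for l, _ in kept], [c for _, c in kept]
```

      The filtered lists can then be passed to any correlation routine to check whether a null result survives the removal of extreme errors.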

      The 1st-3rd trajectory showed nonsignificant behavioral correlation and neural representation, which suggests that the current sequential memory task encourages participants to organize information by relying more on adjacent items and their distances. Thus, we think the 1st-3rd trajectory serves as a control trajectory, which helps us not only to exclude other possible explanations (e.g., systematic response bias), but also to validate the current findings at both the behavioral and neural levels.

      Results and statements (Page 10-11) added now.

      Author response image 2.

      (6) Regarding the first lines on page 11, did you do qualitative research to know if less information was encoded in congruent conditions?

      The current experimental design is inspired by the studies of mental compression of spatial sequences from Dehaene’s lab (Amalric et al., 2017; Roumi et al., 2021), in which they propose that the human brain compresses spatial sequences using an abstract language and formalize the minimal description length of a sequence as the “language-of-thought complexity.” Based on this evidence, we think less information is required to describe the congruent condition compared to the incongruent condition. This idea is supported by better memory performance for the congruent condition. Unfortunately, we were not able to quantify how much less information was encoded in the congruent condition.

    2. eLife assessment

      This valuable study uses a novel experimental design to elegantly demonstrate how we exploit stimulus structure to overcome working memory capacity limits. The presented behavioural and neural evidence are solid and in line with the proposed information compression mechanism. This study will be of interest to cognitive neuroscientists studying structure learning and memory.

    3. Reviewer #1 (Public Review):

      Summary:

      Huang and Luo investigated whether regularities between stimulus features can be exploited to facilitate the encoding of each set of stimuli in visual working memory, improving performance. They recorded both behavioural and neural (EEG) data from human participants during a sequential delayed response task involving three items with two properties: location and colour. In the key condition ('aligned trajectory'), the distance between locations of successively presented stimuli was identical to their 'distance' in colour space, permitting a compression strategy of encoding only the location and colour of the first stimulus and the relative distance of the second and third stimulus (as opposed to remembering 3 locations and 3 colours, this would only require remembering 1 location, 1 colour, and 2 distances). Participants recalled the location and colour of each item after a delay.

      Consistent with the compression account, participants' location and colour recall errors were correlated and overall lower compared to a non-compressible condition ('misaligned trajectory'). Multivariate analysis of the neural data permitted decoding of the locations and colours during encoding. Crucially, the relative distance could also be decoded - a necessary ingredient for the compression strategy.
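
      The compression strategy summarised above can be made concrete with a short sketch. This is an illustration only, not a model from the manuscript: it assumes nine discrete positions per feature ring, indexes them 0-8, and treats a trial as "aligned" when the signed steps between successive items are identical in the location and colour rings. The function names are hypothetical.

```python
N_POSITIONS = 9  # nine possible values per feature ring, as in the task


def signed_step(a, b, n=N_POSITIONS):
    """Shortest signed step from a to b on a ring of n positions (-4..4 for n=9)."""
    return (b - a + n // 2) % n - n // 2


def compress(locations, colours):
    """Return (first location, first colour, shared steps), or None if misaligned."""
    loc_steps = [signed_step(a, b) for a, b in zip(locations, locations[1:])]
    col_steps = [signed_step(a, b) for a, b in zip(colours, colours[1:])]
    if loc_steps != col_steps:
        return None  # no shared trajectory, so nothing to compress
    return (locations[0], colours[0], loc_steps)


def decompress(code, n=N_POSITIONS):
    """Rebuild both three-item sequences from the four stored values."""
    loc, col, steps = code
    locations, colours = [loc], [col]
    for s in steps:
        locations.append((locations[-1] + s) % n)
        colours.append((colours[-1] + s) % n)
    return locations, colours
```

      In this toy scheme an aligned trial is stored as four values (one location, one colour, two steps) instead of six, while a misaligned trial cannot be compressed and must be held item by item.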

      Strengths:

      The main strength of this study is a novel experimental design that elegantly demonstrates how we exploit stimulus structure to overcome working memory capacity limits. The behavioural results are robust and support the main hypothesis of compressed encoding across a number of analyses. The simple and well-controlled design is suited to neuroimaging studies and paves the way for investigating the neural basis of how environmental structure is detected and represented in memory. Prior studies on this topic have primarily studied behaviour only (e.g., Brady & Tenenbaum, 2013).

      Weaknesses:

      The main weakness of the study is that the EEG results could make a clearer case for compression. There is some evidence that distance decoding is present in alpha-band activity in the maintenance delay, but the strongest evidence for this occurs only briefly in the late encoding phase (the re-activation of decoding of the distance between items 1 and 2, Fig. 5A). The link to behaviour (Fig. 5D) seems fairly weak and based on a potentially circular analysis. During location recall, colour decoding re-emerges and is reactivated in sequence, but this finding is consistent both with compression-based and conventional rehearsal mechanisms. Nevertheless, the balance of evidence appears to favour the compression account.

      Impact:

      This important study elegantly demonstrates that the use of shared structure can improve capacity-limited visual working memory. The paradigm and approach explicitly link this field to recent findings on the role of replay in structure learning and will therefore be of interest to neuroscientists studying both topics.

    1. eLife assessment

      This study focuses on the regulation of GLP-1 in enteroendocrine L cells and how this may be stimulated by the mechanogated ion channel Piezo1. The work is innovative and the hypothesis that is being tested may have important mechanistic and translational implications. The data remains incomplete at present and needs a substantial amount of supporting evidence and corrections to be a stronger manuscript and publication.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors intended to prove that gut GLP-1 expression and secretion can be regulated by Piezo1, and hence by mechanistic/stretching regulation. For this purpose, they have assessed Piezo1 expression in STC-1 cell line (a mouse GLP-1 producing cell line) and mouse gut, showing the correlation between Piezo1 level and Gcg levels (Figure S1). They then aimed to generate gut L cell-specific Piezo1 KO mice, and claimed the mice show impaired glucose tolerance and GLP-1 production, which can be mitigated by Ex-4 treatment (Figures 1-2). Pharmacological agents (Yoda1 and GsMTx4) and mechanic activation (intestinal bead implantation) were then utilized to prove the existence of ileal Piezo1-regulated GLP-1 synthesis (Figure 3). This was followed by testing such mechanism in a limited amount of primary L cells and mainly in the STC-1 cell line (Figures 4-7).

      While the novelty of the study is appreciable to some extent, its biomedical significance is not well demonstrated in the manuscript. The authors state (lines 78-83) a number of potential side effects of GLP-1 analogs; how, then, can the mechanistic study of GLP-1 production on its own be essential for the development of new drug targets for the treatment of diabetes? Furthermore, the study does not provide clear mechanistic insight into how the claimed CaMKKbeta/CaMKIV-mTORC1 signaling pathway upregulates both GLP-1 production and secretion. This reviewer also has concerns about the experimental design and data presented in the current manuscript, including the issue of how proglucagon expression can be assessed by Western blotting.

      Strengths:

      The novelty of the concept.

      Weaknesses:

      Experimental design and key experiment information.

    3. Reviewer #2 (Public Review):

      Summary:

      The study by Huang and colleagues focuses on GLP-1 producing entero-endocrine (EEC) L-cells and their regulation of GLP-1 production by a mechano-gated ion channel Piezo1. The study describes Piezo1 expression by L-cells and uses an exciting intersectional mouse model (villin to target epithelium and Gcg to target GLP-1-producing cells and others like glucagon-producing pancreatic endocrine cells), which allows L-cell specific Piezo1 knockout. Using this model, they find an impairment of glucose tolerance, increased body weight, reduced GLP-1 content, and changes to the CaMKKbeta-CaMKIV-mTORC1 signaling pathway using a normal diet and then high-fat diet. Piezo1 chemical agonist and intestinal bead implantation reversed these changes and improved the disrupted phenotype. Using primary sorted L-cells and cell model STC-1, they found that stretch and Piezo1 activation increased GLP-1 and altered the molecular changes described above.

      Strengths:

      This is an interesting study testing a novel hypothesis that may have important mechanistic and translational implications. The authors generated an important intersectional genetics mouse model that allowed them to target Piezo1 L-cells specifically, and the surprising result of impaired metabolism is intriguing.

      Weaknesses:

      However, there are several critical limitations that require resolution before making the conclusions that the authors make.

      (1) A potential explanation for the data, and one that is consistent with existing literature [see for example, PMC5334365, PMC4593481], is that epithelial Piezo1, which is broadly expressed by the GI epithelium, impacts epithelial cell density and survival, and as such, if Piezo1 is involved in L-cell physiology, it may be through regulation of cell density. Thus, it is critical to determine L-cell densities and epithelial integrity in controls and Piezo1 knockouts systematically across the length of the gut, since the authors do not make it clear which gut region contributes to the phenotype they see. Current immunohistochemistry data are not convincing.

      (2) Calcium signaling in L-cells is implicated in their typical role of being gut chemo-sensors, and Piezo1 is a calcium channel, so it is not clear whether any calcium-related signaling mechanism would phenocopy these results.

      (3) Intestinal bead implantation, while intriguing, does not have clear mechanisms - and is likely to provide a point of intestinal obstruction and dysmotility.

      (4) Previous studies, some that are very important, but not cited, contradict the presented results (e.g., epithelial Piezo1 role in insulin secretion) and require reconciliation.

      Overall, this study makes an interesting observation but the data are not currently strong enough to support the conclusions.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors intended to prove that gut GLP-1 expression and secretion can be regulated by Piezo1, and hence by mechanistic/stretching regulation. For this purpose, they have assessed Piezo1 expression in STC-1 cell line (a mouse GLP-1 producing cell line) and mouse gut, showing the correlation between Piezo1 level and Gcg levels (Figure S1). They then aimed to generate gut L cell-specific Piezo1 KO mice, and claimed the mice show impaired glucose tolerance and GLP-1 production, which can be mitigated by Ex-4 treatment (Figures 1-2). Pharmacological agents (Yoda1 and GsMTx4) and mechanic activation (intestinal bead implantation) were then utilized to prove the existence of ileal Piezo1-regulated GLP-1 synthesis (Figure 3). This was followed by testing such mechanism in a limited amount of primary L cells and mainly in the STC-1 cell line (Figures 4-7).

      While the novelty of the study is somewhat appreciable, the bio-medical significance is not well demonstrated in the manuscript. The authors stated (lines 78-83) a number of potential side effects of GLP-1 analogs; how can the mechanistic study of GLP-1 production on its own be essential for the development of new drug targets for the treatment of diabetes? Furthermore, the study does not provide clear mechanistic insight into how the claimed CaMKKbeta/CaMKIV-mTORC1 signaling pathway upregulates both GLP-1 production and secretion. This reviewer also has concerns about the experimental design and data presented in the current manuscript, including the issue of how proglucagon expression can be assessed by Western blotting.

      Strengths:

      The novelty of the concept.

      Weaknesses:

      Experimental design and key experiment information.

      Current GLP-1-based therapies for diabetes use GLP-1 agonists/analogs. Although generally safe, GLP-1 agonists/analogs carry some side effects and risks. We agree with the reviewer that a mechanistic study on the regulation of GLP-1 production will not directly lead to the development of new drug targets for the treatment of diabetes. However, understanding the mechanism of GLP-1 production may shed light on alternative treatment strategies for diabetes that target the production of GLP-1. In our previous studies, we elucidated the role of the mTOR/S6K pathway in regulating GLP-1 production in L cells. Using the STC-1 cell line and different mouse models, including Neurog3-Tsc1−/− mice and rapamycin or L-leucine treatment to modulate mTOR activity, we demonstrated that mTOR stimulates proglucagon gene expression and thus GLP-1 production (Diabetologia 2015;58(8):1887-97; Mol Cell Endocrinol. 2015 Nov 15:416:9-18.). Building on these studies, we found in the present study that Piezo1 regulates the mTOR/S6K pathway, and thus proglucagon expression and GLP-1 production, through Ca2+/CaMKKbeta/CaMKIV. Although we cannot exclude the involvement of other signaling pathways downstream of Piezo1 in regulating the cleavage of proglucagon, granule maturation and the final release of GLP-1, our present study provides evidence that the Ca2+/CaMKKbeta/CaMKIV/mTOR pathway mediates the role of Piezo1 in proglucagon expression and GLP-1 production. The reviewer also expressed concerns about the use of western blot to detect proglucagon expression. In fact, western blot is often used to detect proglucagon. Here are some examples from other researchers: Diabetes. 2013 Mar;62(3):789-800; Gastroenterology. 2011 May;140(5):1564-74; 2004 Jul 23;279(30):31068-75. The proglucagon antibody used in our study was purchased from abcam (Cat#ab23468) and detects proglucagon at 21 kDa.

      Reviewer #2 (Public Review):

      Summary:

      The study by Huang and colleagues focuses on GLP-1 producing entero-endocrine (EEC) L-cells and their regulation of GLP-1 production by a mechano-gated ion channel Piezo1. The study describes Piezo1 expression by L-cells and uses an exciting intersectional mouse model (villin to target epithelium and Gcg to target GLP-1-producing cells and others like glucagon-producing pancreatic endocrine cells), which allows L-cell specific Piezo1 knockout. Using this model, they find an impairment of glucose tolerance, increased body weight, reduced GLP-1 content, and changes to the CaMKKbeta-CaMKIV-mTORC1 signaling pathway using a normal diet and then high-fat diet. Piezo1 chemical agonist and intestinal bead implantation reversed these changes and improved the disrupted phenotype. Using primary sorted L-cells and cell model STC-1, they found that stretch and Piezo1 activation increased GLP-1 and altered the molecular changes described above.

      Strengths:

      This is an interesting study testing a novel hypothesis that may have important mechanistic and translational implications. The authors generated an important intersectional genetics mouse model that allowed them to target Piezo1 L-cells specifically, and the surprising result of impaired metabolism is intriguing.

      Weaknesses:

      However, there are several critical limitations that require resolution before making the conclusions that the authors make.

      (1) A potential explanation for the data, and one that is consistent with existing literature [see for example, PMC5334365, PMC4593481], is that epithelial Piezo1, which is broadly expressed by the GI epithelium, impacts epithelial cell density and survival, and as such, if Piezo1 is involved in L-cell physiology, it may be through regulation of cell density. Thus, it is critical to determine L-cell densities and epithelial integrity in controls and Piezo1 knockouts systematically across the length of the gut, since the authors do not make it clear which gut region contributes to the phenotype they see. Current immunohistochemistry data are not convincing.

      We appreciate the reviewer’s comment. We agree that Piezo1 may affect L-cell density and epithelial integrity. We will quantify L-cell density and test epithelial integrity by examining the expression of tight junction proteins (ZO-1 and Occludin) and measuring transepithelial resistance in different regions of the gut.

      (2) Calcium signaling in L-cells is implicated in their typical role of being gut chemo-sensors, and Piezo1 is a calcium channel, so it is not clear whether any calcium-related signaling mechanism would phenocopy these results.

      We will examine whether other calcium-related signaling mechanisms also contribute to the phenotype seen in the IntL-Piezo1-/- mice.

      (3) Intestinal bead implantation, while intriguing, does not have clear mechanisms - and is likely to provide a point of intestinal obstruction and dysmotility.

      To ascertain if intestinal bead implantation led to intestinal obstruction and dysmotility, we conducted a bowel transit time test. The results revealed no difference in bowel transit time between the sham-operated mice and those implanted with beads.

      (4) Previous studies, some that are very important, but not cited, contradict the presented results (e.g., epithelial Piezo1 role in insulin secretion) and require reconciliation.

      Overall, this study makes an interesting observation but the data are not currently strong enough to support the conclusions.

      We will cite more previous studies on GLP-1 production and discuss the discrepancies between our study and those of others. The lack of changes in blood glucose in Villin-Piezo1-/- mice reported by Sugisawa et al. is not surprising (Cell. 2020 Aug 6;182(3):609-624.e21.). In fact, in another recent study from our group, we found similar results when the Villin-Piezo1-/- mice and Piezo1fl/fl control mice were fed a normal chow diet. Since Villin-1 is expressed in all epithelial cells of the gut, including enterocytes and various types of endocrine cells, the effect of L-cell Piezo1 loss may be masked by other cell types under normal conditions. However, impaired glucose tolerance was seen in Villin-Piezo1-/- mice compared to the Piezo1fl/fl control mice after 8 weeks of high-fat diet. We further found that Piezo1 in enterocytes exerted a negative effect on glucose and lipid absorption. Loss of Piezo1 in enterocytes led to over-absorption of nutrients under high-fat diet (Tian Tao, Qing Shu, Yawen Zhao, Wenying Guo, Jinting Wang, Yuhao Shi, Shiqi Jia, Hening Zhai, Hui Chen, Cunchuan Wang*, Geyang Xu*, Mechanical regulation of lipid and sugar absorption by Piezo1 in enterocytes, Acta Pharmaceutica Sinica B, Accepted, 2024, https://doi.org/10.1016/j.apsb.2024.04.016).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Your editorial guidance, reviews, and suggestions have led us to make substantial changes to our manuscript. While we detail point-by-point responses in typical fashion below, I wanted to outline, at a high level, what we’ve done.

      (1) Methods. Your suggestions led us to rethink our presentation of our methods, which are now described more cohesively in a new methods section in the main text.

      (2) Model Validation & Robustness. Reviewers suggested various validations and checks to ensure that our findings were not, for instance, the consequence of a particular choice of parameter. These can be found in the supplementary materials.

      (3) Data Cleaning & Inclusion/Exclusion. Finally, based on feedback, our new methods section fully describes the process by which we cleaned our original data, and on what grounds we included/excluded individual faculty records from analysis.

      eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and while the evidence in support of the findings is solid, the description of the methods used is in need of improvement.

      Reviewer #1 (Public Review):

      Summary and strengths

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings but this study uses a larger than usual dataset that spans many fields and institutions, so it is a good contribution to the field.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Weaknesses

      The paper uses a mixture of mathematical models (basically Leslie matrices, though that term isn't mentioned here) parameterised using statistical models fitted to data. However, the description of the methods needs to be improved significantly. The authors should consider citing Matrix Population Models by Caswell (Second Edition; 2006; OUP) as a general introduction to these methods, and consider citing some or all of the following as examples of similar studies performed with these models:

      Shaw and Stanton. 2012. Proc Roy Soc B 279:3736-3741

      Brower and James. 2020. PLOS One 15:e0226392

      James and Brower. 2022. Royal Society Open Science 9:220785

      Lawrence and Chen. 2015. [http://128.97.186.17/index.php/pwp/article/view/PWP-CCPR-2015-008]

      Danell and Hjerm. 2013. Scientometrics 94:999-1006

      We have expanded the description of methods in a new methods section of the paper which we hope will address the reviewer’s concerns.

      We agree that our model of faculty hiring and attrition resembles Leslie matrices. In results section B, we now mention Leslie matrices and cite Matrix Population Models by Caswell, noting a few key differences between Leslie matrices and the model of hiring and attrition presented in this work. Most notably, in the hiring and attrition model presented, the number of new hires is not based on per-capita fertility constants. Instead, population sizes are predetermined fixed values for each year, precluding exponential population growth or decay towards 0 that is commonly observed in the asymptotic behavior of linear Leslie Matrix models.
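
      The contrast with a Leslie matrix can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy headcounts, the flat 5% attrition rate, and the simplification that every hire enters at career age 0 are all hypothetical. The point it shows is that the number of hires is whatever fills the gap to a predetermined population size, rather than a per-capita fertility term.

```python
def step_year(women, men, attr_w, attr_m, target_size, frac_women_hires):
    """Advance expected headcounts by one year.

    women, men: expected headcounts indexed by career age.
    attr_w, attr_m: attrition probabilities indexed by career age.
    target_size: predetermined total population for the next year.
    """
    # Survivors of this year's attrition move up one career-age class;
    # the oldest class leaves the model.
    aged_w = [n * (1.0 - a) for n, a in zip(women, attr_w)][:-1]
    aged_m = [n * (1.0 - a) for n, a in zip(men, attr_m)][:-1]
    # Hires fill the gap to the fixed population size (no fertility constants).
    n_hires = max(target_size - sum(aged_w) - sum(aged_m), 0.0)
    # Simplifying assumption: every hire enters at career age 0.
    return ([frac_women_hires * n_hires] + aged_w,
            [(1.0 - frac_women_hires) * n_hires] + aged_m)


def share_women(women, men):
    """Fraction of the total expected headcount who are women."""
    return sum(women) / (sum(women) + sum(men))


# Toy field: 40 career-age classes, 30% women, identical attrition for both groups.
ages = 40
women, men = [3.0] * ages, [7.0] * ages
attrition = [0.05] * ages
for _ in range(10):
    women, men = step_year(women, men, attrition, attrition, 400.0, 0.5)
```

      Because the total is pinned to `target_size` each year, the population can neither grow exponentially nor decay towards 0, whatever the attrition rates are.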

      We have additionally revised the main text to cite the listed examples of similar studies (we had already cited James and Brower, 2022). We thank the reviewer for bringing these relevant works to our attention.

      The analysis also runs the risk of conflating the fraction of women in a field with gender diversity! In female-dominated fields (e.g. Nursing, Education) increasing the proportion of women in the field will lead to reduced gender diversity. This does not seem to be accounted for in the analysis. It would also be helpful to state the number of men and women in each of the 111 fields in the study.

      We have carefully examined the manuscript and revised the text to correctly differentiate between gender diversity and women’s representation.

      We have additionally added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Reviewer #2 (Public Review):

      Summary:

      This important study by LaBerge and co-authors seeks to understand the causal drivers of faculty gender demographics by quantifying the relative importance of faculty hiring and attrition across fields. They leverage historical data to describe past trends and develop models that project future scenarios that test the efficacy of targeted interventions. Overall, I found this study to be a compelling and important analysis of gendered hiring and attrition in US institutions, and one that has wide-reaching policy implications for the academy. The authors have also suggested a number of fruitful future avenues for research that will allow for additional clarity in understanding the gendered, racial, and socioeconomic disparities present in US hiring and attrition, and potential strategies for mitigating or eliminating these disparities.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Strengths:

      In this study, LaBerge et al use data from over 268,000 tenured and tenure-track faculty from over 100 fields at more than 12,000 PhD-granting institutions in the US. The period they examine covers 2011-2020. Their analysis provides a large-scale overview of demographics across fields, a unique strength that allows the authors to find statistically significant effects for gendered attrition and hiring across broad areas (STEM, non-STEM, and topical domains).

      LaBerge et al. find gendered disparities in attrition - using both empirical data and their counterfactual model - that account for the loss of 1378 women faculty across all fields between 2011 and 2020. It is true that "this number is both a small portion of academia... and a staggering number of individual careers," as this loss of women faculty is comparable to losing more than 70 entire departments. I appreciate the authors' discussion about these losses - they note that each of these is likely unnecessary, as women often report feeling that they were pushed out of academic jobs.

      LaBerge et al. also find - by developing a number of model scenarios testing the impacts of hiring, attrition, or both - that hiring has a greater impact on women's representation in the majority of academic fields in spite of higher attrition rates for women faculty relative to men at every career stage. Unlike many other studies of historical trends in gender diversity, which have often been limited to institution-specific analyses, they provide an analysis that spans over 100 fields and includes nearly all US PhD-granting institutions. They are able to project the impacts of strategies focusing on hiring or retention using models that project the impact of altering attrition risk or hiring success for women. With this approach, they show that even relatively modest annual changes in hiring accumulate over time to help improve the diversity of a given field. They also demonstrate that, across the model scenarios they employ, changes to hiring drive the largest improvement in the long-term gender diversity of a field.

      Future work will hopefully - as the authors point out - include intersectional analyses to determine whether a disproportionate share of lost gender diversity is due to the loss of women of color from the professoriate. I appreciate the authors' discussion of the racial demographics of women in the professoriate, and their note that "the majority of women faculty in the US are white" and thus that the patterns observed in this study are predominantly driven by this demographic. I also highly appreciate their final note that "equal representation is not equivalent to equal or fair treatment," and that diversifying hiring without mitigating the underlying cause of inequity will continue to contribute to higher losses of women faculty.

      Weaknesses

      First, and perhaps most importantly, it would be beneficial to include a distinct methods section. While the authors have woven the methods into the results section, I found that I needed to dig to find the answers to my questions about methods. I would also have appreciated additional information within the main text on the source of the data, specifics about its collection, inclusion and exclusion criteria for the present study, and other information on how the final dataset was produced. This - and additional information as the authors and editor see fit - would be helpful to readers hoping to understand some of the nuance behind the collection, curation, and analysis of this important dataset.

      We have expanded upon the description of methods in a new methods section of the paper.

      We have also added a detailed description of the data cleaning steps taken to produce the dataset used in these analyses, including the inclusion/exclusion criteria applied. This detailed description is at the beginning of the methods section. This addition has substantially enhanced the transparency of our data cleaning methods, so we thank the reviewer for this suggestion.

      I would also encourage the authors to include a note about binary gender classifications in the discussion section. In particular, I encourage them to include an explicit acknowledgement that the trends assessed in the present study are focused solely on two binary genders - and do not include an analysis of nonbinary, genderqueer, or other "third gender" individuals. While this is likely because of the limitations of the dataset utilized, the focus of this study on binary genders means that it does not reflect the true diversity of gender identities represented within the professoriate.

      In a similar vein, additional context on how gender was assigned on the basis of names should be added to the methods section.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      I do think that some care might be warranted regarding the statement that "eliminating gendered attrition leads to only modest changes in field-level diversity" (Page 6). While I do not think that this is untrue, I do think that the model scenario where hiring is "radical" and attrition is unchanged from present (equal representation of women and men among hires (ER) + observed attrition (OA)) shows that a sole focus on hiring dampens the gains that can otherwise be achieved via even modest interventions (see, e.g., gender-neutral attrition (GNA) + increasing representation of women among hires (IR)). I am curious as to why the authors did not include an additional scenario where hiring rates are equal and attrition is equalized (i.e., GNA + ER). The importance of including this additional model is highlighted in the discussion, where, on Page 7, the authors write: "In our forecasting analysis, we find that eliminating the gendered attrition gap, in isolation, would not substantially increase representation of women faculty in academia. Rather, progress towards gender parity depends far more heavily on increasing women's representation among new faculty hires, with the greatest change occurring if hiring is close to gender parity." I believe that this statement would be greatly strengthened if the authors can also include a comparison to a scenario where both hiring and attrition are addressed with "radical" interventions.

      Our rationale for omitting the GNA + ER scenario in the presented analysis is that we can reason about the outcomes of this scenario without the need for computation; if a field has equal inputs of women and men faculty (on average) and equal retention rates between women and men (on average), then, no matter the field’s initial age and gender distribution of faculty, the expected value for the percentage of women faculty after all of the prior faculty have retired (which may take 40+ years) is exactly 50%. We have updated the main text to discuss this point.
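
      The reasoning above can be written out in one line. The notation is introduced here for illustration and is not taken from the manuscript: $h_t$ is the total number of hires in year $t$ (with the scenario starting in year 0), $q_k$ is the attrition probability at career age $k$, shared by women and men, and $W_t(a)$, $M_t(a)$ are the expected numbers of women and men at career age $a$ in year $t$. For simplicity, all hires are assumed to enter at career age 0; the same argument applies separately to each entry age.

```latex
W_t(a) = \tfrac{1}{2}\, h_{t-a} \prod_{k=0}^{a-1} (1 - q_k)
       = M_t(a)
\quad\Longrightarrow\quad
\frac{\sum_a W_t(a)}{\sum_a \left[ W_t(a) + M_t(a) \right]} = \frac{1}{2}
\qquad \text{for } t \ge a_{\max}.
```

      Here $a_{\max}$ is the highest career age in the field, so the condition $t \ge a_{\max}$ says that every remaining faculty member was hired under the scenario. Before that point, the share depends on the field's initial age and gender distribution.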

      Reviewer #3 (Public Review):

      This manuscript investigates the roles of faculty hiring and attrition in influencing gender representation in US academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on US academia, which may limit the generalizability of its findings to other cultural and academic contexts.

      We agree that the U.S. focus of this study limits the generalizability of our findings. The findings that we present in this work will only generalize to other populations–whether it be to an alternate industry, e.g., tech workers, or to faculty in different countries–to the extent that these other populations share similar hiring patterns, retention patterns, and current demographic representation. We have added a discussion of this limitation to the manuscript.

      Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Our projection analysis is intended to illustrate the potential gender representation outcomes of several possible counterfactual scenarios, with each projection being conditioned on transparent and simple assumptions. In this way, the projection analysis is not intended to predict or forecast the future.

      To resolve this point for our readers, we now introduce our projections in the context of the related terms of prediction and forecast, noting that they have distinct meanings as terms of art: On one hand, prediction and forecasting involve anticipating a specific outcome based on available information and analysis, and typically rely on patterns, trends, or historical data to make educated guesses about what will happen. Projections are based on assumptions and are often presented in a panel of possible future scenarios. While predictions and forecasts aim for precision, projections (which we make in our analysis) are more generalized and may involve a range of potential outcomes.

      Additionally, the study treats anyone who disappears from the dataset as attrition from academia, whereas in reality these could be researchers who moved to another country or to an institution that is not included in the AARC (Academic Analytics Research Centre) dataset.

      In our revision, we have elevated this important point, and clarified it in the context of the various ways in which we count hires and attritions. We now explicitly state that “We define faculty hiring and faculty attrition to include all cases in which faculty join or leave a field or domain within our dataset.” Then, we enumerate the number of situations that could be counted as hires and attritions, including the reviewer’s example of faculty who move to another country.

      Reviewer #1 (Recommendations For The Authors):

      Section B: The authors use an age structured Leslie matrix model (see Caswell for a good reference to these) to test the effect of making the attrition rates or hiring rates equal for men and women. My main concern here is the fitting techniques for the parameters. These are described (a little too!) briefly in section S1B. Some specific questions that are left hanging include:

      A 5th order polynomial is an interesting choice. Some statistical evidence as to why it was the best fit would be useful. What other candidate models were compared? What was the "best fit" judgement made with: AIC, r^2? What are the estimates for how good this fit is? How many data points were fitted to? Was it the best fit choice for all of the 111 fields for men and women?

      We use a logistic regression model for each field to infer faculty attrition probabilities across career ages and time, and we include the career age predictor up to its fifth power to capture the career-age correlations observed in Spoon et al., Science Advances, 2023. For ease of reference, we reproduce the attrition risk curves in Fig S4.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after earning the PhD, and then decline until around 15-20 years post-PhD, after which attrition rates increase as faculty approach retirement.

      This function shape starts low and ends high, and includes at least one local minimum, which indicates that career age should be odd-ordered in the model and at least order 3, but including career age only up to its 3rd-order term tended to miss some of the observed career-age/attrition correlations. We evaluated the fit using 5-fold cross validation with a Brier score loss metric, and among polynomials of degree 1, 3, 5, or 7, we found that 5th order performed well on average over all fields (even if it was not the best for every field), without overfitting in fields with fewer data. Example fits, reminiscent of the figure from Spoon et al., are now provided in Figs S4 and S5.

      While the model fit with fifth order terms may not be the best fit for all 111 fields (e.g., 7th order fits better in some cases), we wanted to avoid field-specific curves that might be overfitted to the field-specific data, especially due to low sample size (and thus larger fluctuations) on the high career age side of the function. Our main text and supplement now include justifications for our choice to include career age up to its fifth order terms.
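
      The evaluation procedure described above, held-out Brier score averaged over folds, can be sketched as follows. This is an illustrative harness rather than the authors' pipeline: the function names, the interleaved (unshuffled) folds, the constant-rate baseline, and the toy records are all assumptions made here. In practice each candidate polynomial degree would supply its own `fit` callable, and the degree with the lowest average score would be preferred.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def cross_validated_brier(records, fit, k=5):
    """Average held-out Brier score of `fit` over k interleaved folds.

    records: list of (career_age, left) pairs, with left in {0, 1}.
    fit: callable that takes training records and returns predict(career_age).
    """
    scores = []
    for fold in range(k):
        train = [r for i, r in enumerate(records) if i % k != fold]
        held_out = [r for i, r in enumerate(records) if i % k == fold]
        predict = fit(train)
        scores.append(brier_score([predict(age) for age, _ in held_out],
                                  [left for _, left in held_out]))
    return sum(scores) / k


def fit_constant_rate(train):
    """Hypothetical baseline: a single attrition rate for every career age."""
    rate = sum(left for _, left in train) / len(train)
    return lambda career_age: rate


# Toy records: (career_age, left) with an arbitrary deterministic pattern.
records = [(age, 1 if age % 7 == 0 else 0) for age in range(50)]
baseline_score = cross_validated_brier(records, fit_constant_rate)
```

      Lower scores are better; a Brier score of 0 means every predicted probability matched the outcome exactly.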

      You used the 5th order logistic regression (bottom of page 11) to model attrition at different ages. The data in [24] shows that attrition increases sharply, then drops, then increases again with career age. A fifth order polynomial on its own could plausibly do this, but I associate logistic regression models like this with being monotonically increasing (or decreasing!); again, more details as to how this worked would be useful.

      Our first submission did not explain this point well, but we hope that Supplementary Figures S4 and S5 provide clarity. In short, we agree of course that typical logistic regression assumes a linear relationship between the predictor variables and the log odds of the outcome variable. This means that the relationship between the predictor variables and the probability of the outcome variable follows a sigmoidal (S-shaped) curve. However, the relationship between the predictor variables and the outcome variable may not be linear.

      To capture more complex relationships, like the increasing, decreasing and then increasing attrition rates as a function of career age, higher-order terms can be added to the logistic regression model. These higher-order terms allow the model to capture nonlinear relationships between the predictor variables and the outcome variable — namely the non-monotonic relationship between rates of attrition and career age — while staying within a logistic regression framework.
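
      A small numerical example makes the point concrete. The sketch below is illustrative only: the coefficients are hypothetical values chosen so that the log-odds rise until career age 6, fall until 18, and rise again, and a cubic is used because it is the lowest odd order that produces this shape (the authors' model goes up to fifth order).

```python
import math


def attrition_probability(career_age, coefs):
    """Logistic model whose log-odds are a polynomial in career age.

    coefs[k] multiplies career_age ** k.
    """
    log_odds = sum(c * career_age ** k for k, c in enumerate(coefs))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical cubic log-odds with slope proportional to (age - 6) * (age - 18):
# increasing before 6, decreasing between 6 and 18, increasing after 18.
scale = 0.0004
cubic_coefs = [-3.5, scale * 108, -scale * 12, scale / 3]

# A first-order model for comparison: log-odds linear in career age.
linear_coefs = [-3.5, 0.02]

curve = [attrition_probability(age, cubic_coefs) for age in range(41)]
```

      With only the first-order term the curve is a monotonic sigmoid in career age; with the higher-order terms the same logistic link yields a rise, a dip, and a second rise.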

      "The career age of new hires follows the average career age distribution of hires" did you use the empirical distribution here or did you fit a standard statistical distribution e.g. Gamma?

      We used the empirical distribution. This information has been added to the updated methods section in the main text.
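
      Drawing from an empirical distribution, as opposed to a fitted parametric one such as a Gamma, amounts to resampling the observed values in proportion to their frequency. A minimal sketch, with a hypothetical function name and made-up observed career ages:

```python
import random
from collections import Counter


def sample_hire_career_ages(observed_hire_ages, n_hires, rng):
    """Draw career ages for new hires in proportion to their observed frequency."""
    counts = Counter(observed_hire_ages)
    ages = sorted(counts)
    weights = [counts[age] for age in ages]
    return rng.choices(ages, weights=weights, k=n_hires)


# Made-up observations: most hires are early-career, a few are senior.
observed = [0, 0, 0, 1, 1, 2, 3, 5, 12]
simulated = sample_hire_career_ages(observed, 20, random.Random(0))
```

      No distributional form is assumed, so sampled career ages can only take values that were actually observed among hires.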

      How did you account for institution (presumably available)? Your own work has shown that institution type plays a role, which could be contributing to these results.

      See below.

      What other confounding variables could be at play here, what is available as part of the data and what happens if you do/don't account for them?

      A number of variables included in our data have been shown to correlate with faculty attrition, including PhD prestige, current institution prestige, PhD country, and whether or not an individual is a “self-hire,” i.e., trained and hired at the same institution (Wapman et al., Nature, 2022). Additional factors that faculty self-report as reasons for leaving academia include issues of work-life balance, workplace climate, and professional reasons, in some cases to varying degrees between men and women faculty (Spoon et al., Sci. Adv., 2023).

      Our counterfactual analysis aims to address a specific question: how would women’s representation among faculty be different today if men and women were subjected to the same attrition patterns over the past decade? To answer this question, it is important to account for faculty career age, which we accept as a variable that will always correlate strongly with faculty attrition rates, as long as the tenure filter remains in place and faculty continue to naturally progress towards retirement age. On the other hand, it is less clear why PhD country, self-hire status, or any of the other mentioned variables should necessarily correlate with attrition rates and with gendered differences in attrition rates more specifically. While some or all of these variables may underlie the causal roots of gendered attrition rates, our analysis does not seek to answer causal questions about why faculty leave their jobs (e.g., by testing the impact of accounting for these variables in simulations per the reviewer’s suggestion). This is because we do not believe the data used in this analysis are sufficient to answer such questions, as they lack comprehensive information on faculty stress (Spoon et al., Sci. Adv., 2023), parenthood status, etc.

      What career age range did the model use?

      The career age range observed in model outcomes is a function of the empirically derived attrition rates for faculty across academic fields. The highest career age observed in the AARC data was 80, and the faculty career ages that result from our model simulations and projections do not exceed 80.

      We have also added the distribution of faculty across career ages for the projection scenario model outputs in the supplemental materials Fig. S3 (see response to your later comment regarding career age for further details). Looking at these distributions, it is observed that very few faculty have career age > 60, both in observation and in our simulations.

      What was the initial condition for the model?

      Empirical 2011 faculty rosters are used as the initial conditions for the counterfactual analysis, and 2020 faculty rosters are used as the initial conditions for the projections analysis. This information has been added to the descriptions of methods in the main text.

      Starting the model in 2011 how well does it fit the available data up to 2020?

      Thank you for this suggestion. We ran this analysis for each field starting in 2011, and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields. This finding is not surprising, because the model is fit to the observed data, but it serves to validate the methods that we used to extract the model's parameters. We have added these results to the supplement (Fig. S2).
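      The response does not spell out the statistical test behind "statistically indistinguishable," so the following is only one plausible way to perform such a check, offered as an assumption: compare the observed 2020 value against the central 95% of simulated outcomes. The function names `central_95_interval` and `within_simulations` are hypothetical.

```python
import statistics

def central_95_interval(simulated):
    """Return the 2.5th and 97.5th percentiles of the simulated outcomes."""
    # n=40 yields cut points every 2.5%; the first and last bound the central 95%.
    cuts = statistics.quantiles(simulated, n=40, method="inclusive")
    return cuts[0], cuts[-1]

def within_simulations(observed, simulated):
    """True if the observed value lies inside the central 95% of simulations."""
    low, high = central_95_interval(simulated)
    return low <= observed <= high
```

      Under this assumed criterion, a field would count as well reproduced when `within_simulations(observed_share_2020, simulated_shares_2020)` is true.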

      What are the sensitivity analysis results for the model? If you have made different fitting decisions how much would the results change? All this applied to both the hiring and attrition parameters estimates.

      We model attrition and hiring using logistic regression, with career age included as an exogenous variable up to its fifth power. A natural question follows: what if we used a model with career age only to its first or third power? Or to higher powers? We performed this sensitivity analysis, and added three new figures to the supplement to present these findings:

      First, we show the observed attrition probabilities at each career age, and four model fits to attrition data (Supplementary Figs S4 and S5). The first model includes career age only to its first power, and this model clearly does not capture the full career age / attrition correlation structure. The second model includes career age to its third power, which does a better job of fitting to the observed patterns. The third model includes career age up to its fifth power, which appears to very modestly improve upon the former model. The fourth model includes career age up to its seventh power, and the patterns captured by this model are largely the same as the 5th-power model up to career age 50, beyond which there are some notable differences in the inferred attrition probabilities. These differences would have relatively little impact on model outcomes because the vast majority of faculty have a career age below 50.

      Second, we show the observed probability that hires are women, conditional on the career age of the hire. Once again, we fit four models to the data, and find that career age should be included at least up to its fifth power to capture the correlation structures between career age and the gender of new hires. However, limited differences result from including career age up to its seventh power in the model (relative to the fifth power).

      As a final sensitivity analysis, we reproduce Fig. 2, but rather than including career age as an exogenous variable up to its fifth power in our models for hiring and attrition, we include career age up to its third power. Findings under this parameterization are qualitatively very similar to those presented in Fig. 2, indicating that the results are robust to modest changes to model parameterization (shown in supplement Fig. S6).

      Far more detail in this and some interim results from each stage of the analysis would make the paper far more convincing. It currently has an air of "black box" too much of the analysis which would easily allow an unconvinced reader to discard the results.

      We have added more detailed descriptions of the methods to the main text. We hope that the changes made will address these concerns.

      Section C: You use the Leslie model to predict the future population. As the model is linear the population will either grow exponentially (most likely) or dwindle to zero. You mention you dealt with this by scaling the average value of H to keep the population at 2020 levels? This would change the ratio of hiring to attrition. How did this affect the timescale of the results. If a field had very minimal attrition (and hence grew massively over the time period of the dataset) the hiring rate would have to be very small too so there would be very little change in the gender balance. Did you consider running the model to steady state instead?

      We chose the 40-year window (2020-2060) for this projection analysis because 40 years is roughly the timespan of a full-length faculty career. In other words, it will take around 40 years for most of the pre-existing faculty from 2020 to retire, such that the new, simulated faculty will have almost entirely replaced all former faculty by 2060.

      For three out of five of our projection scenarios (OA, GNA, OA+ER), the point at which observed faculty are replaced by simulated faculty represents steady state. One way to check this intuition is to observe the asymptotic behavior of the trajectories in Fig. 3B; the slopes for these 3 scenarios nearly level out within 40 years.

      The other two scenarios (OA + IR, GNA+IR) represent situations where women’s representation among new hires is increasing each year. These scenarios will not reach steady state until women represent 100% of faculty. Accordingly, the steady state outcomes for these scenarios would yield uninteresting results; instead, we argue that it is the relative timescales that are interesting.
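      To make the replacement argument concrete, here is a deliberately simplified, deterministic sketch of a fixed-size projection. It is not the model used in the paper. It assumes that attrition is gender-neutral, that every departure is replaced by a hire so the total population stays constant, that all hires enter at career age 0 (the paper instead uses the empirical career age distribution of hires), and that a fixed share of hires are women. All function names are hypothetical.

```python
def step(women, men, attrition, share_women_hires):
    """Advance one year. `women` and `men` are lists of expected counts indexed by career age."""
    total_before = sum(women) + sum(men)
    max_age = len(women) - 1
    new_w = [0.0] * (max_age + 1)
    new_m = [0.0] * (max_age + 1)
    for age in range(max_age):  # the oldest cohort leaves entirely
        stay = 1.0 - attrition(age)
        new_w[age + 1] = women[age] * stay
        new_m[age + 1] = men[age] * stay
    # Replace every departure so that the total population is unchanged.
    hires = total_before - (sum(new_w) + sum(new_m))
    new_w[0] = hires * share_women_hires
    new_m[0] = hires * (1.0 - share_women_hires)
    return new_w, new_m

def project(women, men, attrition, share_women_hires, years):
    """Apply `step` repeatedly and return the final age distributions."""
    for _ in range(years):
        women, men = step(women, men, attrition, share_women_hires)
    return women, men

def share_women(women, men):
    """Fraction of the total population that is women."""
    return sum(women) / (sum(women) + sum(men))
```

      In this toy setting, a population that starts at 25% women and hires at parity ends up just below 50% women after 40 years, because almost all of the initial faculty have been replaced by then. This mirrors the intuition for why a 40-year horizon approximates steady state when the composition of hires is held fixed.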

      What did you do to check that your predictions at least felt realistic under the fitted parameters? (see above for presenting the goodness of fit over the 10 years of the data).

      We ran the analysis suggested in a prior comment (Starting the model in 2011 how well does it fit the available data up to 2020?) and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields, plus the “All STEM” and “All non-STEM” aggregations.

      You only present the final proportion of women for each scenario. As mentioned earlier, models of this type have a tendency to lead to strange population distributions with wild age predictions and huge (or zero populations). Presenting more results here would assuage any worries the reader had about these problems. What is the predicted age distribution of men and women in the long term scenarios? Would a different method of keeping the total population in check have yielded different results? Interim results, especially from a model as complex as this one, rather than just presenting a final single number answer are a convincing validation that your model is a good one! Again, presenting this result will go a long way to convincing readers that your results are sound and rigorous.

      Thank you for this suggestion. We now include a figure that presents faculty age distributions for each projection scenario at 2060 against the observed faculty age distribution in 2020 (pictured below, and as Fig. S3 in the supplementary materials). We find that the projected age distributions are very similar to the observed distributions for natural sciences (shown) and for the additional academic domains. We hope this additional validation will inspire confidence in our model of faculty hiring and attrition for the reviewer, and for future readers.

      In Fig S3, line widths for the simulated scenarios span the central 95% of simulations.

      Other people have reached almost identical conclusions (albeit with smaller data sets) that hiring is more important than attrition. It would be good to compare your conclusions with their work in the Discussion.

      We have revised the main text to cite the listed examples of similar studies. We thank the reviewer for bringing these relevant works to our attention.

      General comments:

      What thoughts have you given to non-binary individuals?

      Be careful how you use the term "gender diversity"! In many countries "Gender diverse" is a term used in data collection for non-binary individuals, i.e. Male, female, gender diverse. The phrase "hiring more gender diverse faculty" can be read in different ways! If you are only considering men and women then gender balance may be a better framework to use.

      We have added language to the main text which explicitly acknowledges that our analysis focuses on men and women due to limitations in our name-based gender tool, which only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      We have also taken additional care with referring to “gender diversity,” per reviewer 1’s point in their public review.

      Reviewer #2 (Recommendations For The Authors):

      Data availability: I did not see an indication that the dataset used here is publicly available, either in its raw format or as a summary dataset. Perhaps this is due to the sensitive nature of the data, but regardless of the underlying reason, the authors should include a note on data availability in the paper.

      The dataset used for these analyses was obtained under a data use agreement with the Academic Analytics Research Center (AARC). While these data are not publicly available, researchers may apply for data access here: https://aarcresearch.com/access-our-data.

      We also added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Additionally, a variety of summary statistics based on this dataset are available online, here: https://github.com/LarremoreLab/us-faculty-hiring-networks/tree/main

      Gender classification: Was an existing package used to classify gender from names in the dataset, or did the authors develop custom code to do so? Either way, this code should be cited. I would also be curious to know what the error rate of these classifications are, and suggest that additional information on potential biases that might result from automated classifications be included in the discussion, under the section describing data limitations. The reliability of name-based gender classification is particularly of interest, as external gender classifications such as those applied on the basis of an individual's name - may not reflect the gender with which an individual self-identifies. In other words, while for many people their names may reflect their true genders, for others those names may only reflect their gender assigned at birth and not their self-perceived or lived gender identity. Nonbinary faculty are in particular invisibilized here (and through any analysis that assigns binary gender on the basis of name). While these considerations do not detract from the main focus of the study - which was to utilize an existing dataset classified only on the basis of binary gender to assess trends for women faculty-these limitations should be addressed as they provide additional context for the interpretation of the results and suggest avenues for future research.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      As we mentioned in response to the public review, we use a free and open source python package called nomquamgender to estimate the strengths of name-gender associations, and we apply gender labels to the names with sufficiently strong associations with a binary gender. This package is based on a paper by Van Buskirk et al. 2023, “An open-source cultural consensus approach to name-based gender classification,” which documents error rates and potential biases.

      Page 1: The sentence beginning "A trend towards greater women's representation could be caused..." is missing a conjunction. It should likely read: "A trend towards greater women's representation could be caused entirely by attrition, e.g., if relatively more men than women leave a field, OR entirely by hiring..."

      We have edited the paragraph to remove the sentence in question.

      Pages 1-2: The sentence beginning "Although both types of strategy..." and ending with "may ultimately achieve gender parity" is a bit of a run-on; perhaps it would be best to split this into multiple sentences for ease of reading.

      We have revised this run-on sentence.

      Page 2: See comments in the public review about a methods section, the addition of which may help to improve clarity for the readers. Within the existing descriptions of what I consider to be methods (i.e., the first three paragraphs currently under "results"), some minor corrections could be added here. First, consider citing the source of the dataset in the line where it is first described (in the sentence "For these analyses, we exploit a census-level dataset of employment and education records for tenured and tenure-track faculty in 12,112 PhD-granting departments in the United States from 2011-2020.") It also may be helpful to include context here (or above, in the discussion about institutional analyses) about how "departments" can be interpreted. For example, how many institutions are represented across these departments? More information on how the authors eliminated the gendered aspect of patterns in their counterfactual model would be helpful as well; this is currently hinted at on page 4, but could instead be included in the methods section with a call-out to the relevant supplemental information section (S2B).

      We have added a citation to Academic Analytics Research Center’s (AARC) list of available data elements to the data’s introduction sentence. We hope this will allow readers to familiarize themselves with the data used in our analysis.

      Faculty department membership was determined by AARC based on online faculty rosters. 392 institutions are represented across the 12,112 departments present in our dataset. We have updated the main text to include this information.

      Finally, we have added a methods section to the main text, which includes information on how the gendered aspect of attrition patterns were eliminated in the counterfactual model.

      Page 2: Perhaps some indication of how many transitions from an out-of-sample institution might be helpful to readers hoping to understand "edge cases."

      In our analysis, we consider all transitions from out-of-sample institutions to in-sample institutions as hires, and all transitions away from in-sample institutions, whether to an out-of-sample institution or out of academia entirely, as attritions. We choose to restrict our analysis of hiring and attrition to PhD-granting institutions in the U.S. in this way because our data do not support an analysis of other, out-of-sample institutions.

      I also would have liked additional information on how many faculty switched institutions but remained "in-sample and in the same field" - and the gender breakdowns of these institutional changes, as this might be an interesting future direction for studies of gender parity. (For example, readers may be spurred to ask: if the majority of those who move institutions are women, what are the implications for tenure and promotion for these individuals?)

      While these mid-career moves are not counted as attritions in the present analysis, a study of faculty who switch institutions but remain (in-sample) as faculty could shed light on issues of gendered faculty retention at the level of institutions. We share the reviewer’s interest in a more in-depth study of mid-career moves and how these moves impact faculty careers, and we now discuss the potential value of such a study towards the end of the paper. In fact, this subject is the topic of a current investigation by the authors!

      Page 3: I was confused by the statement that "of the three types of stable points, only the first point represents an equitable steady-state, in which men and women faculty have equal average career lengths and are hired in unchanging proportions." Here, for example, computer science appears to be close to the origin on Figure 1, suggesting that hiring has occurred in "unchanging proportions" over the study interval. However, upon analysis of Table S2, it appears that changes in hiring in Computer Science (+2.26 pp) are relatively large over the study interval compared to other fields. Perhaps I am reading too literally into the phrase that "men and women faculty are hired in unchanging proportions" - but I (and likely others) would benefit from additional clarity here.

      We had created an arrow along with the computer science label in Fig. 1, but it was difficult to see, which is likely the source of this confusion. This was our fault, and we have moved the “Comp. Sci.” label and its corresponding arrow to be more visible in Figure 1.

      The change in women’s representation in Computer Science due to hiring over 2011-2020 was +2.26 pp, as the reviewer points out, but, consulting Fig. 1 and the corresponding table in the supplement, we observe that this is a relatively small amount of change compared to most fields.

      Page 3: If possible it may be helpful to cite a study (or multiple) that shows that "changes in women's representation across academic fields have been mostly positive." What does "positive" mean here, particularly when the changes the authors observe are modest? Perhaps by "positive" you mean "perceived as positive"?

      We used the term positive in the mathematical sense, to mean greater than zero. We have reworded the sentence to read “women's representation across academic fields has been mostly increasing…” We hope this change clarifies our meaning to future readers.

      Page 3: The sentence that ends with "even though men are more likely to be at or near retirement age than women faculty due to historical demographic trends" may benefit from a citation (of either Figure S3 or another source).

      We now cite the corresponding figure in this sentence.

      Page 4: The two sentences that begin with "The empirical probability that a person leaves their academic career" would benefit from an added citation.

      We have added a citation to the sentences.

      Figure 3: Which 10 academic domains are represented in Panel 3B? The colors in appear to correspond to the legend in Panel 3A, but no indication of which fields are represented is provided. If possible, please do so - it would be interesting and informative to be able to make these comparisons.

      This was not clear in the initial version of Fig. 3B, so we now label each domain. For reference, the domains represented in 3B are (from top to bottom):

      ● Health

      ● Education

      ● Journalism, Media, Communication

      ● Humanities

      ● Social Sciences

      ● Public Administration and Policy

      ● Medicine

      ● Business

      ● Natural Sciences

      ● Mathematics and Computing

      ● Engineering

      Page 6: Consider citing relevant figure(s) earlier up in paragraph 2 of the discussion. For example, the first sentence could refer to Figure 1 (rather than waiting until the bottom of the paragraph to cite it).

      Thank you for this suggestion; we now cite Fig. 1 earlier in this discussion paragraph.

      Page 10: A minor comment on the fraction of women faculty in any given year-the authors assume that the proportion of women in a field can be calculated from knowing the number of women in a field and the number of men. This is, again, true if assuming binary genders but not true if additional gender diversity is included. It is likely that the number of nonbinary faculty is quite low, and as such would not cause a large change in the overall proportions calculated here, but additional context within the first paragraph of S1 might be helpful for readers.

      We have added additional context in the first paragraph of S1, explaining that an additional term could be added to the equation to account for nonbinary faculty representation if our data included nonbinary gender annotations. Thank you for making this point.
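      The supplementary equation itself is not reproduced in this response, so the following is only an illustrative restatement with assumed symbols: $W_t$ and $M_t$ denote the numbers of women and men faculty in year $t$, $f_t$ the fraction of women, and $N_t$ a hypothetical count of nonbinary faculty that would require gender annotations not present in the data.

```latex
% Binary case, as assumed in the analysis:
f_t = \frac{W_t}{W_t + M_t}

% With an additional term for nonbinary faculty (requires annotations not in the data):
f_t = \frac{W_t}{W_t + M_t + N_t}
```

      When $N_t$ is small relative to $W_t + M_t$, the two expressions give nearly the same value, which is consistent with the reviewer's expectation that the overall proportions would change little.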

      Page 10: Please include a range of values for the residual terms of the decomposition of hiring and attrition in the sentence that reads "In Figure S1 we show that the residual terms are small, and thus the decomposition is a good approximation of the total change in women's representation."

      These residual terms range from -0.51pp to 1.14pp (median = 0.2pp). We have added this information to the sentence in question.

      Page 12: It may be helpful to readers to include a description of the information contained in Table S2 in the supplemental text under section S3.

      We refer to Table S2 twice in the main text (once in the observational findings, and once for the counterfactual analysis), and the contents of Table S2 are described thoroughly in the table caption.

      Reviewer #3 (Recommendations For The Authors):

      (1) There is a potential limitation in the generalizability of the findings, as the study focuses exclusively on US academia. Including international perspectives could have provided a more global understanding of the issues at hand.

      The U.S. focus of this study limits the generalizability of our findings, as faculty outside the U.S. may exhibit differences in hiring patterns, retention patterns, and current demographic representations. We have added a discussion of this limitation to the manuscript. Unfortunately, our data do not support international analyses of hiring and attrition.

      (2) I am not sure that everyone who disappeared from the AARC dataset could be counted as "attrition" from academia. Indeed, some might have left academia completely once they disappeared from the AARC dataset. Yet there is also the possibility that some professors left for academic positions in countries outside of the US, or at US institutions that are not included in the AARC dataset. These individuals didn't leave academia. Furthermore, it is also possible that such moves to institutions outside of the US or not indexed by AARC are gender specific. Therefore, the analyses in this study should find a way to test whether the assumption that anyone who disappeared from AARC has left academia is indeed valid. If not, how will this potentially challenge the current conclusions?

      The reviewer makes an important point: faculty who move to faculty positions in other countries and faculty who move to non-PhD granting institutions, or to institutions that are otherwise not included in the AARC data are all counted as attritions in our analysis. We intentionally define hiring and attrition broadly to include all cases in which faculty join or leave a field or domain within our dataset.

      The types of transitions that faculty make out of the tenure track system at PhD granting institutions in the U.S. may correlate with faculty attributes, like gender. For example, women or men may be more likely to transition to tenure track positions at non-U.S. institutions. Nevertheless, these types of career transition represent an attrition for the system of study, and a hire for another system. Following this same logic, faculty who transition from one field to another field in our analysis are treated as an attrition from the first field and a hire into the new field.

      By focusing on “all-cause” attrition in this way, we are able to make robust insights for the specific systems we consider (e.g., STEM and non-STEM faculty at U.S. PhD-granting institutions), without being roadblocked by the task of annotating faculty departures and arbitrating which should constitute “valid” attritions.
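      The bookkeeping described in this response can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: each roster is taken to be a hypothetical mapping from a person identifier to that person's field, containing in-sample faculty only, and the function name `count_transitions` is invented for this example.

```python
from collections import Counter

def count_transitions(roster_before, roster_after):
    """Count hires and attritions per field between two consecutive yearly rosters.

    Each roster maps a person identifier to a field and lists in-sample faculty only.
    """
    hires, attritions = Counter(), Counter()
    for person, field in roster_after.items():
        # New to the sample, or arriving from a different field: a hire into `field`.
        if roster_before.get(person) != field:
            hires[field] += 1
    for person, field in roster_before.items():
        # Gone from the sample, or moved to a different field: an attrition from `field`.
        if roster_after.get(person) != field:
            attritions[field] += 1
    return hires, attritions
```

      Under this rule a move between fields is counted twice, once as an attrition and once as a hire, while a person who stays in the same field is counted as neither, even if they change institutions within the sample.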

      (3) It would be very interesting to know how much of the attrition was due to tenure failure. Previous studies have suggested that women are less likely to be granted tenure, which makes me wonder about the role that tenure plays in the gendered patterns of attrition in academia.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after the PhD, and then decline until around 15-20 years post-PhD, after which attrition rates increase as faculty approach retirement. The first local maximum appears to coincide roughly with the tenure clock timing, but we can only speculate that these attritions are tenure related. Our dataset is unfortunately not equipped to determine the causal mechanisms driving attrition.

      We reproduce the attrition risk curve in the supplementary materials, Fig. S4.
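      An empirical attrition risk curve of this kind can be tabulated by dividing, at each career age, the number of faculty who leave by the number at risk. The sketch below assumes a hypothetical input format of one `(career_age, left_next_year)` pair per faculty-year; the function name `empirical_attrition_rates` is likewise an assumption.

```python
from collections import defaultdict

def empirical_attrition_rates(records):
    """Return {career_age: fraction of faculty at that age who left}.

    `records` is an iterable of (career_age, left_next_year) pairs, one per faculty-year.
    """
    at_risk = defaultdict(int)
    left = defaultdict(int)
    for age, did_leave in records:
        at_risk[age] += 1
        left[age] += int(did_leave)
    return {age: left[age] / at_risk[age] for age in sorted(at_risk)}
```

      Plotting these rates against career age would show the shape described above; whether a peak near the tenure clock reflects tenure denial cannot be determined from such a tabulation alone.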

      (4) The dataset used doesn't fully capture the complexities of academic environments, particularly smaller or less research-intensive institutions (regional universities, historically black colleges and universities, and minority-serving institutions). This could be potentially added to the manuscript for discussions.

      We have added this point to the description of this study’s limitations in the discussion.

    2. eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and the evidence in support of the findings is convincing.

    3. Reviewer #1 (Public Review):

      Summary:

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings, i.e. that improving hiring rates does more for women's representation than reducing attrition, but this study uses a larger than usual dataset that spans many fields and institutions so it is a good contribution to the field.

      The paper is much improved and far more convincing as a result of the revisions made by the authors.

      Strengths:

      A large data set with many individuals, many institutions and fields of research.

      A good sensitivity analysis to test for potential model weaknesses.

      Weaknesses:

      Only a single country with a very specific culture and academic system.

      Complex model fitting with many steps and possible places for model bias.

    4. Reviewer #3 (Public Review):

      Summary:

      This study investigates the roles of faculty hiring and attrition in influencing gender representation in U.S. academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      The revisions made by the authors have improved the paper.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on U.S. academia, which may limit the generalizability of its findings to other cultural and academic contexts. Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Additionally, the study treats everyone who disappears from the dataset as attrition from academia, whereas in reality some of these individuals could be researchers who moved to another country or to another institution that is not indexed by AA.

    1. eLife assessment

      This valuable study describes mice with a knock out of the IQ motif-containing H (IQCH) gene, to model a human loss-of-function mutation in IQCH associated with male sterility. The infertility is reproduced in the mouse, making it a compelling model, but the mechanistic experiments provide only incomplete evidence for interaction between IQCH and potential RNA binding proteins, which are prominently mentioned in the title. The paper, which has undergone multiple rounds of review, could be of interest to cell biologists and male reproductive biologists working on the sperm flagellar cytoskeleton and mitochondrial structure.

    2. Reviewer #3 (Public Review):

      In this study, Ruan et al. investigate the role of the IQCH gene in spermatogenesis, focusing on its interaction with calmodulin and its regulation of RNA-binding proteins. The authors examined sperm from a male infertility patient with an inherited IQCH mutation as well as Iqch CRISPR knockout mice. The authors found that both human and mouse sperm exhibited structural and morphogenetic defects in multiple structures, leading to reduced fertility in Iqch-knockout male mice. Molecular analyses such as mass spectrometry and immunoprecipitation indicated that RNA-binding proteins are likely targets of IQCH, with the authors focusing on the RNA-binding protein HNRPAB as a critical regulator of testicular mRNAs. The authors used in vitro cell culture models to demonstrate an interaction between IQCH and calmodulin, in addition to showing that this interaction via the IQ motif of IQCH is required for IQCH's function in promoting HNRPAB expression. In sum, the authors concluded that IQCH promotes male fertility by binding to calmodulin and controlling HNRPAB expression to regulate the expression of essential mRNAs for spermatogenesis. These findings provide new insight into molecular mechanisms underlying spermatogenesis and how important factors for sperm morphogenesis and function are regulated.

      The strengths of the study include the use of mouse and human samples, which demonstrate a likely relevance of the mouse model to humans; the use of multiple biochemical techniques to address the molecular mechanisms involved; the development of a new CRISPR mouse model; ample controls; and clearly displayed results. Assays are done rigorously and in a quantitative manner. Overall, the claims made by the authors in this manuscript are well-supported by the data provided.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      By identifying a loss-of-function mutation of IQCH in an infertile patient, Ruan et al. show that IQCH is essential for spermiogenesis by generating a knockout mouse model of IQCH. Similar to the infertile patient with the IQCH mutation, Iqch knockout mice are characterized by a cracked flagellar axoneme and abnormal mitochondrial structure. Mechanistically, IQCH regulates the expression of RNA-binding proteins (especially HNRPAB), which are indispensable for spermatogenesis.

      Although this manuscript contains a potentially interesting piece of work that delineates a mechanism of IQCH that associates with spermatogenesis, this reviewer feels that a number of issues require clarification and re-evaluation for a better understanding of the role of IQCH in spermatogenesis.

      Line 251 - 253, "To elucidate the molecular mechanism by which IQCH regulates male fertility, we performed liquid chromatography tandem mass spectrometry (LC‒MS/MS) analysis using mouse sperm lysates and detected 288 interactors of IQCH (Figure 5-source data 1)."

      The reviewer had already raised significant concerns regarding the text above, noting that "LC‒MS/MS analysis using mouse sperm lysates" would not identify interactors of IQCH. However, this issue was not addressed in the revised manuscript. In the Methods section detailing LC-MS/MS, the authors stated that it was conducted on "eluates obtained from IP". However, there was no explanation provided on how IP for LC-MS/MS was performed. Additionally, it was unclear whether LC-MS or LC-MS/MS was utilized. The primary concern is that if LC‒MS/MS was conducted for the IP of IQCH, IQCH itself should have been detected in the results; however, as indicated by Figure 5-source data 1, IQCH was not listed.

      We thank the reviewer for these comments. Additional details regarding the IP protocol for LC-MS/MS analysis have been included in the methods section of the revised manuscript. Furthermore, we apologize for the previous inconsistencies in the terminology used for LC-MS/MS and have now ensured its consistent usage throughout the document. Regarding the primary concern about the absence of IQCH in Figure 5-source data 1, our study showed only the identified proteins that interact with IQCH, not IQCH itself. Additionally, we conducted co-IP experiments to validate the interactions identified by LC-MS/MS analysis. In fact, we did identify IQCH itself by LC-MS/MS analysis (Author response table 1).

      Author response table 1.

      Results of the LC-MS/MS analysis.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors should know what experiments have been done for the studies.

      We apologize for our oversights. The method for RNA-binding protein immunoprecipitation (RIP) has been detailed in the revised manuscript.

      Typos still remain in the text, e.g., line 253, "Fiugre".

      We are sorry for the spelling errors. We have engaged professional editing services to refine our manuscript.

    1. eLife assessment

      This study presents an important finding on the function of PLP1+ enteric glia. The evidence supporting the claims of the authors is solid, although the inclusion of additional data showing the mechanisms by which PLP1+ enteric glia acts on Paneth cells would have strengthened the study. The work will be of interest to researchers working on intestinal biology.

    2. Reviewer #1 (Public Review):

      The role of enteric glial cells in regulating intestinal mucosal functions at a steady state has been a matter of debate in recent years. Enteric glial cell heterogeneity and related methodological differences likely underlie the contrasting findings obtained by different laboratories. Here, Prochera and colleagues used Plp1-CreERT2 driver mice to deplete the majority of enteric glia from the gut. They found that glial loss has very limited effects on the transcriptome of gut cells 11 days after tamoxifen treatment (used to induce DTA expression) and, more specifically, has only minimal impact on cells of the intestinal mucosa. Interestingly, in the colon (where Paneth cells are not present) they did observe transcriptomic changes related to Paneth cell biology. Although no overt gene expression alterations were found in the small intestine, including in Paneth cells, morphological, ultrastructural, and functional changes were detected in the Paneth cells of enteric glia-depleted mice. In addition, and possibly related to Paneth cell dysfunction, enteric glia-depleted mice also show alterations in intestinal microbiota composition.

      In their analyses of enteric glia from existing single-cell transcriptomic data sets, it is stated that these come from 'non-diseased' humans. However, the data on the small intestine is obtained from children with functional gastrointestinal disorders (Zheng 2023). Data on colonic enteric glia was obtained from colorectal cancer patients (Lee 2020). Although here the cells were isolated from non-malignant regions, saying that the large intestines of these patients are non-diseased is probably an overstatement. Another existing dataset including human mucosal enteric glia of healthy subjects is presented in Smillie et al (2019). It would be interesting to see how the current findings relate to the data from Smillie et al.

      The time between enteric glia depletion and analyses (mouse sacrifice) must be a crucial determinant of the type of effects, and the timing thereof. In the current study 11 days after tamoxifen treatment was chosen as the time point for analyses, which is consistent with earlier work by the lab using the same model (Rao et al 2017). What would happen when they wait longer than 11 days after tamoxifen treatment? Data, not necessarily for all parameters, on later time points would strengthen the manuscript significantly.

      The authors found transcriptional dysregulation related to Paneth cell biology in the colon, where Paneth cells are normally not present. Given the bulk RNA sequencing approach, the cellular identity in which this shift is taking place cannot be determined. However, it would be useful if the authors could speculate on which colonic cell type they reckon this is happening in. On the other hand, enteric glia depletion was found to affect Paneth cells structurally and functionally in the small intestine, where transcriptional changes were initially not identified. Only when performing GSEA with the in silico help of cell type-specific gene profiles, differences in Paneth cell transcriptional programs in the small intestine were uncovered. A comment on this discrepancy would be helpful, especially for the non-bioinformatician readers among us.

      From looking at Figure 3B it is clear that Paneth cells are not the only epithelial cell type affected (after less stringent in silico analyses) by enteric glial cell depletion. Although the authors show that this does not translate into ultrastructural or numerical changes of most of these cell types, this makes one wonder how specific the enteric glia - Paneth cell link is. Besides possible indirect crosstalk (via neurons), it is not clear if enteric glia more closely associate with Paneth cells as compared to these other cell types. Immunofluorescence stainings of some of these cells in the Plp1-GFP mice would be informative here. The authors mention IL-22 as a possible link, but do Paneth cells express receptors for transmitters commonly released by enteric glia? Maybe they can have a look at putative cell-cell interactions by mapping ligand-receptor pairs in the scRNAseq datasets they used.

      Previously the authors showed that enteric glia regulation of intestinal motility is sex-dependent (Rao et al 2017). While enteric glia depletion caused dysmotility in female mice, it did not affect motility in males. For this reason, most experiments in the current study were conducted in male mice only. However, for the experiments focusing on the effect of enteric glia depletion on host-microbiome interactions and intestinal microbiota composition both male and female mice were used. In Figure 8A male and female mice are distinctly depicted but this was not done for Figure 8C. Separate characterization of the microbiome of male and female mice would have helped to figure out how much intestinal dysmotility (in females) contributes to the effect on gut microbial composition. This is an important exercise to confirm that the effect on the microbiome is indeed a consequence of altered Paneth cell function, as suggested by the authors (in the results and discussion, and in the abstract). In this context, it would also be interesting to compare the bulk sequencing data after enteric glia depletion between female and male mice.

    3. Reviewer #2 (Public Review):

      This is an excellent and timely study from the Rao lab investigating the interactions of enteric glia with the intestinal epithelium. Two early studies in the late 1990s and early 2000s had previously suggested that enteric glia play a pivotal role in control of the intestinal epithelial barrier, as their ablation using mouse models resulted in severe and fatal intestinal inflammation. However, it was later identified that these inflammatory effects could have been an indirect product of the transgenic mouse models used, rather than due to the depletion of enteric glia. In previous studies from this lab, the authors had identified expression of PLP1 in enteric glia, and its use in CRE driver lines to label and ablate enteric glia.

      In the current paper, the authors carefully examine the role of enteric glia by first identifying that PLP1-creERT2 is the most useful driver to direct enteric glial ablation, in terms of the number of glial cells targeted, their proximity to the intestinal epithelium, and the relevance for human studies (GFAP expression is rather limited in human samples in comparison). They examined gene expression changes in different regions of the intestine using bulk RNA-seq following ablation of enteric glia by driving expression of diphtheria toxin A (PLP1-creERT2;Rosa26-DTA). Alterations in gene expression were observed in different regions of the gut, with specific effects in different regions. Interestingly, while there were gene expression changes in the epithelium, there were limited changes to the proportions of different epithelial cell types identified using immunohistochemistry in control vs glial-ablated mice. The authors then focused on the investigation of Paneth cells in the ileum, identifying changes in the ultrastructural morphology and lysozyme activity. In addition, they identified alterations in gut microbiome diversity. As Paneth cells secrete antimicrobial peptides, the authors conclude that the changes in gut microbiome are due to enteric glia-mediated impacts on Paneth cell activity.

      Overall, the study is excellent and delves into the different possible mechanisms of action, including the investigation of changes in enteric cholinergic neurons innervating the intestinal crypts. The use of different CRE drivers to target enteric glial cells has led to varying results in the past, and the authors should be commended on how they address this in the Discussion.

    4. Reviewer #3 (Public Review):

      In this study, Prochera, et al. identify PLP1+ cells as the glia that most closely interact with the gut epithelium and show that genetic depletion of these PLP1+ glia in mice does not have major effects on the intestinal transcriptome or the cellular composition of the epithelium. Enteric glial loss, however, causes dysregulation of Paneth cell gene expression that is associated with morphological disruption of Paneth cells, diminished lysozyme secretion, and altered gut microbial composition. Overall, the authors need to first prove whether the Plp1CreER Rosa26DTA/+ mice system is viable. Also, most experimental systems have been evaluated by immunohistochemistry, scRNAseq, and electron microscopy, but need quantitative statistical processing. In addition, the value of the paper would be enhanced if the significance of why the phenotype appeared in the large intestine rather than the small intestine when PLP1 is deficient for Paneth cells is clarified.

      Weaknesses:

      Major:

      (1) Supplementary Figure 2; Cannot be evaluated without quantification.

      (2) Figure 2A; Is the Plp1CreER Rosa26DTA/+ mouse system established correctly? The S100B immunohistology picture is not clear. A similar study is needed for female Plp1CreER Rosa26DTA/+ mice. What is the justification for setting 5 dpt, 11 dpt? Any consideration of changes to organs other than the intestine? Wouldn't it be clearer to introduce organoid technology?

      (3) Figure 2B; Need an explanation for the 5 genes that were altered in the colon. Five genes should be evaluated by RT-qPCR. Why was there a lack of change in the duodenum and ileum?

      (4) Supplementary Figure 3; Top 3 genes should be evaluated by RT-qPCR.

      (5) Supplementary Figure 4B, C, and D; Why not show analysis in the small intestine?

      (6) Supplementary Figure 4D; Cannot be evaluated without quantification.

      (7) Figure 3D; Cannot be evaluated without quantification.

      (8) Supplementary Figure 5B and C; Top 3 genes should be evaluated by RT-qPCR.

      (9) Supplementary Figure 6; Top 3 genes should be evaluated by RT-qPCR.

      (10) Figure 4A; Cannot be evaluated without quantification.

      (11) Figure 4D; Cannot be evaluated without quantification.

      (12) Additional experiments on in vivo infection systems comparing Plp1CreER Rosa26DTA/+ mice and controls would be great.

    5. Author response:

      We thank the reviewers for their thoughtful consideration of our study and are delighted they found the findings to be important. In this initial response to the overall positive reviews, we want to address common themes raised, clarify points relevant to a few specific reviewer concerns, and frame plans for the revised manuscript.

      (1) Analysis of data from human tissue: Reviewer 1 notes “In their analyses of enteric glia from existing single-cell transcriptomic data sets, it is stated that these come from 'non-diseased' humans. However, the data on the small intestine is obtained from children with functional gastrointestinal disorders (Zheng 2023). Data on colonic enteric glia was obtained from colorectal cancer patients (Lee 2020). Although here the cells were isolated from non-malignant regions, saying that the large intestines of these patients are non-diseased is probably an overstatement.”

      In the Zheng et al. dataset, “functional GI disorders” refers to biopsies from children that do not have any histopathologic evidence of digestive disease. The children do, however, have at least one GI symptom that prompted a diagnostic endoscopy with biopsies, leading to the designation of “functional” disorder. Given that diagnostic endoscopies are invasive procedures that necessitate anesthesia, obtaining biopsies from completely healthy, asymptomatic children without any clinical indication would not be allowable per most institutional review boards, leading the authors of that study to use these samples as a control group. We thus used the “non-diseased” label to encompass these samples as well as those from the unaffected regions of large intestine from colorectal cancer patients. We recognize, however, that this label might be misleading and will revise the manuscript to more accurately reflect the information on control tissue origin.

      “Another existing dataset including human mucosal enteric glia of healthy subjects is presented in Smillie et al (2019). It would be interesting to see how the current findings relate to the data from Smillie et al.”

      We thank the reviewer for directing us to the Smillie et al. 2019 dataset. This dataset derives from colonic mucosal biopsies from 12 healthy adults (8480 stromal cells) and 18 adults with ulcerative colitis (10,245 stromal cells from inflamed bowel segments and 13,146 from uninflamed), all between the ages of 20-77 years. Our preliminary analysis shows that the putative glial cluster in this dataset does not separate by inflammation or disease state based on the common glial genes: S100B, PLP1, and SOX10. PLP1 and S100B are broadly expressed across this cluster while GFAP is not detected in this dataset, consistent with our observations from the two other human datasets included in our manuscript. In the revised manuscript, we will include the Smillie et al. 2019 data in a supplemental figure as additional supportive evidence.

      (2) Validation and further details of the Plp1CreER-DTA model for genetic depletion of enteric glia: Reviewer 1 notes “The time between enteric glia depletion and analyses (mouse sacrifice) must be a crucial determinant of the type of effects, and the timing thereof. In the current study 11 days after tamoxifen treatment was chosen as the time point for analyses, which is consistent with earlier work by the lab using the same model (Rao et al 2017). What would happen when they wait longer than 11 days after tamoxifen treatment?” Reviewer 3 asks whether “the Plp1CreER Rosa26DTA/+ mice system [is] established correctly” and raises concern about quantitative characterization.

      In previous work, we discovered that the gene Plp1 is broadly expressed by enteric glia and, within the mouse intestine, is quite specific to glial cells (PMID: 26119414). We characterized the Plp1CreER mouse line as a genetic tool in detail in this initial study. Then in a subsequent study, we used Plp1CreER-DTA mice to genetically deplete enteric glia and study the consequences on epithelial barrier integrity, crypt cell proliferation, enteric neuronal health and gastrointestinal motility (PMID: 28711628). In this second study, we performed extensive validation of the Plp1CreER-DTA mouse model including detailed quantification of glial depletion in the small and large intestines across the myenteric, intramuscular and mucosa compartments by immunohistochemical (IHC) staining of whole tissue segments to sample thousands of cells. We found that the majority of S100B+ enteric glia were depleted within 5 days in both sexes, including more than 88% loss of mucosal glia, and that this loss was stable at 3 subsequent timepoints (7, 9 and 14 days post-tamoxifen induction of Cre activity). Glial loss was further confirmed by IHC for GFAP in the myenteric plexus, and by ultrastructural analysis of the small intestine to ensure cell depletion rather than simply loss of marker expression. Our group was the first to use this model to study enteric glia, and since then similar models and our key observations have been replicated by other groups (PMID: 33282743, 34550727). Thus, we consider this model to be well established.

      Reviewer 1 raises an excellent question about examining epithelial health beyond 11 days post-tamoxifen (11dpt) in this model. Particularly given the longer-lived nature of Paneth cells relative to other epithelial cell types, this would be very interesting to explore. Through 11dpt, Cre+ mice are well-appearing and indistinguishable from their Cre-negative control littermates. Unfortunately, a limitation of the Plp1CreER-DTA model is that beyond 11dpt, Cre+ mice become anorexic, lose body weight, and have signs of neurologic debility such as hindlimb weakness and uncoordinated gait that are prominent by 14dpt. These phenotypes are likely the consequence of targeting Plp1+ glia outside the gut, such as Schwann cells and oligodendrocytes (as described in another study which used a similar model to study demyelination in the central nervous system, PMID: 20851998). Given these CNS effects and that starvation is well known to affect Paneth cell phenotypes (PMIDs: 1167179, 21986443), we elected not to examine timepoints beyond 11dpt. Technological advances that enable more selective cell depletion would allow study of more chronic effects of enteric glial loss.

      (3) Sex differences in the microbiome data: All 3 reviewers queried whether there were sex differences in the microbiome data with Reviewer 1 explaining “Previously the authors showed that enteric glia regulation of intestinal motility is sex-dependent (Rao et al 2017). While enteric glia depletion caused dysmotility in female mice, it did not affect motility in males. For this reason, most experiments in the current study were conducted in male mice only. However, for the experiments focusing on the effect of enteric glia depletion on host-microbiome interactions and intestinal microbiota composition both male and female mice were used. In Figure 8A male and female mice are distinctly depicted but this was not done for Figure 8C. Separate characterization of the microbiome of male and female mice would have helped to figure out how much intestinal dysmotility (in females) contributes to the effect on gut microbial composition. This is an important exercise to confirm that the effect on the microbiome is indeed a consequence of altered Paneth cell function…”

      In our microbiome analysis, we initially analyzed males and females separately but did not observe significant differences between the two sexes. Thus, we merged the data to increase the statistical power of the genotype comparisons. It was an oversight on our part to not label the female and male datapoints in Figure 8C as we did for the other data in the manuscript. We will update this graph and related supplemental figures in the revised version. Per Reviewer 2’s suggestion, we will also address this further in the Results and Discussion.

      (4) Reconciling the RNA-Seq identification of transcriptional changes in the colon, but not the small intestine, with the GSEA and downstream tissue-level morphological and functional analyses that detected phenotypes in the small intestine: Reviewers 1 and 3 raised this question, with Reviewer 1 noting “…enteric glia depletion was found to affect Paneth cells structurally and functionally in the small intestine, where transcriptional changes were initially not identified. Only when performing GSEA with the in silico help of cell type-specific gene profiles, differences in Paneth cell transcriptional programs in the small intestine were uncovered. A comment on this discrepancy would be helpful, especially for the non-bioinformatician readers among us.”

      Standard differential gene expression (DEG) analysis of the effects of glial loss revealed significant differences only in the colon, and even there only a handful of genes were changed. These changes were not accompanied by corresponding changes at the protein level, at least as detectable by IHC. In the small intestine, there were no significant differences by standard DEG thresholds. Unlike DEG analysis, gene set enrichment analysis (GSEA) provides a significance value based on whether there is a higher-than-chance number of genes that are changing in a uniform direction, without consideration for the significance of the magnitude of change. Therefore, the GSEA detected that a significant number of genes in the curated Paneth cell gene list exhibited a positive fold change difference in the bulk RNA sequencing data. This prompted us to examine Paneth cells and other epithelial cell types in more detail by IHC, functional and ultrastructural analyses, which all converged on the observation that Paneth cells were relatively selectively disrupted in the epithelium of glial-depleted mice.
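
      For readers less familiar with the distinction between per-gene thresholding and set-level testing, the following is a minimal illustrative sketch. It is not the GSEA algorithm itself (which uses a ranked-list running-sum statistic with permutation testing) and not the analysis performed in the study; it substitutes a much simpler one-sided sign test to show the same intuition. The fold-change values, the 20-gene set, and the |log2FC| >= 1 cutoff are all hypothetical and chosen only for illustration.

```python
from math import comb


def sign_test_p_value(fold_changes):
    """One-sided sign test: probability of seeing at least as many
    positive values as observed if each gene were equally likely
    to go up or down (p = 0.5)."""
    n = len(fold_changes)
    k = sum(1 for fc in fold_changes if fc > 0)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# Hypothetical log2 fold changes for a 20-gene set (illustrative values only,
# not data from the study): small shifts, almost all in the same direction.
gene_set_log2fc = [0.2, 0.3, 0.1, 0.25, 0.15, 0.3, 0.2, 0.1, 0.35, 0.2,
                   0.15, 0.25, 0.3, 0.1, 0.2, 0.3, -0.1, 0.2, 0.15, -0.05]

# Per-gene view: with a hypothetical cutoff of |log2FC| >= 1, no gene is flagged.
n_deg = sum(1 for fc in gene_set_log2fc if abs(fc) >= 1.0)

# Set-level view: 18 of 20 genes moving in the same direction is unlikely by chance.
p_set = sign_test_p_value(gene_set_log2fc)

print(f"genes passing per-gene cutoff: {n_deg}")
print(f"set-level sign-test p-value: {p_set:.6f}")
```

      In this toy example no single gene passes the per-gene cutoff, yet the set as a whole shows a coordinated shift, which is the kind of signal a set-level method can pick up while a per-gene threshold cannot.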

      (5) Other: We will address all remaining comments in our detailed author response that will accompany our revised manuscript. We thank Reviewer 2 for the very positive feedback overall and highlighting opportunities to better label findings in some of the figures. We will make these suggested changes in our revised manuscript.

    1. eLife assessment

      This valuable study provides solid in vivo data that transfer of IL-15/IL-12-conditioned syngeneic NK cells after primary tumor resection promotes long-term survival of mice with low metastatic burden from breast cancer. Also, the authors conducted an investigator-initiated clinical trial that demonstrated that similar NK cell infusions in cancer patients after resections were safe and showed signs of efficacy. Therefore, this study is of interest and value to oncologists in the field of breast cancer research.

    2. Reviewer #1 (Public Review):

      Summary:

      This is a very nice paper in which the authors addressed the potential for NK cell cellular therapy to treat and potentially eliminate previously established metastases after surgical resections, which are a major cause of death in human cancer patients. To do so they developed a model using the EO771 breast cancer cell line, in which they establish and then resect tumors and the draining lymph node, after which the majority of mice eventually succumb to metastatic disease. They found that when the initiating tumors were resected when still relatively small, adoptive transfers of IL-15/12-conditioned NK cells substantially enhanced the survival of tumor-bearing animals. They then delved into the cellular mechanisms involved. Interestingly and somewhat unexpectedly, the therapeutic effect of the transferred NK cells was dependent on the host's CD8+ T cells. Accordingly, the NK cell therapy contributed to the formation of tumor-specific CD8+ T cells, which protected the recipient animals against tumor re-challenge and were effective in protecting mice from tumor formation when transferred to naive mice. Mechanistically, they used Ifng knockout NK cells to provide evidence that IFNgamma produced by the transferred NK cells was crucial for the accumulation and activation of DCs in the metastatic lung, including expression of CD86, CD40, and MHC genes. In turn, IFNgamma production by NK cells was essential for the induced accumulation of activated CD8 effector T cells and stem cell-like CD8 T cells in the metastatic lung. The authors then expanded their findings from the mouse model to a small clinical trial. They found that inoculations of IL-15/12-conditioned autologous NK cells in patients with various malignancies after resection were safe and showed signs of efficacy.

      Strengths:

      - Monitoring of long-term metastatic disease and survival after resection, as used in this paper, is a physiological model that resembles clinical scenarios more closely than the animal models usually used, a great strength of the approach.

      - Previous literature focused on the notion that NK cells clear metastatic lesions directly, within a short period. The authors' use of a more relevant model and time frame revealed the previously unexplored T cell-dependent mechanism of action of infused NK cells for long-term control of metastatic diseases.

      - Also important, the paper provides solid evidence for the contribution of IFNgamma produced by NK cells for activation of dendritic cells and T cells. This is an interesting finding that provokes additional questions concerning the action of the interferon-gamma in this context.

      - The results from the clinical trial in cancer patients, based on the same type of IL-15/12-conditioned NK cell infusions, were encouraging with respect to safety and showed signals of efficacy, which support the translatability of the authors' findings.

      Weaknesses:

      - Having demonstrated that NK cell IFNgamma is important for recruiting and activating DCs and T cells in their model, one is left to wonder whether it is important for the therapeutic effect, which was not tested.

      - Relatedly, previous studies, cited by the authors, reported that NK cells promote T cell activation by producing the chemokines CCL5 and XCL1, and FLT3 ligand, which respectively recruit and activate dendritic cells that can subsequently mobilize a T cell response. The present study demonstrates an important role for NK cell-produced IFNgamma in these processes. One is left wondering whether the model used by the authors is also dependent on CCL5, XCL1, and FLT3 production by NK cells, and if so whether IFNgamma plays a role in that or acts in parallel. The issue could be discussed by the authors, even if they cannot easily resolve it.

      - The authors do not address whether the IL-12 in their cocktail is essential for the effects they see. Relatedly, it was of interest that despite the effectiveness of the transferred IL-15/IL-12 cultured NK cells, the cells failed to persist very long after transfer. Published studies have reported that so-called memory-like NK cells, which are pre-activated with a cocktail of IL-12, IL-18 and IL-15, persist much longer in lympho-depleted mice and patients than IL-2 cultured NK cells. It would be illuminating to compare these two types of NK cell products in the authors' model system, and with, or without, lymphodepletion, to identify the critical parameters. If greater persistence occurred with the memory-like NK cell product, it is possible that the NK cells might provide greater benefit, including by directly targeting the tumor.

      - It was somewhat difficult to gauge the clinical trial results because the trial was early stage and therefore not controlled. Evaluation of the results therefore relies on historical comparisons. To evaluate how encouraging the results are, it would be valuable for the authors to provide some context on the prognoses and likely disease progression of these patients at the time of treatment.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors show convincing data that increasing NK cell function/frequency can reduce the development and progression of metastatic disease after primary tumor resection.

      Strengths:

      The inclusion of a first-in-human trial highlighting some partial responses of metastatic patients treated with in vitro expanded NK cells is tantalising. It is difficult to perform trials in preventing further metastasis since the timelines are very protracted. However, more data like these that highlight the role of NK cells in improving local cDC1/T cell anti-tumor immunity will encourage deeper thinking around therapeutic approaches to target endogenous NK cells to achieve the same.

      Weaknesses:

      As always, more patient data would help increase confidence in the human relevance of the approach.

    1. eLife assessment

      This valuable study adopted a multi-omics approach to elucidate the regulatory mechanism underlying parturition and myometrial quiescence. The data presented to support the main conclusion remain incomplete. This work will be of interest to both basic researchers who work on reproductive biology and clinicians who practice reproductive medicine.

    2. Reviewer #1 (Public Review):

      Summary:

      The use of a multi-omics approach to elucidate the regulatory mechanism underlying parturition and myometrial quiescence adds novelty to the study. The identification of myometrial cis-acting elements and their association with gene expression, particularly the regulation of the PLCL2 gene by PGR, opens the door to further investigation of the impact of PGR and other regulators.

      Strengths:

      (1) Multi-Omic Approach: The paper employs a comprehensive multi-omic approach, combining ChIP-Seq, RNA-Seq, and CRISPRa-based Perturb-Seq assays, which allow for a thorough investigation of the regulatory mechanisms underlying myometrial gene expression.

      (2) Clinical Relevance: Investigating human myometrial specimens provides direct clinical relevance, as understanding the molecular mechanisms governing parturition and myometrial quiescence can have significant implications for the management of pregnancy-related disorders.

      (3) Functional work: For functional screening, the authors used CRISPRa-based screening of PLCL2 gene regulation in the immortalized human cell lines hTERT-HM and T-hESC. This adds another dimension to the work and strengthens their finding of PGR-dependent regulation of the PLCL2 gene in human myometrial cells.

      Weaknesses:

      (1) Variability in epigenomic mapping: The significant variations in the number and location of H3K27ac-positive intervals across different samples and studies suggest potential challenges in accurately mapping the myometrial epigenome. This variability may introduce uncertainty and complicate the interpretation of results.

      (2) Sample specificity: The study focuses on term pregnant nonlabor myometrial specimens, limiting the generalizability of the findings to other stages of pregnancy or labor.

      (3) Limited Understanding of Regulatory Mechanisms: While the study identifies potential regulatory programs within super-enhancers, the exact mechanisms by which these enhancers regulate gene expression and cellular functions in the myometrium remain unclear. Further mechanistic studies are needed to elucidate these processes.

      (4) Discordant analysis: Why are regular enhancers being understood in terms of motif enrichment of transcription factors and super-enhancers in terms of pathways enriched for active genes? This needs a clear reason.

    3. Reviewer #2 (Public Review):

      Summary:

      In "Assessment of the Epigenomic Landscape in Human Myometrium at Term Pregnancy" the authors generate a number of genome-wide data sets to investigate epigenomic and transcriptomic regulation of the myometrium at term pregnancy. These data provide a useful resource for further evaluation of gene regulatory mechanisms in the myometrium and include the first Hi-C data published for this tissue. There is a comprehensive comparison to previously published histone modification data and integration with RNA-seq to highlight potential enhancer-gene regulatory relationships. The authors further investigate putative enhancers upstream of the PLCL2 gene and identify a candidate region that may be regulated by the PGR (progesterone receptor) signaling.

      Strengths:

      The strengths of this study are in the multi-omics nature of the design as several genome-wide data sets are generated from the same patient samples. Extending this type of approach in the future to a larger number of samples will allow for additional investigation into gene regulation as the correlation between epigenomic features and gene expression across a larger number of samples can reveal regulatory relationships.

      Weaknesses:

      One of the most interesting aspects of this study is the generation of the first Hi-C data for the human pregnant myometrium, however, there is a minimal description in the results section of the Hi-C data analysis and the only data shown are the number of loops identified and one such loop that includes the PLCL2 promoter shown in Figure 3A. The manuscript would benefit from a more extensive analysis of the Hi-C data, for example, the analysis of TADs (topological associating domains) would be interesting to add and could be used to evaluate to what extent H3K27ac domains and putative regulated genes fall within the same TAD.

      The authors present some convincing evidence on the transcriptional regulation of the PLCL2 gene using Perturb-Seq to identify putative upstream enhancer regions and PGR over-expression showing PGR can act as an activator. These two experiments on their own are interesting, however, they are not as mechanistically integrated as they could be to clarify the molecular mechanisms. Deletion of the putative enhancer upstream of PLCL2 followed by over-expression of PGR would clarify the mechanistic relationship between the proposed enhancer, PGR, and PLCL2 expression. Does PGR act through the proposed enhancer? In addition, reporter assays using this proposed enhancer region with and without increased expression of PGR and mutation of any PRE sequences would also provide mechanistic insight. Although CRISPRa and Perturb-Seq can be used to identify potential regulatory regions, the best approach to verify the requirement for a particular enhancer in regulating a specific gene is a deletion approach.

    4. Reviewer #3 (Public Review):

      In this manuscript, Wu et al. investigate active H3K27ac and H3K4me1 marks in term pregnant nonlabor myometrial biopsies, linking putative-enhancers and super-enhancers to gene expression levels. Through their findings, they reveal the PGR-dependent regulation of the PLCL2 gene in human myometrial cells via a cis-acting element located 35-kilobases upstream of the PLCL2 gene. By targeting this region using a CRISPR activation system, they were able to elevate the endogenous PLCL2 mRNA levels in immortalized human myometrial cells.

      This research offers novel insights into the molecular mechanisms governing gene expression in myometrial tissues, advancing our understanding of pregnancy-related processes.

      Major comments:

      (1) A more comprehensive analysis of the epigenetic and transcriptomic data would have strengthened the paper, moving beyond basic association studies. Currently, it is challenging to assess the quality and significance of the data as much of the information is lacking.

      (2) The rationale for and connections between experiments, as well as results, could be bolstered to underscore the significance of this research.

      Strengths:

      - The combination of ChIP-Seq, RNA-Seq, and CRISPRa Perturb-Seq approaches to investigate gene regulation and expression in myometrial cells.

      - The use of CRISPR activation system to specifically target cis-acting elements.

      Weaknesses:

      - The manuscript would strongly benefit from a deeper analysis of the Omic datasets. Furthermore, expanding figures/graphs to effectively contextualize these datasets would be greatly beneficial and would add more value to this research. Currently, it is difficult for us to assess and appreciate the quality of these data sets across the manuscript, which is mostly correlative.

      - Limited sample size, coupled with variability in results and overall lack of details, compromises the robustness of result interpretation.

      - For most parts of the results section, a better description is needed, including rationale, approach, and presentation of data. As it stands, it is challenging to assess the quality of the data and appreciate the results.

      - Additional efforts are needed to dissect the proposed regulatory mechanisms.

      - While the discussion provided helpful context for understanding some of the experiments performed, it lacked interpretation of the results in relation to the existing literature.

    1. eLife assessment

      In this valuable study, the authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction and heart failure. Based on results from a series of solid statistical analyses, the authors conclude that a younger onset age of breast cancer is associated with myocardial infarction and heart failure, highlighting the need to carefully monitor the cardiovascular status of women who have been diagnosed with breast cancer.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction (MI) and heart failure (HF). They performed a secondary data analysis of the UK Biobank, using descriptive and inferential analyses, including Cox proportional hazards models, to investigate the associations. Propensity score matching was also used. They found that among participants with breast cancer, younger onset age was significantly associated with elevated risks of MI (HR=1.36, 95% CI: 1.19 to 1.56, P<0.001) and HF (HR=1.31, 95% CI: 1.18 to 1.46, P<0.001). They reported similar findings after propensity matching.
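      The quoted hazard ratios can be checked for internal consistency. The following is a minimal sketch, not the authors' analysis: it assumes the intervals are Wald-type 95% intervals that are symmetric on the log scale, which is the usual output of a Cox model but is not confirmed here. The function name `check_hazard_ratio` is hypothetical. Under that assumption, each HR should sit near the geometric mean of its interval bounds, and the implied two-sided P value should fall below 0.001.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def check_hazard_ratio(hr, lower, upper):
    """Recover the log-scale standard error, z statistic, and P value
    implied by a hazard ratio and its 95% CI (assumed Wald-type)."""
    log_se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    midpoint = math.sqrt(lower * upper)  # geometric mean of the bounds
    z = math.log(hr) / log_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return {"midpoint": midpoint, "log_se": log_se, "z": z, "p": p}


# Values as quoted in the review summary.
mi = check_hazard_ratio(1.36, 1.19, 1.56)
hf = check_hazard_ratio(1.31, 1.18, 1.46)

for name, res in (("MI", mi), ("HF", hf)):
    print(name, {k: round(v, 6) for k, v in res.items()})
```

      If the printed midpoints match the reported HRs to two decimals and both P values are below 0.001, the estimates, intervals, and significance statements agree with one another under the stated assumption.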

      Strengths:

      The use of a large dataset is a strength of the study as the study is well-powered to detect differences. Reporting both the unmatched and the propensity-matched estimates was also important for statistical inference.

      Weaknesses:

      Despite the merits of the paper, readers may be confused as to whether the authors are referring to "age at breast cancer onset" or "age at breast cancer diagnosis". I suppose the title refers to the latter, in which case it would be best to use "age at breast cancer diagnosis" consistently throughout the manuscript. I would recommend revising the title to make it explicit that the authors are referring to "age at breast cancer diagnosis".

    3. Reviewer #2 (Public Review):

      This is a well-presented large analysis from the UK Biobank of nearly 250,000 female adults. The authors examined the associations of breast cancer diagnosis with incident myocardial infarction and heart failure by different onset age groups. Based on results from a series of statistical analyses, the authors concluded that younger onset age of breast cancer was associated with myocardial infarction and heart failure, highlighting the necessity of careful monitoring of cardiovascular status in women diagnosed with breast cancer, especially younger women.

      Comments to consider:

      (1) It's thoughtful for the authors to have included and adjusted for menopausal status, breast cancer surgery, and hormone replacement therapy in their sensitivity analysis. It would be informative if the authors presented the number and percentages of menopause and cancer treatments.

      (2) The analytical baseline used for follow-up should be pointed out in the methods section. It's confusing whether the analytic baseline was defined as the study baseline or the time at breast cancer diagnosis.

      (3) Did the older onset age group have a longer follow-up duration? Could the authors provide information on the length of follow-up by age of onset in Supplementary Table S4? It would give the readers more information regarding different age groups.

    1. eLife assessment

      This study combines genetic, cell biological, and interaction data to propose a model of meiotic double-strand break regulation in C. elegans. Comprehensive cataloging of their interactions (physical and genetic) would be valuable information for the field. However, the analyses used in the manuscript are not consistent or comprehensive, and therefore the evidence to support their model is currently incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Raices et al., provides novel insights into the role and interactions between SPO-11 accessory proteins in C. elegans. The authors propose a model of meiotic DSBs regulation, critical to our understanding of DSB formation and ultimately crossover regulation and accurate chromosome segregation. The work also emphasizes the commonalities and species-specific aspects of DSB regulation.

      Strengths:

      This study capitalizes on the strengths of the C. elegans system to uncover genetic interactions between a large number of SPO-11 accessory proteins. In combination with physical interactions, the authors synthesize their findings into a model, which will serve as the basis for future work, to determine mechanisms of DSB regulation.

      Weaknesses:

      The methodology, although standard, lacks quantification. This includes the mass spectrometry data, along with the cytology. The work would also benefit from clarifying the role of the DSB machinery on the X chromosome versus the autosomes.

    3. Reviewer #2 (Public Review):

      Summary:

      Meiotic recombination initiates with the formation of DNA double-strand breaks (DSBs), catalyzed by the conserved topoisomerase-like enzyme Spo11. Spo11 requires accessory factors that are poorly conserved across eukaryotes. Previous genetic studies have identified several proteins required for DSB formation in C. elegans to varying degrees; however, how these proteins interact with each other to recruit the DSB-forming machinery to chromosome axes remains unclear.

      In this study, Raices et al. characterized the biochemical and genetic interactions among proteins that are known to promote DSB formation during C. elegans meiosis. The authors examined pairwise interactions using yeast two-hybrid (Y2H) and co-immunoprecipitation and revealed an interaction between a chromatin-associated protein HIM-17 and a transcription factor XND-1. They further confirmed the previously known interaction between DSB-1 and SPO-11 and showed that DSB-1 also interacts with a nematode-specific HIM-5, which is essential for DSB formation on the X chromosome. They also assessed genetic interactions among these proteins, categorizing them into four epistasis groups by comparing phenotypes in double vs. single mutants. Combining these results, the authors proposed a model of how these proteins interact with chromatin loops and are recruited to chromosome axes, offering insights into the process in C. elegans compared to other organisms.

      Weaknesses:

      This work relies heavily on Y2H, which is notorious for having high rates of false positives and false negatives. Although the interactions between HIM-17 and XND-1 and between DSB-1 and HIM-5 were validated by co-IP, the significance of these interactions was not tested, and cataloging Y2H interactions does not yield much more insight. Moreover, most experiments lack rigor, which raises serious concerns about whether the data convincingly support the conclusions of this paper. For instance, the XND-1 antibody appears to detect a band in the control IP; however, there was no mention of the specificity of this antibody. Additionally, epistasis analysis of various genetic mutants is based on the quantification of DAPI bodies in diakinesis oocytes, but the comparisons were made without statistical analyses. For cytological data, a single representative nucleus was shown without quantification or rigorous analysis. The rationale for some experiments is also questionable (e.g. the rescue of dsb-2 mutants by him-5 transgenes in Figure 2), making the interpretation of the data unclear. Overall, while this paper claims to present "the first comprehensive model of DSB regulation in a metazoan", cataloging Y2H and genetic interactions did not yield any new insights into DSB formation without rigorous testing of their significance in vivo. The model proposed in Figure 4 is also highly speculative.

    4. Reviewer #3 (Public Review):

      During meiosis in sexually reproducing organisms, double-strand breaks are induced by a topoisomerase-related enzyme, Spo11, which is essential for homologous recombination, which in turn is required for accurate chromosome segregation. Additional factors control the number and genome-wide distribution of breaks, but the mechanisms that determine both the frequency and preferred location of meiotic DSBs remain only partially understood in any organism.

      The manuscript presents a variety of different analyses that include variable subsets of putative DSB factors. It would be much easier to follow if the analyses had been more systematically applied. It is perplexing that several factors known to be essential for DSB formation (e.g., cohesins, HORMA proteins) are excluded from this analysis, while it includes several others that probably do not directly contribute to DSB formation (XND-1, HIM-17, CEP-1, and PARG-1). The strongest claims seem to be that "HIM-5 is the determinant of X-chromosome-specific crossovers" and "HIM-5 coordinates the actions of the different accessory factors sub-groups." Prior work had already shown that mutations in him-5 preferentially reduce meiotic DSBs on the X chromosome. While it is possible that HIM-5 plays a direct role in DSB induction on the X chromosome, the evidence presented here does not strongly support this conclusion. It is also difficult to reconcile this idea with evidence from prior studies that him-5 mutations predominantly prevent DSB formation on the sex chromosomes, while the protein localizes to autosomes. The one experiment that seems to elicit the conclusion that HIM-5 expression is sufficient for breaks on the X chromosome is flawed (see below). The conclusion that HIM-5 "coordinates the activities of the different accessory sub-groups" is not supported by data presented here or elsewhere.

      Like most other studies that have examined DSB formation in C. elegans, this work relies on indirect assays, here limited to the cytological appearance of RAD-51 foci and bivalent chromosomes, as evidence of break formation or lack thereof. Unfortunately, neither of these assays has the power to reveal the genome-wide distribution or number of breaks. These assays have additional caveats, due to the fact that RAD-51 association with recombination intermediates and successful crossover formation both require multiple steps downstream of DSB induction, some of which are likely impaired in some of the mutants analyzed here. This severely limits the conclusions that can be drawn. Given that the goal of the work is to understand the effects of individual factors on DSB induction, direct physical assays for DSBs should be applied; many such assays have been developed and used successfully in other organisms.

      Throughout the manuscript, the writing conflates the roles played by different factors that affect DSB formation in very different ways. XND-1 and HIM-17 have previously been shown to be transcription factors that promote the expression of many germline genes, including genes encoding proteins that directly promote DSBs. Mutations in either xnd-1 or him-17 result in dysregulation of germline gene expression and pleiotropic defects in meiosis and fertility, including changes in chromatin structure, dysregulation of meiotic progression, and (for xnd-1) progressive loss of germline immortality. It is thus misleading to refer to HIM-17 and XND-1 as DSB "accessory factors" or to lump their activities with those of other proteins that are likely to play more direct roles in DSB induction. For example, statements such as the following sentence in the Introduction should be omitted or explained more clearly: "xnd-1 is also unique among the accessory factors in influencing the timing of DSBs; in the absence of xnd-1, there is precocious and rapid accumulation of DSBs as monitored by the accumulation of the HR strand-exchange protein RAD-51." The evidence that HIM-17 promotes the expression of him-5 presented here corroborates data from other publications, notably the recent work of Carelli et al. (2022), but this conclusion should not be presented as novel here. The other factors also fall into several different functional classes, some of which are relatively well understood, based largely on studies in other organisms. The roles of RAD-50 and MRE-11 in DSB induction have been investigated in yeast and other organisms as well as in several prior studies in C. elegans. DSB-1, DSB-2, and DSB-3 are homologs of relatively well-studied meiotic proteins in other organisms (Rec114 and Mei4) that directly promote the activity of Spo11, although the mechanism by which they do so is still unclear. 
Mutations in PARG-1 (a Poly-ADP ribose glycohydrolase) likely affect the regulation of poly-ADP-ribose addition and removal at sites of DSBs, which in turn are thought to regulate chromatin structure and recruitment of repair factors; however, there is no convincing evidence that PARG-1 directly affects break formation. CEP-1 is a homolog of p53 and is involved in the DNA damage response in the germline, but again is unlikely to directly contribute to DSB induction. HIM-5 and REC-1 do not have apparent homologs in other organisms and play poorly understood roles in promoting DSB induction. A mechanistic understanding of their functions would be of value to the field, but the current work does not shed light on this. A previous paper (Chung et al. G&D 2015) concluded that HIM-5 and REC-1 are paralogs arising from a recent gene duplication, based on genetic evidence for a partially overlapping role in DSB induction, as well as an argument based on the genomic location of these genes in different species; however, these proteins lack any detectable sequence homology and their predicted structures are also dissimilar (both are largely unstructured but REC-1 contains a predicted helical bundle lacking in HIM-5). Moreover, the data presented here do not reveal overlapping sets of genetic or physical interactions for the two genes/proteins. Thus, this earlier conclusion was likely incorrect, and this idea should not be restated uncritically here or used as a basis to interpret phenotypes.

      DSB-1 was previously reported to be strictly required for all DSB and CO formation in C. elegans. Here the authors test whether the expression of HIM-5 from the pie-1 promoter can rescue DSB formation in dsb-1 mutants, and claim to see some rescue, based on an increase in the number of nuclei with one apparent bivalent (Figure 2C). This result seems to be the basis for the claim that HIM-5 coordinates the activities of other DSB proteins. However, this assay is not informative, and the conclusion is almost certainly incorrect. Notably, a substantial number of nuclei in the dsb-1 mutant (without Ppie-1::him-5) are reported as displaying a single bivalent (11 DAPI staining bodies) despite prior evidence that DSBs are absent in dsb-1 mutants; this suggests that the way the assay was performed resulted in false positives (bivalents that are not actually bivalents), likely due to inclusion of nuclei in which univalents could not be unambiguously resolved in the microscope. A slightly higher level of nuclei with a single unresolved pair of chromosomes in the dsb-1; Ppie-1::him-5 strain is thus not convincing evidence for rescue of DSBs/CO formation, and no evidence is presented that these putative COs are X-specific. The authors should provide additional experimental evidence - e.g., detection of RAD-51 and/or COSA-1 foci or genetic evidence of recombination - or remove this claim. The evidence that expression of Ppie-1::him-5 may partially rescue DSB abundance in dsb-2 mutants is hard to interpret since it is currently unknown why C. elegans expresses 2 paralogs of Rec114 (DSB-1 and DSB-2), and the age-dependent reduction of DSBs in dsb-2 mutants is not understood.

      Several of the factors analyzed here, including XND-1, HIM-17, HIM-5, DSB-1, DSB-2, and DSB-3, have been shown to localize broadly to chromatin in meiotic cells. Co-immunoprecipitation of pairs of these factors, even following benzonase digestion, is not strong evidence to support a direct physical interaction between proteins. Similarly, the super-resolution analysis of XND-1 and HIM-17 (Figure 1EF) does not reveal whether these proteins physically interact with each other, and does not add to our understanding of these proteins' functions, since they are already known to bind to many of the same promoters. Promoters are also likely to be located in chromatin loops away from the chromosome axis, so in this respect, the localization data are also confirmatory rather than novel.

      The phenotypic analysis of double mutant combinations does not seem informative. A major problem is that these different strains were only assayed for bivalent formation, which (as mentioned above) requires several steps downstream of DSB induction. Additionally, the basis for many of the single mutant phenotypes is not well understood, making it particularly challenging to interpret the effects of double mutants. Further, some of the interactions described as "synergistic" appear to be additive, not synergistic. While additive effects can be used as evidence that two genes work in different pathways, this can also be very misleading, especially when the function of individual proteins is unknown. I find that the classification of genes into "epistasis groups" based on this analysis does not shed light on their functions and indeed seems in some cases to contradict what is known about their functions.

      The yeast two-hybrid (Y2H) data are only presented as a single colony. While it is understandable to use a 'representative' colony, it is ideal to include a dilution series for the various interactions, which is how Y2H data are typically shown.

      Additional (relatively minor) concerns about these data:

      (1) Several interactions reported here seem to be detected in only one direction - e.g., MRE-11-AD/HIM-5-BD, REC-1-AD/XND-1-BD, and XND-1-AD/HIM-17-BD - while no interactions are seen with the reciprocal pairs of fusion proteins. I'm not sure if some of this is due to pasting "positive" colony images into the wrong position in the grid, but this should be addressed.

      (2) DSB-3 was only assayed in pairwise combinations with a subset of other proteins; this should be explained; it is also unclear why the interaction grids are not symmetrical about the diagonal.

      (3) I don't understand why the graphic summaries of Y2H data are split among 3 different figures (1, 2, and 3).

    1. eLife assessment

      Using experiments in the whitefly, this manuscript provides evidence that the bacterial symbiont Wolbachia can be transmitted from parasitoid wasps to their insect hosts. Characterizing the transfer of Wolbachia between insect species is a valuable attempt to explain the widespread distribution of this intracellular bacterium. This paper is incomplete as it does not furnish sufficient data to support several of its claims, for which additional methods and data are necessary.

    2. Reviewer #1 (Public Review):

      Summary and Strengths:

      The ability of Wolbachia to be transmitted horizontally during parasitoid wasp infections is supported by phylogenetic data here and elsewhere. Experimental analyses have shown evidence of wasp-to-wasp transmission during coinfection (e.g., Huigens et al.), host-to-wasp transmission (e.g., Heath et al.), and mechanical ('dirty needle') transmission from host to host (Ahmed et al.). To my knowledge, this manuscript provides the first experimental evidence of wasp-to-host transmission. Given the strong phylogenetic pattern of host-parasitoid Wolbachia sharing, this may be of general importance in explaining the distribution of Wolbachia across arthropods. This is of interest as Wolbachia is extremely common in the natural world and influences many aspects of host biology.

      Weaknesses:

      The first observation of the manuscript is that the Wolbachia strains in hosts are more closely related to those in their parasitoids. This has been reported on multiple occasions before, dating back to the late 1990s. The introduction cites five such papers (the observation is made in other studies too that could be cited) but then dismisses them by stating "However, without quantitative tests, this observation could simply reflect a bias in research focus." As these studies include carefully collected datasets that were analysed appropriately, I felt this claim of novelty was rather strong. It is unclear why downloading every sequence in GenBank avoids any perceived biases, when presumably the authors are reanalysing the data in these papers.

      I do not doubt the observation that host-parasitoid pairs tend to share related Wolbachia, as it is corroborated by other studies, the effect size is large, and the case study of whitefly is clearcut. It is also novel to do this analysis on such a large dataset. However, the statistical analysis used is incorrect as the observations are pseudo-replicated due to phylogenetic non-independence. When analysing comparative data like this it is essential to correct for the confounding effects of related species tending to be similar due to common ancestry. In this case, it is well-known that this is an issue as it is a repeated observation that related hosts are infected by related Wolbachia. However, the authors treat every pairwise combination of species (nearly a million pairs) as an independent observation. Addressing this issue is made more complex because there are both the host and symbiont trees to consider. The additional analysis in lines 123-124 (including shuffling species pairs) does not explicitly address this issue.
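      To make the pseudo-replication concern concrete: all-against-all comparison of n species yields n(n-1)/2 pairs, so "nearly a million pairs" corresponds to only about 1,400 species, and every species contributes to many pairs. The sketch below is illustrative only. It uses a Mantel-style permutation, which shuffles species labels rather than individual pairs; this is a swapped-in technique chosen for brevity, not the authors' analysis and not a full phylogenetic correction, since it addresses pairs sharing a species but not shared ancestry among species. The function names and toy distance matrices are hypothetical.

```python
import random
from itertools import combinations


def n_pairs(n_species):
    """Number of unordered species pairs in an all-against-all comparison."""
    return n_species * (n_species - 1) // 2


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def mantel(a, b, n_perm=999, seed=0):
    """Mantel-style test: correlate two symmetric distance matrices and
    build the null by permuting species labels of b (rows and columns
    together), so pairs sharing a species are shuffled as a unit."""
    n = len(a)
    idx = list(combinations(range(n), 2))
    x = [a[i][j] for i, j in idx]

    def stat(order):
        return pearson(x, [b[order[i]][order[j]] for i, j in idx])

    observed = stat(list(range(n)))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        if stat(order) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


def toy_matrix(n, seed):
    """Random symmetric matrix with a zero diagonal (toy data only)."""
    rng = random.Random(seed)
    m = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        m[i][j] = m[j][i] = rng.random()
    return m


print(n_pairs(1414))  # close to one million pairs from about 1,400 species
host_dist = toy_matrix(6, seed=1)
obs, p = mantel(host_dist, host_dist)  # identical matrices: perfect match
print(obs, p)
```

      The effective sample size in such a design is governed by the number of species (and, after phylogenetic correction, by the number of independent lineages), not by the number of pairs.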

      The sharing of Wolbachia between whitefly and their parasitoids is very striking, although this has been reported before (eg the authors recently published a paper entitled "Diversity and Phylogenetic Analyses Reveal Horizontal Transmission of Endosymbionts Between Whiteflies and Their Parasitoids"). In Lines 154-164 it is suggested that from the tree the direction of transfer between host and parasitoid can be inferred from the data. This is not obvious to me given the poor resolution of the tree due to low sequence divergence. There are established statistical approaches to test the direction of trait changes on a tree that could have been used (a common approach is to use the software BEAST).

    3. Reviewer #2 (Public Review):

      The paper by Yan et al. aims to provide evidence for horizontal transmission of the intracellular bacterial symbiont Wolbachia from parasitoid wasps to their whitefly hosts. In my opinion, the paper in its current form contains major flaws.

      Weaknesses:

      The dogma in the field is that although horizontal transmission events of Wolbachia occur, in most systems they are so rare that the chances of observing them in the lab are very slim. For the idea of bacteria moving from a parasitoid to its host, the authors have rightfully cited the paper by Hughes, et al. (2001), which presents the main arguments against the possibility of documenting such transmissions. Thus, if the authors want to provide data that contradict the large volume of evidence showing the opposite, they should present a very strong case.

      In my opinion, the paper fails to provide such concrete evidence. Moreover, it seems the work presented does not meet the basic scientific standards.

      My main reservations are:

      - I think the distribution pattern of bacteria stained by the probes in the FISH pictures presented in Figure 4 looks very much like Portiera, the primary symbiont found in the bacteriome of all whitefly species. In order to make a strong case, the authors need to include Portiera probes along with the Wolbachia ones.

      - If I understand the methods correctly, the phylogeny presented in Figure 2a is supposed to be based on a wide search for Wolbachia wsp gene done on the NCBI dataset (p. 348). However, when I checked the origin of some of the sequences used in the tree to show the similarity of Wolbachia between Bemisia tabaci and its parasitoids, I found that most of them were deposited by the authors themselves in the course of the current study (I could not find this mentioned in the text), or originated in a couple of papers that in my opinion should not have been published to begin with.

      - The authors fail to discuss or even acknowledge a number of published studies that specifically show no horizontal transmission, such as the one claimed to be detected in the study presented.

    4. Reviewer #3 (Public Review):

      This is a very ordinary research paper. The horizontal transmission of endosymbionts, including Wolbachia and Rickettsia, has been reported in detail over the last 10 years, and parasitoid-vectored as well as plant-vectored horizontal transmission is the mainstream of research. For example, Ahmed et al. 2013 PLoS One, 2015 PLoS Pathogens, Chiel et al. 2014 Environmental Entomology, Ahmed et al. 2016 BMC Evolutionary Biology, Qi et al. 2019 JEE, and Liu et al. 2023 Frontiers in Cellular and Infection Microbiology all reported parasitoid-vectored horizontal transmission of endosymbionts, while Caspi-Fluger et al. 2012 Proc Roy Soc B, Chrostek et al. 2017 Frontiers in Microbiology, Li et al. 2017 ISME Journal, Li et al. 2017 FEMS, and Shi et al. 2024 mBio all reported plant-vectored horizontal transmission of endosymbionts. For the effects of endosymbionts on the biology of the host, Ahmed et al. 2015 PLoS Pathogens explained the effects in detail.

      Weaknesses:

      In the current study, the authors downloaded the MLST or wsp genes from a public database and analyzed the data using other methods. I think the authors may not be familiar with the research progress in the field of insect symbiont transmission, and the manuscript in its current form lacks sufficient novelty.

    1. eLife assessment

      This manuscript presents experiments that address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly significant in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents solid evidence for a valuable finding of pharmacological mapping of the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

    2. Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic and chemogenetic approaches, which present with their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostro-caudal axis where the expiratory oscillator is located. The authors used a bicuculline (a GABA-A receptor antagonist) and fluorobeads solution at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation, and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Although a caveat of the approach is that bicuculline injections have indiscriminate effects on other neuronal populations in the region (GABAergic, glycinergic, and glutamatergic), the results can largely be interpreted as showing that modulation of neuronal populations in different regions of the pFL has differential effects on expiratory output. This limitation of the pharmacological approach has also been aptly discussed by the authors.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the functional heterogeneity in the functional response of this anatomical structure.

    3. Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostro-caudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA currents, thus insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insight into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it still provides valuable insights into this region's complex role in controlling breathing.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript presents a solid and generally convincing set of experiments to address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly important in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents a valuable finding by pharmacologically mapping the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

      Thanks for the positive feedback on our study, as well as the assessment of the novelty of our investigation and the advancements to the field that these results will bring in the future.

      We have addressed the specific comments and made changes to the manuscript as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic, and chemogenetic approaches, which present their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostrocaudal axis where the expiratory oscillator is located. The authors used a bicuculline (a GABA-A receptor antagonist) and fluorobeads solution at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods, and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the functional heterogeneity in the functional response of this anatomical structure.

      Thanks for the positive feedback on the results presented in the current manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Pisanski and colleagues map regions of the brainstem that produce the rhythm for active expiratory breathing movements and influence their motor patterns. While the neural origins of inspiration are very well understood, the neural bases for expiration lag considerably. The problem is important and new knowledge pertaining to the neural origins of expiration is welcome.

      The authors perturb the parafacial lateral (pFL) respiratory group of the brainstem with microinjection of bicuculline, to elucidate how disinhibition in specific locations of the pFL influences active expiration (and breathing in general) in anesthetized rats. They provide valuable, if not definitive, evidence that the borders of the pFL appear to extend more rostrally than previously appreciated. Prior research suggests that the expiratory pFL exists at the caudal pole of the facial cranial nucleus (VIIc). Here, the authors show that its borders probably extend as much as 1 mm rostral to VIIc. The evidence is convincing albeit with caveats.

      Strengths:

      The authors achieve their aim in terms of showing that the borders of the expiratory pFL are not well understood at present and that it (the pFL) extends more rostrally. The results support that point. The data are strong enough to cause many respiratory neurobiologists to look at the sites rostral to the VIIc for expiratory rhythmogenic neurons and characterize their properties and mechanisms. At present my view is that most respiratory neurobiologists overlook the regions rostral to VIIc in their studies of expiratory rhythm and pattern.

      Weaknesses:

      The injection of bicuculline has indiscriminate effects on excitatory and inhibitory neurons, and the parafacial region is populated by excitatory neurons that are expiratory rhythmogenic and GABAergic and glycinergic neurons whose roles in producing active expiration are contradictory (Flor et al. J Physiol, 2020, DOI: 10.1113/JP280243). It remains unclear how the microinjections of bicuculline differentially affect all three populations. A more selective approach would be able to disinhibit the populations separately. Nevertheless, for the main point at hand, the data do suggest that we should reconsider the borders of the expiratory pFL nucleus and begin to examine its physiology up to 1 mm rostral to VIIc.

      The control experiment showed that bicuculline microinjections induced cFos expression in the pFL, which is good, but again we don't know which neurons were disinhibited: glutamatergic, GABAergic, or glycinergic.

      Thanks for sharing your excitement about the results of our study and for appreciating the thorough investigation performed with the use of bicuculline, an approach that was originally used in Pagliardini et al. (2011; PMID: 21414911) and then used by many other groups to generate and study active expiration in vivo.

      In the current study we used the well-known effect of bicuculline to systematically test the area that is most sensitive to this pharmacological effect and hence may be the core for generating active expiration. While the use of GABA receptor antagonists may have an indiscriminate effect on GABA receptor-expressing neurons with various phenotypes, anatomical assessment of inhibitory cells has shown very little distribution of GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139), and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017, PMID: 28004411; Silva et al., 2016, PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype. We can’t exclude that a small fraction of neurons in the pFL area are inhibitory and that they could influence recruitment of adjacent late-E expiratory neurons. A more selective activation of neuronal populations with different phenotypes would indeed be interesting. Nonetheless, if local inhibitory neurons have a role in the generation of active expiration, then their disinhibition could have either an inhibitory effect on late-E activity or stimulate expiration in a more indirect fashion.

      While the effect of bicuculline on active expiration has been reported and replicated in multiple manuscripts, the source of inhibition across different phases of the respiratory cycle is still under investigation. Some studies suggest that GABAergic and glycinergic inhibition does not originate in the pFL but rather in the BötC and preBötC areas (Flor et al., 2020, PMID: 32621515; Magalhaes et al., 2021; PMID: 34510468), and the effects of this inhibition across the respiratory cycle are debated. Future studies will be key to identify the source of pFL inhibition.

      The manuscript characterizes how bicuculline microinjections affect breathing parameters such as tidal volume, frequency, ventilation, inspiratory and expiratory time, as well as oxygen consumption. Those aspects of the manuscript are a bit tedious and sometimes overanalyzed. Plus, there was no predictive framework established at the outset for how one should expect disinhibition to affect breathing parameters. In other words, if the authors are seeking to map the pFL borders, then why analyze the breathing patterns so much? Does doing so provide more insight into the borders of pFL? I did not think it was compellingly argued.

      We have edited the introduction to address this comment and emphasize the rationale for the study. We also edited the results section to summarize our findings.

      We continue to report our in-depth analysis of the perturbations induced by bicuculline injection over the various respiratory characteristics as this will be fundamental to determine the effects of our experiment not only on the activation of pFL and active expiration, but also on the respiratory network in general. In order to be fair and open about our findings we have reported the results of our analysis in detail. Of note, all sites generated active expiration, but since the objective of the study was to determine the sites with the most significant changes, a finer and multilevel analysis has been used.

      Further, lines 382-386 make a point about decreasing inspiratory time even though the data do not meet the statistical threshold. In lines 386-395, the reporting appears to reach significance (line 388) but not reach significance (line 389). I had trouble making sense of that disparity.

      The statistics were confirmed, and the lines edited as follows: “Interestingly, the duration of inspiration during the response was found to decrease in all groups relative to baseline respiration (Ti response = 0.279 ± 0.034s, Ti baseline = 0.318 ± 0.043s, Wilcoxon rank sum: Z = 3.24, p = 0.001). Contrary to this decrease in inspiratory duration, the total expiratory time was observed to increase in all groups and remained elevated compared to baseline (TE response = 1.313 ± 0.188s, TE baseline = 1.029 ± 0.161s, Wilcoxon rank sum: Z = 4.49, p = 0.001).”

      The other statistical hiccups include "tended towards significance" (line 454), "were found to only reach significance for a short portion of the response" (line 486-7), "did not reach the level of significance" (line 506), which gives one the sense of cherry picking or over-analysis. Frankly, this reviewer finds the paper much more compelling when just asking whether the microinjections evoke active expiration. If yes, then the site is probably part of the pFL.

      Statistical “tendencies” have been eliminated throughout the manuscript.

      We have analyzed our results in detail in order to determine changes and differential effects on respiration when comparing the 5 injection sites. Although the presentation of the results may seem tedious, it has allowed us to highlight some interesting effects. The first concerns respiratory frequency. It has been shown in the past that optogenetic stimulation of this area causes an increase in respiratory frequency (Pagliardini et al., 2011, PMID: 21414911), whereas disinhibition with this same approach or stimulation of AMPA receptors in the pFL has shown a reduction in frequency or no significant change in the response (Pagliardini et al., 2011, PMID: 21414911; Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151). Here, we suggest that the reduction in respiratory frequency is observed only at the caudal sites and could be attributed to BötC effects rather than stimulation of the core of the pFL, since no respiratory change was observed where the effect was more potent (rostral side). Another interesting point concerns the effects on O2 consumption: although difficult to interpret at this point, we found it very interesting that hyperventilation occurred only at the most rostral injection sites.

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size.

      Thank you for the feedback on our description of the statistical results and the suggestion of incorporating effect size. We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
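
      As a minimal sketch of what eta squared measures in a one-way design, it is the between-group sum of squares divided by the total sum of squares. The function name and the numbers below are hypothetical and purely illustrative; they are not taken from the manuscript or its analysis scripts.

```python
def eta_squared(groups):
    """Eta squared for a one-way ANOVA layout: SS_between / SS_total.

    `groups` is a list of lists, one list of measurements per group
    (here, hypothetically, one list per injection site).
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Total variability of every observation around the grand mean
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)

    # Variability of the group means around the grand mean, weighted by group size
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Illustrative numbers only (not data from the study)
sites = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(eta_squared(sites))
```

      A value close to 1 means that most of the variance is explained by group membership, while a value close to 0 means that the groups differ little relative to the spread within them.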

      Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostrocaudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA current, thus insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insights into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it provides valuable insights into this region's complex role in controlling breathing.

      We have included a reflection on the weaknesses of our study in the technical considerations section to address the possibility that bicuculline may induce active expiration through other mechanisms. Please note that the use of bicuculline was not to gain further insight into GABAergic inhibition of the pFL but to adopt a tool to generate active expiration that has been extensively validated by our group and others.

      Multiple studies have shown recruitment of excitatory late expiratory neurons with bicuculline injections. Although we did not record from late-E neurons in this study, we infer from the body of literature that disinhibition of neurons in this area will activate late-E neurons (as previously demonstrated) and generate active expiration. Although we see value in recording activity of single neurons (especially to study mechanisms of rhythmogenesis), we opted to measure the physiological response from respiratory muscles as an indication of active expiration recruitment in vivo. Recording from single neurons after bicuculline injections in each site would confirm the presence of expiratory neurons along the parafacial area, which is probably not surprising, since every site tested promoted active expiration. The focus of the study though was to determine the site with the strongest physiological response to disinhibition. Future studies will be key to determine whether all neurons along this column have similar electrophysiological rhythmic properties to the ones recently reported (Magalhaes et al., 2021; PMID: 34510468), or some of them simply provide tonic drive to late-E neurons located elsewhere.

      We have discussed the issue as follows:

      “Our experiments focused on determining the area in the pFL that is most effective in generating active expiration as measured by ABD EMG activity and expiratory flow. We did not attempt to record single cell neuronal activity at various locations as previously shown in other studies (Pagliardini et al 2011; Magalhaes et al., 2021), as this approach would most likely find some late-E neurons across the pFL and thus not effectively discriminate between areas of the pFL. Future studies involving multi-unit recordings or imaging of cell population activities will help to determine the firing pattern and population density of bicuculline-activated cells and further determine differences in distribution and function of late-E neurons across the region of the pFL.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall, the manuscript addresses an important question in the field, the anatomical location of the expiratory oscillator. I commend the authors for a well-thought-out and clearly presented study. However, a few small concerns deserve attention to improve the clarity of the report.

      (1) The figures would benefit from a rostral-to-caudal representation of results instead of a caudal-to-rostral orientation. Example, Figure 2.

      We opted for a caudal to rostral representation to progressively move away from the inspiratory oscillator (preBötC) and the anatomical reference point (the caudal tip of the facial nucleus) with our series of injections. 

      (2) A discussion about how expiratory responses generated by these pharmacological approaches would compare to endogenous baseline conditions. The authors mention that bicuculline injections elicited a late-E downward inflection that was absent in baseline conditions. Thus, this raises the point of how these findings compare to awake freely moving animals or during different conditions of increased ventilatory demand.

      This is an interesting question that has not yet been addressed in the field. As far as we know, there are no recordings of pFL neurons in freely behaving animals, although recordings of pFL late-E neurons under elevated PaCO2 have shown late-E activity in in situ preparations (Britto & Moraes, 2017; PMID: 28004411; Magalhaes et al., 2021; PMID: 34510468).

      We have clarified this in the discussion as follows:

      “At rest, respiratory activity does not present with active expiration (i.e., expiratory flow below its functional residual capacity in conjunction with expiratory-related ABD muscle recruitment) and expiratory flow occurs due to passive recoil of the chest wall with no contribution of abdominal activity. Active expiration and abdominal recruitment can be spontaneously observed during sleep (in particular REM sleep, Andrews and Pagliardini, 2015; Pisanski et al., 2019) and can be triggered during increased respiratory drive (e.g., hypercapnia, RTN stimulation, Abbott et al., 2011). Although never assessed in freely moving, unanesthetized rodents, bicuculline has been extensively used to generate active expiration and late-E neuron activity in both juvenile and adult anesthetized rats (Pagliardini et al., 2011; Huckstepp et al., 2015; Huckstepp et al., 2016; Huckstepp et al., 2018; De Britto and Moraes, 2017; Magalhaes et al., 2021).”

      (3) In Figure 2A, there appears to be an injection site in the top right quadrant of the image, very distant from the intended site. Could the authors confirm if this is an artifact?

      Yes, it is an artifact of image acquisition; we should have marked that in the figure. To avoid confusion and follow other reviewers’ suggestions we have edited the figure.

      (4) A stylistic suggestion would be to include the subpanel of Figure 2C saline control injection as a graph of its own and also include the control anatomical location in 2B.

      Thanks for the suggestion. Because of the complex organization of the figure we opted to leave it as a subpanel in order to not distract the reader from the 5 injection sites, but still provide information about vehicle injection and their lack of changes in respiratory response.

      (5) The authors note that DIAm Area (norm.) during the inspiratory phase is increased in the +6 and +8mm groups. However, Figure 5E shows that the +8mm group is significantly reduced as compared to the +6mm group. Please clarify.

      During the inspiratory phase we did not observe any significant change in the DIA Area (norm.). We realize that the description of this part of the results was confusing and therefore we have eliminated that section.

      Reviewer #2 (Recommendations For The Authors):

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size. There is a valuable editorial in this week's J Physiology (https://doi.org/10.1113/JP285575) that may provide helpful guidance.

      Thanks for these comments and the general assessment. We realized that the results section was dense and contained a lot of information. We significantly slimmed the description of the results in order to facilitate their appreciation and avoid confounding statements about significant vs non-significant results.

      We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
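
      To illustrate why the Mahalanobis distance can be read as a multidimensional effect size, the sketch below computes the distance of a single observation from a cloud of baseline observations, scaled by the baseline covariance. It is restricted to two dimensions for brevity, and the function name, the choice of two dimensions, and the numbers are assumptions made for illustration only; this is not the analysis code used in the study.

```python
import math


def mahalanobis_2d(point, baseline):
    """Mahalanobis distance of a 2-D `point` from 2-D `baseline` observations.

    `baseline` is a list of (x, y) tuples, e.g. hypothetical per-cycle values
    of two respiratory signals before injection.
    """
    n = len(baseline)
    mx = sum(p[0] for p in baseline) / n
    my = sum(p[1] for p in baseline) / n

    # Sample covariance matrix entries (n - 1 denominator)
    sxx = sum((p[0] - mx) ** 2 for p in baseline) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in baseline) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in baseline) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("baseline covariance matrix is singular")

    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)


# Illustrative numbers only (not data from the study)
baseline_cycles = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(mahalanobis_2d((3.0, 1.0), baseline_cycles))
```

      Because the distance is expressed in units of the baseline variability, it is comparable across signals with different physical units, which is the property that makes it usable as an effect size.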

      The equipment and resources should be clearly identified and use RRIDs whenever possible. Resources like antibodies and other reagents (e.g., cryoprotectants) should be identified, not just by manufacturer, but also by specific part or product numbers or identifiers.

      Manuscript has been edited to add these details.

      The manuscript makes reference to ImageJ and Matlab routines, which must be public through GitHub or another stable repository.

      Thanks for pointing this out. ImageJ analysis has been performed following scripts already available to users (no custom scripts). The Matlab scripts used for the multivariate analysis are now available at: https://github.com/mprosteb/Pisanski2024

      The way that ABD-DIA coupling was assessed was unclear from the Methods.

      The following text has been added to the methods: “The coupling between ABD and DIA signals was measured as a ratio and analyzed by quantifying the number of bursts of activity observed for the ABD and DIA EMG signals during the first 10 minutes of the response, excluding time bins at the end of the response (due to fading and waning of the ABD response in those instances).”
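
      One way to read this description is as a ratio of burst counts within a fixed window. The sketch below assumes that burst onset times (in seconds from the start of the response) have already been detected upstream, and that the ratio is expressed as ABD bursts per DIA burst; the function name, the direction of the ratio, and the example values are assumptions for illustration, and the exclusion of fading time bins is not modeled.

```python
def abd_dia_coupling(abd_burst_times, dia_burst_times, window_s=600.0):
    """Ratio of ABD to DIA burst counts within the first `window_s` seconds.

    Both inputs are lists of burst onset times in seconds, measured from the
    start of the response. The default window of 600 s corresponds to the
    first 10 minutes of the response.
    """
    abd_count = sum(1 for t in abd_burst_times if 0.0 <= t < window_s)
    dia_count = sum(1 for t in dia_burst_times if 0.0 <= t < window_s)
    if dia_count == 0:
        raise ValueError("no DIA bursts in the analysis window")
    return abd_count / dia_count


# Illustrative onset times only (not data from the study)
abd = [1.0, 3.0, 5.0, 700.0]
dia = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 650.0]
print(abd_dia_coupling(abd, dia))
```

      Under these assumptions, a ratio of 1 would indicate one ABD burst for every DIA burst, and lower values would indicate that ABD recruitment occurs on only a fraction of respiratory cycles.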

      Fig. 1A was never cited in the text.

      It has been cited now.

      Fig. 1A-C appears to be exactly the same as Fig. 5A-C.

      The reviewer is correct. We have used figure 1 to describe and explain our analytical methods with sample data and Figure 5 describes our results. We have clarified that in: “Figure 5: Rostral injections elicit more prominent changes to respiration in each signal and sub-period. A-C: Is the same as Method Figure 1, has been included here for further clarity when analyzing the results.”

      Late Expiratory airflow is given in units of volts (V) in lines 358-363 (Fig. 4C) but then in units of volts-seconds (V•s) in lines 363-367. Both units are problematic because the voltage is neither an air volume nor an air volume per unit time. Is there some conversion factor left out?

      In this section of the results we describe the changes in expiratory peak amplitude (V) and expiratory peak flow (V•s). Since calibration of airflow was performed on the positive flow and for larger volumes, we prefer to use the original units to guarantee precise assessment of the change and avoid introducing potential errors. Since the analysis considers changes from baseline readings, converting to ml or ml*s would not affect our analysis.

      Reviewer #3 (Recommendations For The Authors):

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in respiratory control, specifically in modulating active expiration. The precise location of this expiratory oscillator within the ventral medulla remains uncertain, with some studies indicating that the caudal tip of the facial nucleus (VIIc) forms the core while others propose more rostral areas. Bicuculline injections were utilized at various pFL sites to explore the impact of these injections on respiration. The authors use innovative and impressive analytic methods to characterize the effect on respiratory activity. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study will contribute to an enhanced understanding of the neural mechanisms controlling active expiration. The main message of the study is that the rostro-caudal organization of the pFL is not simple and uniform. The data provides novel insights into the importance of rostral locations in controlling active expiration (see e.g. lines 738-740).

      The data and results of the paper are intriguing, and it appears that the experiments are well-managed and executed. However, there are several major and minor comments and suggestions that should be addressed by the authors:

      (1) The study relies heavily on local injections into specific areas that are confirmed histologically. One potential concern is the injection volume of 200 nL in such a tiny area. The authors suggest that the drug did not spread to rostral/caudal areas outside the specified coordinate partly based on their cFOS staining. For example, the lack of cFOS activation in TH+ cells and Phox2B cells is interpreted as proof that bicuculline did not spread to these somas (Figure 2). The authors seem to use a similar argument as evidence that the pFL does not include Phox2B neurons in the RTN as discussed in the Discussion section (lines 830-847). However, it is very surprising that bicuculline injections into an area that is known to contain Phox2B and Th+ neurons do not activate these neurons as assessed by the cFOS staining. It seems puzzling to me that none of their injections shown in Figure 2 activated Phox2B or Th neurons. I assume that in targeting the pFL the authors must have sometimes hit areas that included neurons that define the RTN, which would have activated Phox2B or Th+ neurons. Did the authors find that these activations did not activate active expiration? Such negative "controls" would strengthen their argument that pFL is a separate and distinct region that selectively controls active expiration.

      Thanks for the positive feedback on the manuscript. As demonstrated and discussed in several previous publications, the PHOX2B-expressing neurons in this area of the brain comprise the Neuromedin B-positive neurons of the RTN (more densely located in the ventral parafacial than in the lateral parafacial, our site of injection), the TH+ C1 neurons (located somewhat more caudal and medial to our sites of injection, around the BötC/preBötC area), and the large facial motoneurons (easily identifiable by their large size and compact location). Given this differential spatial distribution, and the controls described below, we believe we have reduced the possibility of direct activation of these neurons, although we can’t exclude it entirely.

      There is now strong evidence for a lack of PHOX2B expression in late-E neurons in juvenile and adult rats (Magalhaes et al., 2021; PMID: 34510468). We realize that the microinjected solution could potentially diffuse in the brain and reach other areas, but we combined two strategies to verify that the injections were focal and activated only a restricted area of the brain (i.e., the pFL): i) localization of fluorobeads that were diluted in the bicuculline solution; ii) expression of cFos combined with anatomical markers, to identify activated cells. Fluorobeads have a very limited spread in the brain and therefore informed us of the site of the injection, allowing us to differentiate between the five injection locations. Although we can’t assume that bicuculline will have a similar spread (and it will also be quickly degraded in the tissue), the combination of this analysis with the localized expression of cFos has helped us to differentiate between injection sites. Because of the proximity of PHOX2B cells in the RTN and C1 neurons, we also combined cFos expression with immunohistochemistry to determine whether bicuculline activation was also visible in these two neuronal populations. Our results indicate that there is baseline cFos activity in RTN neurons (see vehicle injection), but the fraction of activated PHOX2B cells did not increase with bicuculline injections, suggesting that these neurons were not the target of our injections. Please note that cFos expression has been extensively used to determine RTN neuron activation, especially following chemoreflex responses.

      (2) The authors refer to "the expiratory oscillator" throughout the manuscript (e.g. lines 58, 62, 65) as if there is only one expiratory oscillator i.e. "the expiratory oscillator". For some reason, the authors avoided citing and mentioning PiCo (Anderson et al. 2016), which is considered the oscillator for postinspiration. Since the present study focuses on the role of expiration, and since the authors describe convincing effects on postinspiration, considering this oscillator which is located dorsomedial to the VRC seems relevant for the present study.

      Due to the limited and controversial literature currently describing PiCo as a third oscillator, and the fact that our study does not directly assess post-inspiratory activity (as measured by the V nerve or laryngeal muscles) or PiCo activity and location (which would be even more distant than the RTN, for example), we prefer to avoid commenting on the effects of these injections on PiCo or on the connectivity between PiCo and pFL.

      We have added this to the discussion:

      “Therefore, although it has previously been described, the exact mechanism by which this post-I activity in the ABD muscles is generated is currently unknown. For example, the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (3) The authors do not specify what type of bicuculline they injected. Bicuculline is known to have significant effects on potassium channels. Thus, the effects reported here could be due to a non-specific change in excitability, rather than caused by a specific GABAergic blockade.

      The authors also do not know what effects these injections cause in the neurons in vivo, since the injections are not accompanied by recordings from the respiratory neurons that they activate. This together with the non-specific bicuculline effects will affect the interpretation of the results. Thus, the authors need to be more careful when interpreting their effects as "GABAergic". The use of more specific blockers like gabazine could partly address this concern. The authors have to discuss this in a "limitation section".

      Thanks for pointing that out; we have now clarified in the methods section that we used bicuculline methochloride. We can’t exclude that some side effects could be present due to the use of this drug. For the purpose of this study, though, we focused on using bicuculline as a tool to consistently generate active expiration, since it has been extensively used by multiple laboratories to induce abdominal muscle recruitment and active expiration, as well as to directly record late-E neurons in this same area.

      We have included in the discussion the following statement:

      “Technical considerations

      Bicuculline methiodide has previously been observed to exhibit inhibitory effects on Ca2+-activated K+ currents, inducing non-specific potentiation of NMDA currents (Johnson and Seutin, 1997). Consequently, caution is warranted in attributing our findings solely to the GABAa antagonist properties of bicuculline. Previous work has demonstrated a temporal correlation between the onset of late-E neuron activity in the caudal parafacial region and ABD activity in response to bicuculline (Pagliardini et al., 2011; de Britto and Moraes, 2017; Magalhaes et al., 2021) as well as GABAergic sIPSCs in late-E neurons (Magalhaes et al., 2012). However, it is essential to note that the current study lacks single-unit recordings, preventing us from definitively confirming whether the observed activity stems from late-E neuronal GABAergic disinhibition or excitation through non-GABAergic mechanisms.”

      (4) I also caution the authors when stating that the bicuculline injections will reveal the precise location and functional boundaries of "the" expiratory oscillation within the pFL. Increasing the excitability with bicuculline is inappropriate to study the functional boundaries of an oscillator. It is particularly inappropriate to identify the boundaries of the pFL, a network that is normally inactive and activated only under certain behavioral and metabolic conditions. Because the injections are increasing the neuronal excitability unspecifically, and because the authors are not recording the activity of the neurons in the pFL region it is unclear what kind of neurons are activated. The cFOS staining may help to define whether these neurons are Phox2B or Th positive or negative, but they will not provide insights into the activity patterns of the activated neurons. Thus, it is fair to assume that these injections will likely include also tonic neurons that might indirectly control the activity of pFL neurons under certain metabolic or behavioral conditions without actually being involved in the rhythmogenesis of active expiration. Many of the effects peak after several minutes, and different regions cause differential effects with different time courses, which is difficult to interpret functionally. Thus, the "core" identified in the present study could consist of tonic neurons as opposed to rhythmic neurons generating active expiration.

      We agree with the reviewer that our local injections may have activated a heterogeneous population of neurons. We do not claim that we only activated late-E rhythmogenic neurons, but rather that our multiple sites of injection revealed the area that generates the strongest excitation of ABD muscles and active expiration.

      While the use of GABA receptor antagonists may have an indiscriminate effect on GABA receptor-expressing neurons with various phenotypes, anatomical assessment of inhibitory cells has shown very few GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139), and it has been inferred in multiple publications (Huckstepp et al., 2015; PMID: 25609622; Huckstepp et al., 2016; PMID: 27300271; Huckstepp et al., 2018; PMID: 30096151; Flor et al., 2020; PMID: 32621515; Britto & Moraes, 2017; PMID: 28004411; Silva et al., 2016; PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype.

      As suggested by the reviewer, it is possible that the bicuculline injection activated some tonic, non-rhythmogenic neurons, which could in turn activate the expiratory oscillator located elsewhere.

      We have edited the introduction as follows:

      “By strategically administering localized volumes of bicuculline at multiple rostrocaudal levels of the ventral brainstem, we aimed to selectively enhance the excitability of neurons driving active expiration, thereby revealing the extension of the pharmacological response and the most efficient site in generating active expiration.”

      We have edited the results as follows:

      “Importantly, the group with injection sites at +0.6 mm from VIIc exhibited the swiftest response onset, suggesting that this area is the most critical for the generation of active expiration, either through direct activation of the expiratory oscillator or, alternatively, for providing a strong tonic drive to late-E neurons located elsewhere.”

      In the introduction, it should also be emphasized that the pharmacological approach used in the present study complements the existing elegant chemogenetic studies, rather than emphasizing primarily the limitations of the chemogenetic inhibitions. The conclusion should be that these studies together provide different, yet complementary insights: The chemogenetic approach by inhibiting neurons, the present study by exciting neurons, and all studies come with their own limitations.

      Thanks for the suggestion, we have updated the manuscript as follows:

      “Although both of these elegant chemogenetic studies have contributed extensively to our understanding of the pFL, the existing evidence suggests that the expiratory oscillator may expand beyond the limits of the viral expression achieved in said studies, as proposed by Huckstepp et al. (2015).”

      Throughout the manuscript, the authors have to be cautious when implying that an excitatory effect relates to the activity of rhythmogenic pFL neurons. For example, on line 710 the authors state that "it is conceivable to infer that the rostral pFL is in the closest proximity to the cells responsible for the generation of active expiration". While it may indeed be "conceivable", the bicuculline injections themselves provide no insights into the location of neurons responsible for rhythmogenesis. It is equally "conceivable" that the excited neurons provide a tonic drive to the neurons without being involved in the generation of active expiration. These tonic neurons could be located at a distance from the presumed rhythmogenic core.

      We have included the possibility of tonic excitation in the technical considerations section:

      “However, our study did not include recording from late-E neurons following bicuculline injections, preventing us from definitively confirming whether the observed activity stems from late-E neuronal excitation or the potentiation of a tonic drive, particularly in the rostral areas.”

      (5) It is intriguing that some of their injections (Fig.2D) evoked postinspiratory activity. This interesting finding should be discussed as it could provide important insights into the coordination of the different phases of expiration.

      Thanks for the suggestion. We have included the following to the discussion:

      “Therefore, although it has previously been described, the exact mechanism by which this post-I ABD activity is generated is unclear. This late-E/post-I pattern of activity is similar to what has been observed in in vitro preparations and in vivo recordings in juvenile rats (Janczewski et al., 2002; Janczewski et al., 2006).

      “Therefore, although it has previously been described, the exact mechanism by which this post-I activity in the ABD muscles is generated is currently unknown. For example, the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (6) The authors conducted bilateral disinhibition of the pFL, but only a unilateral photomicrograph was shown. Figure 2 should include a representative bilateral photomicrograph along with a scatter plot for clarity and completeness.

      We have edited figure 2 to include representative images of bilateral injections.

      (7) Regarding the Bicuculline injections in the Methods section: Aside from specifying exactly what type of bicuculline was used, the authors should provide more information about the pFL location and landmarks used, including the missing medial-lateral coordinate. The fluorobead spread of approximately 300 µm, as observed in Figure 2C, is crucial for the interpretation of the results and should be detailed. An alternative approach could involve e.g. calculating the area covered by fluorobeads in each group.

      We have included the following in the text:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5)”

      “These findings strongly suggest that bicuculline specifically activated cells within the vicinity of the injection sites which spread ~300 µm (Figure 2C, horizontal lines) and did not activate PHOX2B+ cells in the RTN area, beyond their baseline level of activity.”

      (8) In the Experimental Protocol, the authors should provide more details on how the parameters were determined. For example, specify the number of cycles included for Dia frequency/amplitude, Abd frequency/amplitude, and with regards to the averaging process, the authors should specify over how many cycles they obtained an average for Dia/Abd activity time and AUC. The authors should also provide information on the number of bicuculline injections that they repeated to average these values and they should report the coefficient of variation for repeated injections. Please clarify the method used to calculate AUC, considering the non-linear nature of the activity.

      Only one bicuculline injection per rat was performed and the number of rats used for each injection site is indicated in the methods as follows:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5), and CTRL (n=7). We recorded the physiological responses to the injection for 20-25 min.”

      We have clarified in the methods section the following:

      “Respiratory data was tracked in time bins of 2-minute duration from the baseline period prior to injections and spanned 20 min of recording post-injection. Mean-cycle measurements for each signal were computed by averaging values across all cycles within a given time bin.”

      Additional clarifications have been added:

      “We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”

      “Mean-cycle measurements for each signal were computed by averaging across all cycles within a given time bin (~300 cycles in baseline, ~100 cycles per response time bin). We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”

      “The Area under the curve (AUC) was measured during baseline and was subtracted from the corresponding AUC of the response for each time bin (Figure 1C). This AUC measure was computed as the sum of the signal in a given respiratory phase as all signals were sampled at the same rate. Note that areas calculated below the zero- (0) line, as would be expected from a negative airflow during expiration, yield negative AUC values.”
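
      The binning, peak-response, and AUC steps quoted above can be sketched as follows. This is a minimal illustration and not the authors' Matlab code: the function names are hypothetical, "relative to baseline" is assumed here to mean a ratio to the baseline mean, and the data are invented for the example.

```python
from statistics import mean

BIN_SECONDS = 120  # 2-minute time bins, as stated in the quoted methods


def bin_means(cycle_times, cycle_values, start, stop, bin_seconds=BIN_SECONDS):
    """Average per-cycle values within consecutive time bins covering [start, stop)."""
    n_bins = int((stop - start) // bin_seconds)
    bins = [[] for _ in range(n_bins)]
    for t, v in zip(cycle_times, cycle_values):
        idx = int((t - start) // bin_seconds)
        if 0 <= idx < n_bins:
            bins[idx].append(v)
    # Empty bins are reported as None rather than silently dropped.
    return [mean(b) if b else None for b in bins]


def peak_response(baseline_mean, response_means):
    """Return (bin index, value relative to baseline) for the strongest change."""
    relative = [(i, m / baseline_mean)
                for i, m in enumerate(response_means) if m is not None]
    return max(relative, key=lambda item: abs(item[1] - 1.0))


def baseline_subtracted_auc(response_samples, baseline_samples):
    """AUC as a plain sum of samples (uniform sampling rate), minus baseline AUC."""
    return sum(response_samples) - sum(baseline_samples)


# Hypothetical data: cycle onset times (s) and one per-cycle measurement.
times = [10, 50, 130, 200, 250, 300]
values = [2.0, 2.0, 4.0, 4.0, 3.0, 3.0]
means = bin_means(times, values, start=0, stop=360)
print(means)                      # [2.0, 4.0, 3.0]
print(peak_response(2.0, means))  # (1, 2.0)

# Expiratory airflow lies below the zero line, so its AUC is negative.
print(baseline_subtracted_auc([-0.5, -1.0, -0.5], [-0.25, -0.5, -0.25]))  # -1.0
```

      Summing samples stands in for integration only because every signal shares one sampling rate; with mixed rates each sum would need to be multiplied by its own sampling interval.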

      (9) The authors should explain how oxygen consumption was calculated: did it involve the Depocas & Hart (1957) formula? Please provide information on expiratory CO2, whether ventilation was adjusted to achieve consistent CO2 levels across animals, and ideally specify the end-tidal CO2 range for the experiments. Discuss the rationale behind the chosen CO2 levels and whether CO2-dependent pFL activity could have influenced results.

      We have clarified the measurement in the methods as follows:

      “The gas analyzer measured fractional concentration of O2. Based on this and the flow rate at the level of the trachea (minute ventilation), we calculated O2 consumption according to Depocas and Hart (1957).”
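
      For orientation, the calculation can be sketched with the open-circuit expression commonly attributed to Depocas and Hart (1957), VO2 = flow × (FiO2 − FeO2) / (1 − FiO2). This is an assumption on our part: the response does not state which variant of the equation was applied, nor the gas-handling conditions it presumes (for example, CO2 removal before the O2 sensor). The function name and the numbers are hypothetical.

```python
def vo2_open_circuit(flow, f_in_o2, f_out_o2):
    """Oxygen consumption from flow and inlet/outlet O2 fractions.

    Assumed form: VO2 = flow * (FiO2 - FeO2) / (1 - FiO2).
    The result carries the units of `flow` (e.g. ml/min).
    """
    if not 0.0 <= f_out_o2 <= f_in_o2 < 1.0:
        raise ValueError("expected 0 <= FeO2 <= FiO2 < 1")
    return flow * (f_in_o2 - f_out_o2) / (1.0 - f_in_o2)


# Hypothetical values: 100 ml/min flow, room air in, 1% O2 extraction.
print(round(vo2_open_circuit(100.0, 0.2095, 0.1995), 3))  # 1.265
```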

      We have also added to the methods section:

      “During the entire experimental procedure, rats breathed spontaneously and end tidal CO2 was not adjusted through the experimental protocol.”

      Regarding the possibility that CO2-dependent pFL activity influenced the results: because we induced active expiration under conditions with no physiological demand for it (i.e., no hypoxia or hypercapnia), pCO2 was likely reduced, which would decrease the overall drive for ABD activity. Our results are therefore likely an underestimate of the response that would have been produced had CO2 levels been held constant.

      (10) The authors should address the discrepancy in fos-activated neurons between the control (44 neurons) and experimental animals (90-120 neurons per hemisection). Please explain the activation in the control group. Please also provide insights into how the authors interpret this difference in cfos-activated neurons between control and experimental groups.

      The following paragraph has been added to the discussion:

      “The assessment of cellular activity, quantified through cFos staining, unveiled the existence of basal activity in control rats. This observed baseline activity is likely emanating from subthreshold physiological processes within the parafacial area which do not culminate in ABD activity. Analysis of the cFos staining confirmed focal activation of neurons in the pFL of rats injected with bicuculline and minimal cFos expression in the PHOX2B+ cells in all groups as compared to the control group. These results confirm the very limited mediolateral spread of the drug from the core site of injection and back previous findings supporting the hypothesis that the majority of PHOX2B+ cells are more ventrally located in the parafacial area (pFV, Huckstepp et al., 2015) and PHOX2B+ cell recruitment is not necessary for active expiration (de Britto & Moraes, 2017; Magalhães et al., 2021).”

      (11) In Figure 8, the authors plotted the relationship of each cycle correlated to the normalized area. Have you also calculated the same late-E, inspiratory, and post-I to fR or VT separately?

      No, we performed the separate breathing-phase (late-E, I, post-I) analysis only in the calculations of the DIA, airflow, and ABD area, as well as for the Euclidean and Mahalanobis distances.
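
      For readers unfamiliar with the two distance measures named above, the following is a minimal two-feature sketch of how a response point can be compared with a baseline cloud. It is not taken from the authors' Matlab scripts: the function names, the restriction to two features, and the data are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def euclidean(x, center):
    """Straight-line distance between a point and a reference point."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, center)))


def mahalanobis_2d(x, samples):
    """Mahalanobis distance of point x from a two-feature baseline sample."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = mean(xs), mean(ys)
    n = len(samples)
    # Sample covariance matrix entries (n - 1 denominator).
    sxx = sum((a - mx) ** 2 for a in xs) / (n - 1)
    syy = sum((b - my) ** 2 for b in ys) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("baseline covariance is singular")
    dx, dy = x[0] - mx, x[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T, using the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(d2)


# Hypothetical baseline cloud centred on (1, 1) and one response point.
baseline = [(0, 0), (2, 0), (0, 2), (2, 2)]
point = (3, 1)
print(euclidean(point, (1, 1)))                   # 2.0
print(round(mahalanobis_2d(point, baseline), 3))  # 1.732
```

      Unlike the Euclidean distance, the Mahalanobis distance rescales each direction by the baseline variability, so a change is judged against how much that variable normally fluctuates.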

      Minor comments:

      Is there any specific reason for conducting these experiments exclusively in males?

      No, we usually use male rats for this type of experiment. We use both male and female rats for other studies that concern the effects of sex hormones, but in this case we performed experiments only in male rats.

      Page 13, Line 320: What is the duration of the bicuculline-induced effects?

      This information is included in the results section as follows:

      “Similarly, the ABD response duration was longer at the two most rostral locations (+0.6 mm = 17.6 ± 2.7 min; +0.8 = 17.1 ± 3.3 min) compared to the most caudal group (-0.2 mm = 2.4 ± 1.1 min; One-Way ANOVA p = 0.043; Tukey -0.2 mm vs +0.6 mm: p = 0.048; -0.2 mm vs +0.8 mm: p = 0.041; Figure 3E).”

      Page 16, Line 400: Is there a rationale for the high tidal volume (VT) observed in these animals? A baseline VT of 7 ml/kg appears notably elevated.

      Please note that rats were vagotomised and spontaneously breathing, hence the tidal volume is increased compared to non-vagotomised rats as seen in previous studies (Ouahchi et al., 2011).

      Figure 2D: Could you provide longer recordings? Additionally, incorporating diaphragm (Dia) recordings would enhance the interpretation of abdominal (Abd) recordings.

      Figure 3A shows a representative example of the 20-minute recordings for each location.

      Page 18, Line 458: Please rectify "Dunn: p , 0.001" to the appropriate format, perhaps "Dunn: p < 0.001."

      Thank you, edited.

    1. eLife assessment

      This important study addresses a fundamental question about how wing morphology and kinematics changed as insect species miniaturized. The authors found no significant correlation between body size and wing kinematics across eight hoverfly species, and instead argue that evolutionary changes in wing size and shape enabled flight in smaller species. However, although the integrative approach to animal biomechanics is strong, the evidence supporting the general conclusion that changes in wing morphology, rather than kinematics, correlate with miniaturization is incomplete and would benefit from more detailed biomechanical analysis and improved methods for phylogenetic comparison.

    2. Reviewer #1 (Public Review):

      Summary:

      In "Changes in wing morphology..." Le Roy et al. investigate the potential allometric scaling in wing morphology and wing kinematics in 8 different hoverfly species. Their study nicely combines different new and classic techniques, investigating flight in an important, yet understudied alternative pollinator. I want to emphasize that I have been asked to review this from a hoverfly biology perspective, as I do not work on flight kinematics. I will thus not review that part of the work.

      Strengths:

      The paper is well-written and the figures are well laid out. The methods are easy to follow, as are the rationale and logic for each experiment. The introduction sets the scene well, and the discussion is appropriate. The summary sentences throughout the text help the reader.

      Weaknesses:

      The ability to hover is described as useful for either feeding or mating. However, several of the North European species studied here would not use hovering for feeding, as they tend to land on the flowers that they feed from. I would therefore argue that the main selection pressure for hovering ability could be courtship and mating. If the authors disagree with this, they could back up their claims with the literature. On that note, a weakness of this paper is that the data for both sexes are merged. If we agree that hovering may be a sexually dimorphic behaviour, then merging flight dynamics from males and females could be an issue in the interpretation. I understand that separating males from females in the movies is difficult, but this could be addressed in the Discussion, to explain why you do not (or do) think that this could cause an issue in the interpretation.

      The flight arena is not very big. In my experience, it is very difficult to get hoverflies to fly properly in smaller spaces, and definitely almost impossible to get proper hovering. Do you have evidence that they were flying "normally" and not just bouncing between the walls? How long was each 'flight sequence'? You selected the parts with the slowest flight speed, presumably to get as close to hovering as possible, but how sure are you that this represented proper hovering and not a brief slowdown of thrust?

      Your 8 species are evolutionarily well-spaced, but as they were all selected from a similar habitat (your campus), their ecology is presumably very similar. Can this affect your interpretation of your data? I don't think all 6000 species of hoverflies could be said to have similar ecology - they live across too many different habitats. For example, on line 541 you say that wingbeat kinematics were stable across hoverfly species. Could this be caused by their similar habitat?

    3. Reviewer #2 (Public Review):

      Summary

      Le Roy et al quantify wing morphology and wing kinematics across eight hoverfly species that differ in body mass; the aim is to identify how weight support during hovering is ensured. Wing shape and relative wing size vary significantly with body mass, but wing kinematics are reported to be size-invariant. On the basis of these results, it is concluded that weight support is achieved solely through size-specific variations in wing morphology and that these changes enabled hoverflies to decrease in size throughout their phylogenetic history. Adjusting wing morphology may be preferable compared to the alternative strategy of altering wing kinematics, because kinematics may be under strong evolutionary and ecological constraints, dictated by the highly specialised flight and ecology of the hoverflies.

      Strengths

      The study deploys a vast array of challenging techniques, including flight experiments, morphometrics, phylogenetic analysis, and numerical simulations; it so illustrates both the power and beauty of an integrative approach to animal biomechanics. The question is well motivated, the methods appropriately designed, and the discussion elegantly and convincingly places the results in broad biomechanical, ecological, evolutionary, and comparative contexts.

      Weaknesses

      (1) In assessing evolutionary allometry, it is key to identify the variation expected from changes in size alone. The null hypothesis for wing morphology is well-defined (isometry), but the equivalent predictions for kinematic parameters remain unclear. Explicit and well-justified null hypotheses for the expected size-specific variation in angular velocity, angle-of-attack, stroke amplitude, and wingbeat frequency would substantially strengthen the paper, and clarify its evolutionary implications.

      (2) By relating the aerodynamic output force to wing morphology and kinematics, it is concluded that smaller hoverflies will find it more challenging to support their body mass - a scaling argument that provides the framework for this work. This hypothesis appears to stand in direct contrast to classic scaling theory, where the gravitational force is thought to present a bigger challenge for larger animals, due to their disadvantageous surface-to-volume ratios. The same problem ought to occur in hoverflies, for wing kinematics must ultimately be the result of the energy injected by the flight engine: muscle. Much like in terrestrial animals, equivalent weight support in flying animals thus requires a positive allometry of muscle force output. In other words, if a large hoverfly is able to generate the wing kinematics that suffice to support body weight, an isometrically smaller hoverfly should be, too (but not vice versa). Clarifying the relation between the scaling of muscle force input, wing kinematics, and weight support would resolve the conflict between these two contrasting hypotheses, and considerably strengthen the biomechanical motivation and interpretation.

      (3) The main conclusion - that evolutionary miniaturization is enabled by changes in wing morphology - is only weakly supported by the evidence. First, although wing morphology deviates from the null hypothesis of isometry, the difference is small, and hoverflies about an order of magnitude lighter than the smallest species included in the study exist. Including morphological data on these species, likely accessible through museum collections, would substantially enhance the confidence that size-specific variation in wing morphology occurs not only within medium-sized but also in the smallest hoverflies, and has thus indeed played a key role in evolutionary miniaturization. Second, although wing kinematics do not vary significantly with size, clear trends are visible; indeed, the numerical simulations revealed that weight support is only achieved if variations in wing beat frequency across species are included. A more critical discussion of both observations may render the main conclusions less clear-cut, but would provide a more balanced representation of the experimental and computational results.

      In many ways, this work provides a blueprint for work in evolutionary biomechanics; the breadth of both the methods and the discussion reflects outstanding scholarship. It also illustrates a key difficulty for the field: comparative data is challenging and time-consuming to procure, and behavioural parameters are characteristically noisy. Major methodological advances are needed to obtain data across large numbers of species that vary drastically in size with reasonable effort, so that statistically robust conclusions are possible.

    4. Reviewer #3 (Public Review):

      The paper by Le Roy and colleagues seeks to ask whether wing morphology or wing kinematics enable miniaturization in an interesting clade of agile flying insects. Isometry argues that insects cannot maintain both the same kinematics and the same wing morphology as body size changes. This raises a long-standing question of which varies allometrically. The authors do a deep dive into the morphology and kinematics of eight specific species across the hoverfly phylogeny. They show broadly that wing kinematics do not scale strongly with body size, but several parameters of wing morphology do in a manner different from isometry leading to the conclusion that these species have changed wing shape and size more than kinematics. The authors find no phylogenetic signal in the specific traits they analyze and conclude that they can therefore ignore phylogeny in the later analyses. They use both a quasi-steady simplification of flight aerodynamics and a series of CFD analyses to attribute specific components of wing shape and size to the variation in body size observed. However, the link to specific correlated evolution, and especially the suggestion of enabling or promoting miniaturization, is fraught and not as strongly supported by the available evidence.

      The aerodynamic and morphological data collection, modeling, and interpretation are very strong. The authors do an excellent job combining a highly interpretable quasi-steady model with CFD and geometric morphometrics. This allows them to directly parse out the effects of size, shape, and kinematics.

      Despite the lack of a relationship between wing kinematics and size, there is a large amount of kinematic variation across the species and individual wing strokes. The absolute differences in Figure 3F-I could have a very large impact on force production, but they indeed do not seem to change with body size. This is quite interesting and is supported by aerodynamic analyses.

      The authors switch between analyzing their data based on individuals and based on species. This creates some pseudoreplication concerns in Figures 4 and S2 and it is confusing why the analysis approach is not consistent between Figures 4 and 5. In general, the trends appear to be robust to this, although the presence of one much larger species weighs the regressions heavily. Care should be taken in interpreting the statistical results that mix intra- and inter-specific variation in the same trend.
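
      One common way to avoid mixing intra- and inter-specific variation is to collapse the individuals to one point per species before fitting the allometric regression. The sketch below illustrates this with hypothetical data and hypothetical function names; it is not the authors' analysis, and it does not include any phylogenetic correction.

```python
from collections import defaultdict
from math import log10


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def species_means(records):
    """Collapse (species, mass, trait) records to one log-scale mean per species."""
    grouped = defaultdict(list)
    for species, mass, trait in records:
        grouped[species].append((log10(mass), log10(trait)))
    xs, ys = [], []
    for points in grouped.values():
        xs.append(sum(p[0] for p in points) / len(points))
        ys.append(sum(p[1] for p in points) / len(points))
    return xs, ys


# Hypothetical data: (species, body mass in mg, wing length in mm).
records = [
    ("sp_A", 10.0, 5.0), ("sp_A", 12.0, 5.4), ("sp_A", 11.0, 5.1),
    ("sp_B", 40.0, 8.1), ("sp_B", 44.0, 8.3),
    ("sp_C", 160.0, 12.9),
]

x, y = species_means(records)
slope = ols_slope(x, y)  # allometric exponent estimated from species means only
```

      With this approach the sample size of the regression equals the number of species, not the number of individuals, which makes the limited statistical power explicit.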

      The authors based much of their analyses on the lack of a statistically significant phylogenetic signal. The statistical power for detecting such a signal is likely very weak with 8 species. Even if there is no phylogenetic signal in specific traits, that does not necessarily mean that there is no phylogenetic impact on the covariation between traits. Many comparative methods can test the association of two traits across a phylogeny (e.g. a phylogenetic GLM) and a phylogenetic PCA would test if the patterns of variation in shape are robust to phylogeny.

      The analysis of miniaturization on the broader phylogeny is incomplete. The conclusion that hoverflies tend towards smaller sizes is based on an ancestral state reconstruction. This is difficult to assess because of some important missing information. Specifically, such reconstructions depend on branch lengths and the model of evolution used, which were not specified. It was unclear how the tree was time-calibrated. Most often, ancestral state reconstructions utilize a maximum likelihood estimate based on a Brownian motion model of evolution, but this would be at odds with the hypothesis that the clade is miniaturizing over time. Indeed, such an analysis will be biased to look like it produces a lot of changes towards smaller body size if there is one very large taxon, because this will heavily weight the internal nodes. Even within this analysis, there is little quantitative support for the conclusion of miniaturization, and the discussion is restricted to a general statement about more recently diverged species. Such analyses are better supported by phylogenetic tests of directedness in the trait over time, such as fitting a model with an adaptive peak or others.

      Setting aside whether the clade as a whole tends towards smaller size, there is a further concern about the correlation of variation in wing morphology and changes in size (and the corresponding conclusion about lack of co-evolution in wing kinematics). Showing that there is a trend towards smaller size and a change in wing morphology does not test explicitly that these two are correlated with the phylogeny. Moreover, the subsample of species considered does not appear to recapitulate the miniaturization result of the larger ancestral state reconstruction.

      Given the limitations of the phylogenetic comparative methods presented, the authors did not fully support the general conclusion that changes in wing morphology, rather than kinematics, correlate with or enable miniaturization. The aerodynamic analysis across the 8 species does however hold significant value and the data support the conclusion as far as it extends to these 8 species. This is suggestive but not conclusive that the analysis of consistent kinematics and allometric morphology will extend across the group and extend to miniaturization. Nonetheless, hoverflies face many shared ecological pressures on performance and the authors summarize these well. The conclusions of morphological allometry and conserved kinematics are supported in this subset and point to a clade-wide pattern without having to support an explicit hypothesis about miniaturization.

      The data and analyses on these 8 species provide an important piece of work on a group of insects that are receiving growing attention for their interesting behaviors, accessibility, and ecologies. The conclusions about morphology vs. kinematics provide an important piece to a growing discussion of the different ways in which insects fly. Sometimes morphology varies, and sometimes kinematics depending on the clade, but it is clear that morphology plays a large role in this group. The discussion also relates to similar themes being investigated in other flying organisms. Given the limitations of the miniaturization analyses, the impact of this study will be limited to the general question of what promotes or at least correlates with evolutionary trends towards smaller body size and at what phylogenetic scale body size is systematically decreasing.

      In general, there is an important place for work that combines broad phylogenetic comparison of traits with more detailed mechanistic studies on a subset of species, but a lot of care has to be taken about how the conclusions generalize. In this case, since the miniaturization trend does not extend to the 8 species subsample of the phylogeny and is only minimally supported in the broader phylogeny, the paper warrants a narrower conclusion about the connection between conserved kinematics and shared life history/ecology.

    5. Author response:

      We thank the reviewers for their highly valuable comments and recommendations on our manuscript. We particularly appreciate receiving reviews from three distinct points of view, all highly relevant to our study (i.e. from an ecological, biomechanics, and evolutionary biology perspective).

      We will now carefully address all reviewer comments and questions, and resubmit a revised version in due time. Again, we thank the reviewers for their rigorous assessment of our study, which will greatly help us improve our manuscript.

    1. eLife assessment

      This article reports an important bioluminescence-based reporter system to evaluate kinase conformations. This assay is applied to four different kinases that have unique, very special regulatory features, thereby indicating that the assay can be used to provide convincing evidence on the conformational state of a large number of kinases. This paper will be of interest to researchers working on kinases and their conformational states.

    2. Reviewer #1 (Public Review):

      Summary:

      This technical report by Kugler et al. expands the application of a fluorescence-based reporter to study the conformational state of various kinases. This reporter, named KinCon (Kinase Conformation), interrogates the conformational state of a kinase (i.e., active vs. inactive) based on engineering complementary fusion proteins that fluoresce upon interaction. This assay has several advantages as it allows studying full-length kinases, that is, the kinase domain and regulatory domains, inside the cell and under various experimental conditions such as the presence of inhibitors or activator proteins, and in wildtype and mutants involved in disease states.

      Strengths:

      One major strength of this study is that it is quite comprehensive. The authors use KinCon for four different kinases, BRAF, LKB1, RIP and CDK4/6. These kinases have very different regulatory elements and associated proteins, which the authors explore to study their conformational state. Moreover, they use small molecule inhibitors or mutations to further dissect the conformational state of the kinase in disease states. The collective set of results strongly suggests that KinCon is a versatile tool that can be used to study many kinases of biomedical and fundamental importance. Given that kinases are extensively studied by researchers in academia or industry, KinCon could have a broad impact as well.

      Weaknesses:

      This manuscript, however, also has several weaknesses that I outline below. These weaknesses decrease the overall level of impact of the manuscript, as is.

      • The manuscript is exceedingly long. For instance, the introduction provides background information for each kinase that is further expanded in the results section. I think the background information for each kinase in the Introduction and Results sections can be significantly reduced to highlight the major points. Otherwise, not only does the manuscript become too long, but also the main points get diluted.

      • Similarly, the figure legends are very long, providing information that is already in the main text or in Methods. The authors should provide the essential information to understand the figure.

      • A major concern throughout the manuscript is the use of the word "dynamics," which is used in the text in various contexts. The authors should clarify what they understand by dynamics of conformation. Are they measuring the time-dependent process by which the kinase interconverts between active and inactive states? It seems to me that the assays in this report evaluate a population of kinases that are in an open or closed conformation (i.e., a particular state in each experimental condition), but there is no direct information on how the kinase goes from one state to the other. In that sense, the use of dynamics is unclear. Also, the use of dynamics in different sentences is ambiguous. Here are a few examples, but this should be revised throughout the manuscript:

      - Line 27: dynamics of full-length protein kinases. Does this refer to the dynamics of conformational interconversion between inactive and active states?

      - Line 138: dynamic functioning of kinases. Not clear what that means.

      - Line 276: ... alters KinCon dynamics. Not clear if they are measuring a time-dependent process or a single point.

      - Figure legend 4F: dynamics of CDK4/6 reporters. Again, not clear how the assay is measuring dynamics.

      Nonetheless, in my opinion the authors use proper terminology that describes their assay in which the term dynamics is not used: Title (... impact of protein and small molecule interactions on kinase conformations) and Line 89 (... reporter can be used to track conformational changes of kinases...)

      • The authors state that KinCon has predictive capabilities (abstract and line 142). What do the authors mean by this?

      • The authors indicate that KinCon is a highly sensitive assay. Can the authors elaborate on what high sensitivity means? For example, can they discuss how other fluorescence-based approaches that are less sensitive would not be able to accomplish the same type of results or derive similar conclusions? Can they provide a resolution metric both in space and time? Given that the authors state that this is a technical report, this information is of relevance.

      • The authors nicely describe how KinCon works in Figure 1B and part of 1C. I do think that the bottom of panel 1C needs to be revised, as well as the text describing the potential scenarios of potency, efficacy and synergism.

      - One issue with this part of Figure 1C is that it is not clear what the x-axis in the 3 plots refers to. Is this time? Is this the concentration of a small molecule, inhibitor or binding partner? This was confusing also in the context of the term dynamics used throughout the text. The terms potency, efficacy and synergism should be subtitles, or the panels and the x-axis should be better defined, especially for a non-specialized reader.

      - Related to this part of Figure 1C is the text. The authors mention potency, effectiveness and synergy (Line 195). Can the authors use more fundamental terminology related to these three scenarios, for example, changes in activation constant or percent of protein activated? Also, why is synergy only related to effectiveness? Can synergy also be associated with potency?

      - Lastly, the use of these three cartoons gives the impression that the experimental results to come will follow a similar representation. Instead, the results are presented in bar plots for many different conditions. I think this will lead to confusion for a broad audience.

      • For a non-expert reader, can the authors clarify the use of tracking basal conformations vs. transient over-expression of the various KinCon constructs? Moreover, the authors use the term transient over-expression for 10, 16, 24 and 48 h (Line 203). This, to a non-expert reader, seems not transient.

      • Regarding Figure 1E and similar graphical representations: Why is the signal (RLU) non-linear with time? If the fluorescence of the KinCon construct is linearly related with its expression or concentration inside the cell, one would expect a linear increase. Have the authors plotted RLU/Expression band intensity to account for changes in protein concentration? For instance, some of the results within Figure 3 are normalized to concentration on the reporter expression level.

      • For the results with LKB1, the authors claim that intermediate fold change in fluorescence (Figure 2E) is due to a partially closed intermediate state (Line 262). Can the authors discard the possibility by which there is a change in populations of active and inactive that on average give intermediate values?

      • The authors claim in Line 274 that mutations located at the interface of the LKB1/STRADalpha complex affect interactions and hypothesize that allosteric communication between LKB1 and STRADalpha is essential for function. Given that these mutations are at the interaction interface, why would the authors postulate an allosteric mechanism that evokes an effect distant from the interaction/active site? Could it be that function requires surface contacts alone that are disrupted by the mutations?

      • I was unable to find text to explain the following: Figure 2I shows the mutation R74A as n.s., but in the text only W308C is mentioned to not change fluorescence. Could the authors clarify why R74A is not discussed in the text? Maybe this reviewer missed the text in which it was discussed. Similarly, the author states in line 326 that the study included an analysis of RIPK2. However, I was unable to find results, graphs or additional text discussing RIPK2.

      • Some figures of RLU use absolute values, percentages and fold change. Is there a reason why the authors use different Y-axis values? These should be explained and justified in Methods. Similarly, bars for wt in Figures 3D, G, or 4D, E,F show no errors. How are the authors normalizing the data and repeats so that there is no error, and are they treating the rest of the data (i.e., mutants and/or treated with small molecules) in the same way?

      • Lastly, the section starting in Line 472 reads more like a discussion of results from the different types of inhibitors used in this study than results on its own. The authors should consider a new subtitle as results or make this section a discussion.

    3. Reviewer #2 (Public Review):

      Summary:

      Protein kinases have been very successfully targeted with small molecules for several decades, with many compounds (including clinical drugs) bringing about conformational changes that are also relevant to broader interactions with the cellular signaling networks that they control. The authors set out to develop a targeted biosensor approach to evaluate distinct kinase conformations in cells for multiple kinases in the context of incoming signals, other proteins and small molecule binding, with a broad goal of using the KinCon assay to confirm (and perhaps predict) how drug binding or signal perception changes conformations and outputs in the presence of cellular complexes; this work will likely impact on the field with cellular reporters of kinase conformations a useful addition to the toolbox.

      Strengths:

      The KinCon reporter platform has previously been validated for well-known kinases; in this study, the team evaluate how to employ a full-length kinase (often containing a known pathological mutation). The sensitive detection method is based on a Renilla luciferase (RLuc) protein fragment complementation assay, where individual RLuc fragments are present at the N and the C terminus of the kinase. This report, which is both technical and practical in nature, co-expresses the kinase with known interactors (at low levels) in a high throughput format and then performs pharmacological evaluation with known small molecule kinase modulators. This is explained nicely in Figure 1, as are the signaling pathways that are being evaluated. Data demonstrate that V600E BRAF exposed to vemurafenib is converted to the inactive conformation, as expected. In contrast, the more closed STRAD𝛼 and LKB1 KinCon conformations appear to represent the more active state of the complexed kinase, and a W308C mutation (evaluated alongside others) reverses this effect. The authors then evaluated necroptotic signaling in the context of RIPK1/3 under conditions where RIPK1 and RIPK3 are active, confirming that the reporters highlight the active states of both kinases. Exposure to compounds that are known to engage with the RIPK1 arm of the pathway induces bioluminescence changes consistent with the opening (inactivation) of the kinase. Finally, the authors move to an important drug target for which clinical drugs have arrived relatively recently: the CDK4/6 complexes. These are of additional importance because kinase-independent functions also exist for CDK6, and the effects of drugs in cells usually rely on a downstream marker, rather than demonstration of direct protein complex engagement. The data presented are interpreted as the formation of complexes with the CDK inhibitor p16INK4a; reducing the affinity of the interaction through mutations drives an inactive conformation, whilst the application of CDK4/6 inhibitors does not, implying binding to the active conformation.

      Weaknesses:

      (1) The work is very solid, and uses examples from the literature and also extends into new experimental space. An obvious weakness is mentioned by the authors for the CDK data, in that measurements with Cyclin D (the activating subunit) are not characterised, although Cyclin D might be assumed to be present?

      (2) The work with the trimeric LKB1 complex involves the pseudokinase STRADalpha, whose conformation is also examined as a function of LKB1 status; since STRAD is an activator of LKB1, a future goal should be the evaluation of the complex in the presence of STRAD inhibitory/activating small molecules.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank you and the two Reviewers for the thoughtful evaluation of the manuscript and the support for publication. We have addressed all points raised by the two Reviewers.

      - We have extensively streamlined the manuscript. Repetitive passages regarding the respective kinase cascades have been removed.

      - We improved the presentation of the main Figures (mainly labeling and font size):

      - Figure 1: C, D, E, F

      - Figure 2: C, E, F, G, I

      - Figure 3: D

      - Figure 4: F

      - Figure 5: A, B, C, D, E

      - We integrated new SI-data related to kinase functions, expression and the ‘cell-type comparisons’ of the KinCon reporter system (Figure Supplement 4, 5).

      Below you will find a detailed point-by-point response.

      Reviewer #1 (Recommendations For The Authors):

      Regarding the issue of the use of the word "dynamics," as described in the public review, here are a few examples of ambiguous use in different sentences:

      - Line 27: dynamics of full-length protein kinases. Is this referring to the dynamics of conformational interconversion between inactive and active states?

      - Line 138: dynamic functioning of kinases. It is not clear what this means.

      - Line 276: ... alters KinCon dynamics. Not clear if they are measuring a time-dependent process or a single point.

      - Figure legend 4F: dynamics of CDK4/6 reporters. Again, not clear how the assay is measuring dynamics.

      In my opinion, the authors use proper terminology that describes their assay in which the term dynamics is not used: Title: "... impact of protein and small molecule interactions on kinase conformations" and Line 89 "... reporter can be used to track conformational changes of kinases...".

      We have replaced the “dynamics” sections. 

      - Line 27: The understanding of the structural dynamics of…

      - Line 91: This reporter can be used to track dynamic changes of kinases conformations…

      - Line 139: Conventional methods often fall short in capturing the dynamics of kinases within their native cellular environments…

      - Line 146: Such insights into the molecular structure dynamics of kinases in intact cells…

      - Line 199: In order to enhance our understanding of kinase structure dynamics…

      - Line 276: These findings underline that indeed the trimeric complex formation alters….

      - Figure Legend 4F: Quantification of alterations of CDK4/6 KinCon reporter bioluminescence signals…

      The authors state that KinCon has predictive capabilities (abstract and line 142). What do the authors mean by this?

      Previously we have benchmarked the suitability of the KinCon reporter for target engagement assays of wt and mutated kinase activities. With this we determined specificities of melanoma drugs for mutated BRAF variants (Mayrhofer 2020, PNAS). 

      The authors indicate that KinCon is a highly sensitive assay. Can the authors elaborate on what high sensitivity means?  

      With sensitivity we mean that we can detect conformation dynamics of the reporter at low expression levels of the hybrid protein expressed in the cell line of choice.

      - Line 209: Immunoblotting of cell lysates following luminescence measurements showed expression levels of the reporters in the range and below the endogenous expressed kinases (Figure 1E).  …

      - Line 219:   Using this readout, we showed that at expression levels of the BRAF KinCon reporter below the immunoblotting detection limit, one hour of drug exposure exclusively converted BRAF-V600E to the more closed conformation (Figure 1F, G, Figure Supplement 1B). 

      - Line 221: These data underline that at expression levels far below the endogenous kinase, protein activity conformations can be tracked in intact cells. …

      For example, can they discuss how other fluorescence-based approaches that are less sensitive would not be able to accomplish the same type of results or derive similar conclusions? Can they provide a resolution metric both in space and time? Given that the authors state that this is a technical report, this information is of relevance.

      We highlight the key pros & cons of the KinCon reporter technology in following sections:

      - Line 529: The KinCon technology, introduced here, seeks to address the previously mentioned challenges. It has the potential to become a valuable asset for tracking kinase functions in living cells which are hard to measure solely via phosphotransferase activities. Overall, it offers an innovative solution for understanding kinase activity conformations, which could pave the way for more novel intervention strategies for kinase entities with limited pharmaceutical targeting potential. So far, this relates to the tracking of kinase-scaffold and pseudo-kinase functions.

      - Line 535: A key advantage of the KinCon reporter technology is the robustness of the system in tracking kinase conformations at varying expression levels. However, in contrast to fluorescence-based reporter read-outs, subcellular analysis and cell sorting are still challenging due to comparably low levels of light emission.

      The authors nicely describe how KinCon works in Figure 1B and part of 1C. I do think that the bottom of panel 1C needs to be revised, as well as the text describing the potential scenarios of potency, efficacy, and synergism.

      One issue with this part of Figure 1C is that it is not clear what the x-axis in the 3 plots refers to. Is this time? Is this concentration of a small molecule, inhibitor, or binding partner? This was confusing also in the context of the term dynamics used throughout the text. The terms potency, efficacy, and synergism should be subtitles, or the panels and the x-axis should be better defined, especially for a non-specialized reader.

      Related to this part of Figure 1C is the text. The authors mention potency, effectiveness, and synergy (Line 195). Can the authors use more fundamental terminology related to these three scenarios, for example, changes in activation constant, and percent of protein activates? Also, why synergy is only related to effectiveness? Can synergy also be associated with potency?

      Thank you for bringing this up, we have revised Figure 1C to better reflect the mentioned effects of potency. To avoid confusion, we removed the illustration for drug synergism. Accordingly, we have integrated the axis descriptions for the presented dose-response curves.   

      Thus, we have further streamlined the text in the introduction – examples are shown below:

      - Line 195: Light recordings and subsequent calculations of time-dependent dosage variations of bioluminescence signatures of parallel implemented KinCon configurations aid in establishing dose-response curves. These curves are used for discerning pharmacological characteristics such as drug potency, effectiveness of drug candidates, and potential drug synergies (Figure 1C)

      - Figure 1C:  Shown is the workflow for the KinCon reporter construct engineering and analyses using KinCon technology. The kinase gene of interest is inserted into the multiple cloning site of a mammalian expression vector which is flanked by respective PCA fragments (-F[1], -F[2]) and separated with interjacent flexible linkers. Expression of the genetically encoded reporter in indicated multi-well formats allows to vary expression levels and define a coherent drug treatment plan. Moreover, it is possible to alter the kinase sequence (mutations) or to co-express or knock-down the respective endogenous kinase, interlinked kinases or proteinogenic regulators of the respective pathway. After systematic administration of pathway modulating drugs or drug candidates, analyses of KinCon structure dynamics may reveal alterations in potency, efficacy, and potential synergistic effects of the tested bioactive small molecules (schematic dose response curves are depicted)
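
      The distinction between potency and efficacy in such schematic dose-response curves can be illustrated with a four-parameter Hill (logistic) function. This is a minimal sketch with hypothetical parameter values and a hypothetical function name; it is not the fitting procedure used in the manuscript. Here potency corresponds to the half-maximal concentration (`ec50`) and efficacy to the maximal response (`top`).

```python
def hill_response(dose, bottom, top, ec50, hill=1.0):
    """Four-parameter logistic (Hill) dose-response value at `dose` (> 0)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / dose) ** hill)


doses = [0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical concentrations (uM)

# Hypothetical compounds with the same efficacy but different potency ...
potent = [hill_response(d, 1.0, 3.0, ec50=0.1) for d in doses]
weak = [hill_response(d, 1.0, 3.0, ec50=10.0) for d in doses]
# ... and with the same potency but lower efficacy (smaller maximal effect).
partial = [hill_response(d, 1.0, 2.0, ec50=0.1) for d in doses]
```

      A change in potency shifts the curve along the concentration axis, whereas a change in efficacy changes the plateau reached at saturating concentrations.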

      Lastly, the use of these three cartoons gives the impression that the experimental results to come will follow a similar representation. Instead, the results are presented in bar plots for many different conditions. I think this will lead to confusion for a broad audience.

      The bottom panel of Figure 1C is not a depiction of real experiments but rather an illustration of fitted dose-response curves. We would like to refer to previous demonstrations of dose-response curves using BRAF KinCon data and ERK phosphorylation (Röck 2019, Sci. Advances).

      We further agree with the reviewer and have therefore added a new part in the methods section addressing the evaluation of data extensively. 

      - Line 668: In Figure 1E and F, a representative experiment of n=4 independent experiments is shown. In these cases, absolute bioluminescence values without any normalization are shown. Otherwise, data are indicated as RLU (relative light unit) fold change. This means the data were normalized to the indicated control condition (either with normalization to the western blot or without, as indicated).
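
      The fold-change normalization described here can be sketched as follows. The sketch assumes that each independent experiment is normalized to its own control reading, which the quoted Methods text does not state explicitly; the RLU values and the function name are hypothetical.

```python
from statistics import mean, stdev


def fold_change(raw, control):
    """Normalize each experiment's RLU readings to its own control reading.

    `raw` is a list of dicts, one per independent experiment, mapping
    condition name -> RLU. Returns condition -> list of fold changes.
    """
    conditions = raw[0].keys()
    return {c: [exp[c] / exp[control] for exp in raw] for c in conditions}


# Hypothetical RLU readings from n = 4 independent experiments.
raw = [
    {"wt": 1000.0, "mutant": 2100.0},
    {"wt": 1500.0, "mutant": 2900.0},
    {"wt": 800.0, "mutant": 1800.0},
    {"wt": 1200.0, "mutant": 2300.0},
]

normalized = fold_change(raw, control="wt")
wt_spread = stdev(normalized["wt"])          # control is 1.0 in every experiment
mutant_mean = mean(normalized["mutant"])
mutant_spread = stdev(normalized["mutant"])  # non-control conditions keep a spread
```

      Under per-experiment normalization the control condition is fixed at 1 in every replicate, which would explain control bars without error bars while the other conditions retain their variability.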

      For a non-expert reader, can the authors clarify the use of tracking basal conformations vs. transient over-expression of the various KinCon constructs? Moreover, the authors use the term transient over-expression for 10, 16, 24, and 48 h (Line 203). This, to a non-expert reader, does not seem transient.

      We have revised the manuscript to clarify it:

      - Line 207: We showed that transient over-expression of these KinCon reporters for a time frame of 10h, 16h, 24h or 48h in HEK293T cells delivers consistently increasing signals for all KinCon reporters (Figure 1E, Figure Supplement 1A). 

      - Figure 1E) Representative KinCon experiments of time-dependent expressions of indicated KinCon reporter constructs in HEK293T cells are shown (mean ±SEM). Indicated KinCon reporters were transiently over-expressed in 24-well format in HEK293T cells for 10h, 16h, 24h and 48h each.

      Regarding Figure 1E and similar graphical representations: Why is the signal (RLU) nonlinear with time? If the fluorescence of the KinCon construct is linearly related to its expression or concentration inside the cell, one would expect a linear increase. Have the authors plotted RLU/Expression band intensity to account for changes in protein concentration? For instance, some of the results within Figure 3 are normalized to concentration on reporter expression level.

      Our intention was to show that varying expression levels can be used for the illustrated target engagement assays. Indeed, the represented elevations of RLU might be due to factors such as:

      - Doubling times of cells

      - Cell density

      - Media composition (which changes over time)

      - Reporter protein stabilities

      - Abundance of interactors of kinases

      For the results with LKB1, the authors claim that intermediate fold change in fluorescence (Figure 2E) is due to a partially closed intermediate state (Line 262). Can the authors discard the possibility by which there is a change in populations of active and inactive that on average give intermediate values?

      Based on our experience with KinCon reporter conformation states of kinases we tested so far, we assume that the presented data reflects an intermediate state. We agree that it needs further validation. We have changed the text accordingly:

      - Line 264: Upon interaction with LKB1 this conformation shifts to a partially closed intermediate state.

      The authors claim in Line 274 that mutations located at the interface of the LKB1/STRADalpha complex affect interactions and hypothesize that allosteric communication between LKB1 and STRADalpha is essential for function. Given that these mutations are at the interaction interface, why would the authors postulate an allosteric mechanism that evokes an effect distant from the interaction/active site? Could it be that function requires surface contacts alone that are disrupted by the mutations?

      We agree with the reviewer and changed our argumentation for this point:

      - Line 276: These findings underline that indeed the trimeric complex formation alters the opening and closing of the tested full-length kinase structures using the applied KinCon reporter read out

      I was unable to find text to explain the following: Figure 2I shows the mutation R74A as n.s., but in the text, only W308C is mentioned to not change fluorescence. Could the authors clarify why R74A is not discussed in the text?  Maybe this reviewer missed the text in which it was discussed.

      We adapted the manuscript and now include the R74A mutation as follows:

      - Line 296: Among these mutations, only the W308C and R74A mutation prevented significant closing of the LKB1 conformation when co-expressed with STRAD𝛼 and MO25 (Figure 2I).

      In Figure 2I, the individual measurements of the LKB1-R74A KinCon are highlighted in red to better emphasize the deviations. In the case of the R74A mutation, the effect seen might be due to the high deviation between the experiments. These deviations are much higher than for either the wt or the W308C mutant, and can also be seen in the LKB1-R74A-KinCon only condition (white). Even though no significant closing of the LKB1 conformation could be observed for R74A, the trend towards closing upon complex formation is still visible, so we believe that the effect is still present. Further replicates would be necessary to validate this hypothesis.

      Similarly, the authors state in line 326 that the study included an analysis of RIPK2. However, I was unable to find results, graphs, or additional text discussing RIPK2.

      The RIPK2 conformation was analyzed in Figure 3C (page 12).

      Some figures of RLU use absolute values, percentages, and fold change. Is there a reason why the authors use different Y-axis values? These should be explained and justified in Methods. Similarly, bars for wt in Figures 3D, G, or 4D, E, F show no errors. How are the authors normalizing the data and repeats so that there is no error, and are they treating the rest of the data (i.e., mutants and/or treated with small molecules) in the same way?

      We have changed the Y-axis values. Throughout the manuscript we now show RLU fold change. The exceptions are selected experiments in which only absolute RLU values are shown (such as Figure 1E, F). We have also integrated a paragraph into the methods section (Line 655). Figure 3D was changed as well.

      - Line 668: In Figure 1 E and F, a representative experiment of n=4 independent experiments is shown. In these cases absolute bioluminescence values without any normalisation are shown. Otherwise, data was indicated as RLU fold change. This means the data was normalized on the indicated control condition (either with normalization of the western blot or without; as indicated).

      The data are generally normalized to the wt condition or, when the cells were treated with small molecules for target engagement assays, to the untreated condition.
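
      The normalization described above can be illustrated with a minimal sketch. It assumes that each independent experiment is normalized to its own control reading; the response does not state this explicitly, so it is an assumption, as are the function name and the readings, which are hypothetical and not data from the study. Under that assumption the control equals 1 in every replicate, which would explain a control bar without an error bar.

```python
from statistics import mean

def fold_change_per_experiment(experiments, control):
    """Normalize every condition to the control reading of the same experiment.

    `experiments` is a list of {condition: RLU} dicts, one per independent run.
    Assumption: per-experiment normalization (not confirmed by the source).
    """
    normalized = []
    for readings in experiments:
        reference = readings[control]
        if reference == 0:
            raise ValueError("control reading is zero; fold change is undefined")
        normalized.append({name: value / reference for name, value in readings.items()})
    return normalized

# Hypothetical raw RLU readings from three independent runs.
experiments = [
    {"wt": 1000.0, "mutant": 2000.0},
    {"wt": 1200.0, "mutant": 3000.0},
    {"wt": 800.0, "mutant": 1200.0},
]

normalized = fold_change_per_experiment(experiments, control="wt")
wt_values = [run["wt"] for run in normalized]          # control is 1 in every run
mutant_values = [run["mutant"] for run in normalized]  # spread is preserved
mutant_mean = mean(mutant_values)
```

      With this scheme the variability of the control is removed by construction, so only the non-control conditions carry an error estimate.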

      Lastly, the section starting in Line 472 reads more like a discussion of results from different types of inhibitors used in this study than results on its own. The authors should consider a new subtitle such as results or make this section a discussion.

      We agree with the reviewer, and this part was split off into a new section of the results:

      - Line 455: “Effect of different kinase inhibitor types on the KinCon reporter system”.

      Reviewer #2 (Recommendations For The Authors):

      I have a few suggestions, since the paper is a distillation of a vast amount of work and tells a useful story.

      (1) The work is very solid, uses examples from the literature, and also extends into new experimental space. An obvious weakness is mentioned by the authors for the CDK data, in that measurements with Cyclin D (the activating subunit) are not characterized, although Cyclin D might be assumed to be present.

      We performed experiments with the CDK4/6 KinCon reporters and co-expressed CyclinD at a ratio of 1:3 (HEK293T cells, expression for 48h). However, in these initial experiments we could not track conformation changes in the context of inhibitor treatments. The cells were treated with the indicated CDK4/6i [1µM] for 3h. This did not seem to affect the conformation of CDK4/6 wt or mutated KinCon reporters. There is a tendency for CyclinD co-expression to promote CDK4/6 conformation opening (data not shown).

      Author response image 1.

      Bioluminescence signal of CDK4/6 KinCon reporters with co-expressed CyclinD3 (HEK293T, expression for 48h) upon exposure to indicated CDK4/6i [1µM] or DMSO for 3h (mean ±SEM, n=3 ind. experiments). No significant changes using the current setting.

      (2) The work with the trimeric LKB1 complex involves pseudokinase, STRADalpha, whose conformation is also examined as a function of LKB1 status; since STRAD is an activator of LKB1. A future goal should be the evaluation of the complex in the presence of STRAD inhibitory/activating small molecules.

      Thank you for this great idea. We are currently compiling an FWF grant application to get support for such an R&D project.

      Minor points

      • Have any of the data been repeated in a different cell background? This came to mind because HeLa cells lack LKB1, which might be a useful place to test the LKB1 data in a different context.

      This experiment was performed and is shown in Figure Supplement 5. Further, we followed the advice of the reviewer and performed the suggested experiments. We integrated the colon cancer cell line SW480 into the experimental setup. Overall, three cell settings showed the same pattern of KinCon reporter analyses for LKB1-STRADα-MO25 complex formation utilizing the LKB1- and STRADα-KinCon reporters.

      • The study picks up the PKA Cushings Syndrome field, which makes sense, and data are presented for L206R. PMID 35830806 explains how different patient mutations drive different signaling outcomes through distinct complex formations, and it would be interesting to discuss how mutations in KinCon complexes, especially those with mutations, could affect sub-cellular localization. Could the authors explain if this was done for any of the proteins, whose low experimental expression is a clear advantage, but is presumably hard to maintain across experiments?

      The feedback of the reviewer motivated us to perform subcellular fractionation experiments. They were performed with PKAc wt and L206R KinCon reporters as well as BRAF wt and V600E reporters. We were not able to see major differences between the wt and mutated reporter constructs with respect to their nuclear and cytoplasmic localization (Figure Supplement 4). For your information, in an R&D project with the mitochondrial kinase PINK1 we see localization of the reporter, as expected, almost exclusively in the mitochondrial fraction.

      - Line 495: In this context of activating kinase mutations we showed that using PKAc (wt and L206R) and BRAF (wt and V600E) reporters as example we could not track alterations of cytoplasmic and nuclear localization (Figure Supplement 4). Furthermore, subcellular localization of PKAc KinCon reporters did not change when L206R mutant was introduced (Figure Supplement 4). As a control BRAF wt and V600E KinCon reporters were used and also no changes in localization was observed.

      • I suggest changing PMs (Figure 2 and others) simply to mutation, I read this as plasma membrane constantly.

      We agree and we have changed it to “patient mutation” in Figure 2C, Figure 3E, Figure 4B.

    1. eLife assessment

      This study presents a predictive scoring system in DLBCL based on the expression of three tumour microenvironment-related genes. Such a scoring system seems useful for predicting tumour purity levels in DLBCL. The provided evidence showing an association between worse DLBCL prognosis and high-risk score is solid, but it is incomplete to draw a clear conclusion about the links between risk score and drug sensitivity.

    2. Reviewer #2 (Public Review):

      In this study, Zhenbang Ye and colleagues investigate the links between microenvironment signatures, gene expression profiles, and prognosis in diffuse large B-cell lymphoma (DLBCL). They show that increased tumor purity (ie, a higher proportion of tumor cells relative to surrounding stromal components) is associated with worse prognosis. They then show that three genes associated with tumor purity (VCAN, CD3G, and C1QB) correlate with patterns of immune cell infiltration and can be used to create a risk scoring system that predicts prognosis, which can be replicated by immunohistochemistry (IHC), and response to some therapies.

      (1) The two strengths of the study are its relatively large sample size (n = 190) and the strong prognostic significance of the risk scoring system. It is worth noting that the validation of this scoring with IHC, a simple technique already routinely used for the diagnosis and classification of DLBCL, increases the potential for clinical translation. However, the correlative nature of the study limits the conclusions that can be drawn in regards to links between the risk scoring system, the tumor microenvironment, and the biology of DLBCL.

      (2) The tumor microenvironment has been extensively studied in DLBCL and a prognostic implication has already been established (for instance, Steen et al., Cancer Cell, 2021). In addition, associations have already been established in non-Hodgkin lymphoma between prognosis and expression of C1QB (Rapier-Sharman et al., Journal of Bioinformatics and Systems Biology, 2022), VCAN (S. Hu et al., Blood, 2013), and CD3G (Chen et al., Medical Oncology, 2022). Nevertheless, one of the strengths and novel aspects of the study is the combination of these 3 genes into a risk score that is also valid by immunohistochemistry (IHC), which substantially facilitates a potential clinical translation.

      (3) Figures 1A-B: tumor purity is calculated using the ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data) algorithm (Yoshihara et al., Nature Communications, 2013). The ESTIMATE algorithm is based on two gene signatures ("stromal" and "immune"). It is therefore expected that tumor purity measured by the ESTIMATE algorithm will correlate with the expression of multiple genes. Importantly, C1QB is included in the stromal signature of the ESTIMATE algorithm meaning that, by definition, it will be correlated with tumor purity in that setting.

      (4) Figure 2A: as established in figure 1C, high tumor purity is associated with worse prognosis. Later in the manuscript, it is also shown that C1QB expression is associated with worse prognosis. However, figure 2A shows that C1QB is associated with decreased tumor purity. It therefore makes it less likely that the prognostic role of C1QB expression is related to its impact on tumor purity. The prognostic impact could be related to different patterns of immune cell infiltration, as shown later. However, the evidence presented in the study is correlative in nature and not sufficient to draw this conclusion.

      (5) Figure 3G: although there is a strong prognostic implication of the risk score on prognosis, the correlation between the risk score and tumor purity is significant but not very strong (R = 0.376). It is therefore likely that other important biological factors explain the correlation between the risk score and prognosis, as suggested in the gene set enrichment analysis that is later performed.
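
      The size of this correlation can be put into perspective with a short calculation. The sketch below assumes that R is a Pearson correlation coefficient and that the relationship is linear (the comment does not say which coefficient was used); under that assumption, the square of R gives the share of variance in one variable that is linearly accounted for by the other.

```python
# Reported correlation between risk score and tumor purity (Figure 3G).
r = 0.376

# Assumption: Pearson correlation, so r squared is the share of variance
# linearly accounted for.
r_squared = r ** 2
unexplained = 1.0 - r_squared

print(f"variance accounted for: {r_squared:.1%}")
print(f"variance left unexplained: {unexplained:.1%}")
```

      Under these assumptions only about one seventh of the variance is shared, which is consistent with the reviewer's point that other biological factors are likely to contribute.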

      (6) Figure 6: the drug sensitivity analysis includes a wide range of established and investigational drugs with varied mechanisms of action. Although the difference in sensitivity between tumors with low and high risk scores shows statistical significance for certain drugs, the absolute difference appears small in most cases and is of unclear biological significance. In addition, even though the risk score is statistically related to drug sensitivity, there is no direct evidence that the differences in drug sensitivity are directly related to tumor purity.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The findings in this study are useful and may have practical implications for predicting DLBCL risk subject to further validating the bioinformatics outcomes. We found the approach and data analysis solid. However, some concerns regarding the drug sensitivity prediction and the links between the selected genes for the risk scores have been raised that need to be addressed by further functional works.

      Thank you for your high recognition of our study. We searched the treatment information of the DLBCL patients in our own cohort; unfortunately, all patients were treated strictly according to the guidelines issued by the authorities of China, which suit Chinese patients well but do not include the drugs explored in the present study. Therefore, further investigations should be designed and conducted to validate our conclusion. Here, we provide a possible direction for future studies based on large cohorts, which could not only yield more reliable conclusions but also draw more attention to the role of the tumor microenvironment in influencing outcome and drug sensitivity.

      Public Reviews:

      Sincere thanks to all reviewers for their positive comments on our study and their helpful recommendations for improving our manuscript. In this part, we have collated the comments and recommendations from all reviewers and made the corresponding revisions. Our responses are as follows.

      (1) How did we determine the three genes (VCAN, C1QB and CD3G) in the prognostic model?

      As mentioned in the “Prognostic model” part of the Materials and Methods section, the genes were selected with the “survival” package in R. After we obtained the nine genes, we input their expression values and analyzed them with the “survival” package. The “step” function in R can then optimize the model, that is, construct a model with as few factors as possible, so that the factors finally included are representative and show the least collinearity. In this way, the prognostic model we obtained could be more practical in clinical practice.
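
      The idea behind this kind of stepwise optimization can be sketched as follows. R's `step` selects models by AIC; the sketch below is a simplified backward elimination written in Python for illustration and is not the authors' code. The gene names and the toy AIC function are hypothetical stand-ins: in a real analysis each candidate model would be refitted (for example as a Cox regression) and its AIC taken from the fit.

```python
def backward_stepwise(candidates, aic):
    """Greedy backward elimination: drop one factor at a time while AIC improves."""
    selected = list(candidates)
    best = aic(selected)
    while len(selected) > 1:
        # Score every model obtained by removing exactly one factor.
        trials = [(aic([f for f in selected if f != drop]), drop) for drop in selected]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break  # no single removal lowers the AIC any further
        selected.remove(drop)
        best = trial_aic
    return selected, best

# Hypothetical per-gene contributions to the log-likelihood (not study data).
CONTRIBUTION = {"gene_a": 5.0, "gene_b": 3.0, "gene_c": 0.2, "gene_d": 0.5, "gene_e": 4.0}

def toy_aic(factors):
    """Toy stand-in for a refitted model: AIC = 2k - 2 * log-likelihood."""
    log_likelihood = sum(CONTRIBUTION[f] for f in factors)
    return 2 * len(factors) - 2 * log_likelihood

selected, best_aic = backward_stepwise(list(CONTRIBUTION), toy_aic)
print(selected, best_aic)
```

      In this toy setting a factor is dropped whenever its contribution to the log-likelihood is smaller than the AIC penalty for keeping it, which is how the procedure arrives at a model with fewer factors.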

      (2) Different centers have different protocols of IHC, so how could we put this model into clinical practice under this circumstance?

      Not only do different centers have different protocols; materials such as antibodies also vary. Therefore, there is still a long way to go in putting our study into clinical practice. In our view, there are at least three problems to solve. First, diagnostic antibodies, which usually show better specificity and sensitivity, should be used in clinical practice. This may be the reason why the staining of VCAN and C1QB was strong and difficult to differentiate. Second, a standardized protocol should be established. Last but not least, more precise analyses and studies should be conducted to clarify which cell types specifically express these genes (as mentioned by Reviewer #2). We are now endeavoring to solve these problems by utilizing as many techniques as possible, such as multi-omics and mIHC. From revealing the true expression pattern to developing high-quality antibodies and even a standardized test kit, we are looking forward to clinical translation.

      (3) The analyses about immune infiltration and the key genes in DLBCL were superficial, limited within the correlation analyses.

      Because the model was constructed based on the tumor purity of DLBCL, the risk score could be associated with the enrichment of cell functions. We conducted GSEA based on the differentially expressed genes between the high-risk and low-risk groups in the two datasets (Figure 5H-I). It showed that extracellular organization and cellular adhesion differed between the two groups, through which immune infiltration and activity might be regulated owing to the motility of immune cells. In addition, we validated the infiltration of M1 and M2 macrophages in our own cohort (Supplementary Figure 3P).

      (4) The drug sensitivity was only analyzed based on the model and should be validated in real-world research or laboratory studies. In addition, the sensitivity scores did not seem to differ much in most cases, even though the differences were statistically significant.

      We searched the treatment information of the DLBCL patients in our own cohort; unfortunately, all patients were treated strictly according to the guidelines issued by the authorities of China, which suit Chinese patients well but do not include the drugs explored in the present study. Therefore, further investigations should be designed and conducted to validate our conclusion. Here, we provide a possible direction for future studies based on large cohorts, which could not only yield more reliable conclusions but also draw more attention to the role of the tumor microenvironment in influencing outcome and drug sensitivity. As for the differences between the high- and low-risk groups, a small change in drug dose can sometimes have a large effect, because the dose-effect curve is usually nonlinear. Therefore, by reducing the dose, even by just 1%, adverse effects might be avoided. To sum up, the drug sensitivity analyses in our study could open up more possibilities for clinical trials and practice, and we are considering how to design reasonable clinical research.

      (5) C1QB was associated with decreased tumor purity and worse prognosis, but decreased tumor purity was related to better prognosis. How can this contradiction be explained?

      As discussed in the Discussion section, previous studies have revealed the role of C1QB in promoting an immunosuppressive microenvironment in cancer (see references 22-26). C1QB might recruit pro-tumor immune cells, which by itself would reduce tumor purity. However, the immune microenvironment is regulated by multiple factors that form a network and counteract or synergize with each other. Statistical analysis often reveals a possible phenomenon but cannot provide a mechanistic explanation. Therefore, more mechanistic studies are needed to reveal the connections and key nodes. This is exactly what we will explore next.

      (6) Others:

      (1) Line 51 has been rewritten.

      (2) References for the ESTIMATE algorithm (reference 16) and CD3G+ T cells (reference 17) have been added.

      (3) The illegible figure labels might be caused by the incompatibility between the PDF file we submitted and the submission system. We have provided the TIFF images in this revision, and the EPS file could be submitted to editors upon their requests.

      (4) A supplementary description has been added to the legend of Figure 6 to make it clearer.

      (5) In order to explore the expression of key genes among different locations of DLBCL, we performed the analyses in Figure 5 and Supplementary Figure 3. These results might be thought-provoking in that the tumor microenvironment differs among DLBCLs even though they share similar histological characteristics.

    1. eLife assessment

      This paper describes an important software framework for the curation, retrieval, and analysis of ancient human genomic data and their associated metadata, overcoming long-standing coordination and harmonization issues in ancient human genomics. The resource is built on compelling and sometimes exceptional principles of software engineering and reproducibility, and the authors make an excellent case that their resource will be of practical use to many researchers studying human history using DNA. The main issues include natural uncertainties regarding future funding and maintenance of this resource, as well as deviation from established standards in other areas of genomics.

    2. Reviewer #1 (Public Review):

      The authors describe a framework for working with genotype data and associated metadata, specifically geared towards ancient DNA. The Poseidon framework aims to address long-standing data coordination issues in ancient population genomics research. These issues can usefully be thought of as two primary, separate problems:

      (1) The genotype merging problem. Often, genotype calls made by a new study are not made publicly available, or they are only made available in an ad-hoc fashion without consistency in formatting between studies. Other users will typically want to combine genotypes from many previously published studies with their own newly produced genotypes, but a lack of coordination and standards means that this is challenging and time-consuming.

      (2) The metadata problem. All genomes need informative metadata to be usable in analyses, and this is even more true for ancient genomes which have temporal and often cultural dimensions to them. In the ancient DNA literature, metadata is often only made available in inconsistently formatted supplementary tables, such that reuse requires painstakingly digging through these to compile, curate and harmonise metadata across many studies.

      Poseidon aims to solve both of these problems at the same time, and additionally provide a bit of population genetics analysis functionality. The framework is a quite impressive effort, that clearly has taken a lot of work and thought. It displays a great deal of attention to important aspects of software engineering and reproducibility. How much usage it will receive beyond the authors themselves remains to be seen, as there is always a barrier to entry for any new sophisticated framework. But in any case, it clearly represents a useful contribution to the human ancient genomics community.

      The paper is quite straightforward in that it mainly describes the various features of the framework, both the way in which data and metadata are organised, and the various little software tools provided to interact with the data. This is all well-described and should serve as a useful introduction for any users of the framework, and I have no concerns with the presentation of the paper. Perhaps it gets a bit too detailed for my taste at times, but it's up to the authors how they want to write the paper.

      I thus have no serious concerns with the paper. I do have some thoughts and comments on the various choices made in the design of the framework, and how these fit into the broader ecosystem of genomics data. I wouldn't necessarily describe much of what follows as criticism of what the authors have done - the authors are of course free to design the framework and software that they want and think will be useful. And the authors clearly have done more than basically anyone else in the field to tackle these issues. But I still put forth the points below to provide some kind of wider discussion within the context of ancient genomics data management and its future.

      * * *

      The authors state that there is no existing archive for genotype data. This is not quite true. There is the European Variation Archive (EVA, https://www.ebi.ac.uk/eva/), which allows archiving of VCFs and is interlinked to raw data in the ENA/SRA/DDBJ. If appropriately used, the EVA and associated mainstream infrastructure could in principle be put to good use by the ancient genomics community. In practice, it's basically not used at all by the ancient genomics community, and partly this is because EVA doesn't quite provide exactly what's needed (in particular with regards to metadata fields). Poseidon aims to provide a much more custom-tailored solution for the most common use cases within the human ancient DNA field, but it could be argued that such a solution is only needed because the ancient genomics community has largely neglected the mainstream infrastructure. In some sense, by providing such a custom-tailored solution that is largely independent of the mainstream infrastructure, I feel like efforts such as Poseidon (and AADR) - while certainly very useful - might risk contributing to further misaligning the ancient genomics community from the rest of the genomics community, rather than bringing it closer. But the authors cannot really be blamed for that - they are simply providing a resource that will be useful to people given the current state of things.

      The BioSamples database (https://www.ebi.ac.uk/biosamples/) is an attempt to provide universal sample IDs across the life sciences and is used by the archives for sequence reads (ENA/SRA/DDBJ). Essentially every published ancient sample already has a BioSample accession, because this is required for the submission of sequence reads to ENA/SRA/DDBJ. It would thus have seemed natural to make BioSamples IDs a central component of Poseidon metadata, so as to anchor Poseidon to the mainstream infrastructure, but this is not really done. There are some links being made to ENA in the .ssf "sequence source" files used by the Poseidon package, including sample accessions, but this seems more ad-hoc.

      The package uses PLINK and EIGENSTRAT file formats to represent genotypes, which in my view are not particularly good formats for long-term and rigorous data management in genomics. These file formats cannot appropriately represent multiallelic loci, haplotype phase, or store information on genotype qualities, coverage, etc. The standard in the rest of genomics is VCF, a much more robust and flexible format with better software built around it. Insisting on keeping using these arguably outdated formats is one way in which the ancient genomics community risks disaligning itself from the mainstream.

      I could not find any discussion of reference genomes: knowing the reference genome coordinate system is essential to using any genotype file. For comparison, in the EVA archive, every VCF dataset has a "Genome Assembly" metadata field specifying the accession number of the reference genome used. It would seem to me like a reference genome field should be part of a Poseidon package too. In practice, the authors likely use some variant of the hg19 / GRCh37 human reference, which is still widely used in ancient genomics despite being over a decade out of date. Insisting on using an outdated reference genome is one way in which the ancient genomics community is disaligning itself from the mainstream, and it complicates comparisons to data from other sub-fields of genomics.

      A fundamental issue contributing to the genome merging problem, not unique to ancient DNA, is that genotype files are typically filtered to remove sites that are not polymorphic within the given study - this means that files from two different studies will often contain different and not fully overlapping sets of sites, greatly complicating systematic merging. I don't see any discussion of how Poseidon deals with this. In practice, it seems the authors are primarily concerned with data on the commonly used 1240k array set, such that the set of SNPs is always well-defined. But does Poseidon deal with the more general problem of non-overlapping sites between studies, or is this issue simply left to the user to worry about? This would be of relevance to whole-genome sequencing data, and there are certainly plenty of whole-genome datasets of great interest to the research community (including archaic human genomes, etc).
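
      The two obvious ways of handling non-overlapping site sets can be sketched as follows. This is a generic illustration and makes no claim about how Poseidon or its tools actually behave; the function names, sample names, sites and genotypes are all hypothetical. Genotypes are coded as alternative-allele counts (0, 1, 2), with `None` for missing.

```python
MISSING = None

def site_set(study):
    """Collect every site that appears in any sample of a study."""
    sites = set()
    for genotypes in study.values():
        sites.update(genotypes)
    return sites

def merge_studies(study_a, study_b, mode="intersect"):
    """Merge two {sample: {site: genotype}} tables onto one shared site list."""
    sites_a, sites_b = site_set(study_a), site_set(study_b)
    if mode == "intersect":
        sites = sites_a & sites_b   # keep only sites reported by both studies
    elif mode == "union":
        sites = sites_a | sites_b   # keep all sites, fill the gaps as missing
    else:
        raise ValueError(f"unknown mode: {mode}")
    ordered = sorted(sites)
    merged = {}
    for study in (study_a, study_b):
        for sample, genotypes in study.items():
            merged[sample] = {site: genotypes.get(site, MISSING) for site in ordered}
    return merged

# Hypothetical studies; sites are (chromosome, position) pairs.
study_a = {"sample_1": {("1", 1000): 0, ("1", 2000): 1}}
study_b = {"sample_2": {("1", 2000): 2, ("1", 3000): 1}}

shared = merge_studies(study_a, study_b, mode="intersect")
combined = merge_studies(study_a, study_b, mode="union")
```

      The intersection discards data, while the union marks as missing any site that one study dropped for being monomorphic, even though those samples may in fact carry the reference allele there. That ambiguity is the core of the problem described above.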

      In principle, it seems the framework could be species-agnostic and thus be useful more generally beyond humans (perhaps it would be enough to add just one more "species" metadata field?). It is of course up to the authors to decide how broadly they want to cater.

    3. Reviewer #2 (Public Review):

      Summary:

      Schmid et al. provide details of their new data management tool Poseidon which is intended to standardise archaeogenetic genotype data and combine it with the associated standardised metadata, including bibliographic references, in a way that conforms to FAIR principles. Poseidon also includes tools to perform standard analyses of genotype files, and the authors pitch it as the potential first port of call for researchers who are planning on using archaeogenetic data in their research. In fact, Poseidon is already up and running and being used by researchers working in ancient human population genetics. To some extent, it is already on its way to becoming a fundamental resource.

      Strengths:

      A similar ancient genomics resource (the Allen Ancient DNA Resource) exists, but Poseidon is several steps ahead in terms of integration and standardisation of metadata, its intrinsic analytical tools, its flexibility, and its ambitions towards being independent and entirely community-driven. It is clear that a lot of thought has gone into each aspect of what is a large and dynamic package of tools and overall it is systematic and well thought through.

      Weaknesses:

      The main weakness of the plans for Poseidon, which admirably the authors openly acknowledge, is in how to guarantee it is maintained and updated over the long term while also shifting to a fully independent model. The software is currently hosted by the MPI, although the authors do set out plans to move it to a more independent venue. However, the core team comprising the authors is funded by the MPI, and so the MPI is also the main funder of Poseidon. The authors do state their ambition to move towards a community-driven independent model, but the details of how this would happen are a bit vague. The authors imagine that authors of archaeogenetic papers would upload data themselves, thereby making all authors of archaeogenetics papers the voluntary community who would take on the responsibility of maintaining Poseidon. Archaeogeneticists generally are committed enough to their field that there is a good chance such a model would work but it feels haphazard to rely on goodwill alone. Given there needs to be a core team involved in maintaining Poseidon beyond just updating the database, from the paper as it stands it is difficult to see how Poseidon might be weaned off MPI funding/primary involvement and what the alternative is. However, the same anxieties always surround these sorts of resources when they are first introduced. The main aim of the paper is to introduce and explain the resource rather than make explicit plans for its future and so this is a minor weakness of the paper overall.

    4. Author response:

      We thank the editors and reviewers for their thorough engagement with the manuscript and their well-informed comments on the Poseidon framework. We are pleased to note that they consider Poseidon a promising and timely attempt to resolve important issues in the archaeogenetics community. We also agree with the main challenges they raise, specifically the lack of long-term, independent infrastructure funding at the time of writing, and various aspects of Poseidon that bear the potential to further consolidate a de-facto alienation of the aDNA community from the wider field of genomics.

      Poseidon is indeed dependent on the Department of Archaeogenetics at MPI-EVA. For the short to middle-term future (3-5 years) we consider this dependency beneficial, providing a reliable anchor point and direct integration with one of the most proficient data-producing institutions in archaeogenetics. For the long term, as stated in the discussion section of the manuscript, we hope for a snowball effect in the dissemination and adoption of Poseidon to establish it as a valuable community resource that automatically attracts working time and infrastructure donations. To kickstart this process we have already intensified our active community outreach and teach Poseidon explicitly to (early career) practitioners in the field. We are aware of options to apply for independent infrastructure funding, for example through the German National Research Data Infrastructure (NFDI) initiative, and we plan to explore them further.

      As the reviewers have noted, key decisions in Poseidon’s data storage mechanism have been influenced by the special path archaeogenetics has taken compared to other areas of genomics. The founding goal of the framework was to integrate immediately with established workflows in the field. Nevertheless we appreciate the concrete suggestions on how to connect Poseidon better with the good practices that emerged elsewhere. We will explicitly address the European Variation Archive in a revised version of the manuscript, deliberate embedding the BioSamples ID of the INSDC databases more prominently in the .janno file, prioritise support for VCF next to EIGENSTRAT and PLINK and add an option to clearly document the relevant human reference genome on a per-sample level. In the revised version of the text we will also explain the treatment of non-overlapping SNPs between studies by trident’s forge algorithm and how we imagine the interplay of different call sets in the Poseidon framework in general.

      Beyond these bigger concerns we will also consider and answer the various more detailed recommendations kindly shared by the reviewers, not least the question of how we imagine Poseidon to be used by archaeologists and for archaeological data.

    1. eLife assessment

      The study presents valuable findings on the role of RIPK1 in maintaining liver homeostasis under metabolic stress. Strengths include the intriguing findings that RIPK1 deficiency sensitizes the liver to acute liver injury and apoptosis, but because the conclusions require additional experimental support, the evidence is incomplete.

    2. Reviewer #1 (Public Review):

      This study presents an investigation into the physiological functions of RIPK1 within the context of liver physiology, particularly during short-term fasting. Through the use of hepatocyte-specific Ripk1-deficient mice (Ripk1Δhep), the authors embarked on an examination of the consequences of Ripk1 deficiency in hepatocytes under fasting conditions. They discovered that the absence of RIPK1 sensitized the liver to acute injury and hepatocyte apoptosis during fasting, a finding of significant interest given the crucial role of the liver in metabolic adaptation. Employing a combination of transcriptomic profiling and single-cell RNA sequencing techniques, the authors uncovered intricate molecular mechanisms underlying the exacerbated proinflammatory response observed in Ripk1Δhep mice during fasting. While the investigation offers valuable insights into the consequences of Ripk1 deficiency in hepatocytes during fasting conditions, there appears to be a primarily descriptive nature to the study with a lack of clear connection between the experiments. Thus, a stronger focus is warranted, particularly on understanding the dialogue between hepatocytes and macrophages. Moreover, the data would benefit from reinforcement through additional experiments such as Western blotting, flow cytometry, and rescue experiments, which would offer a more quantitative aspect to the findings. By incorporating these enhancements, the study could achieve a more comprehensive understanding of the underlying mechanisms and ultimately strengthen the overall impact of the research.

      Detailed major concerns:

      Related to Figure 1.

      It is imperative to ensure consistency in the number of animals analyzed across the different graphs. The current resolution of the images appears to be low, resulting in unsharp visuals that hinder the interpretation of data beyond the presence of "white dots". To address this issue, it is recommended to enhance the resolution of the images and consider incorporating zoom-in features to facilitate a clearer visualization of the observed differences. Moreover, it would be beneficial to include a complete WB analysis for the cell death pathways analyzed. These adjustments will significantly improve the clarity and interpretability of Figure 1.

      Related to Figure 2.

      It is essential to ensure consistency in the number of animals analyzed across the different graphs, as indicated by n=6 in the figure legend (similar to Figure 1). Additionally, it is crucial to distinguish between male and female subjects in the dot plots to assess any potential gender-based differences, which should be consistent throughout the paper. To achieve this, the dot plots should be harmonized to clearly differentiate between males and females and investigate if there are any disparities between the genders. Moreover, it is imperative to correlate hepatic inflammation with the activation of Kupffer cells, infiltrating monocytes, and/or hepatic stellate cells (HSCs). Therefore, conducting flow cytometry would be instrumental in achieving this correlation. Additionally, the staining for Ki67 appears to be non-specific, showing a granular pattern reminiscent of bile crystals rather than the expected nuclear staining of hepatocytes or immune cells. It is crucial to ensure specific staining for Ki67, and conducting in vitro experiments on primary hepatocytes could further elucidate the proliferation process. These experiments are relatively straightforward to implement and would provide valuable insights into the mechanisms underlying hepatic inflammation and proliferation.

      Related to Figure 3 & related to Figure 4.

      The immunofluorescence data presented are not entirely convincing and are insufficient to conclusively demonstrate the recruitment of monocytes. Previous suggestions for flow cytometry studies remain pertinent and are indeed necessary to bolster the robustness of the data and conclusions. Conducting flow cytometry analyses would provide more accurate and quantitative assessments of monocyte recruitment, ensuring the reliability of the findings and strengthening the overall conclusions of the study. Regarding the single-cell RNA sequencing analysis presented in the manuscript, it's worth questioning its relevance and depth of information provided. While it successfully identifies a quantitative difference in the cellular composition of the liver between control and knockout mice, it may fall short in elucidating the intricate interactions between different cell populations, which are crucial for understanding the underlying mechanisms of hepatic inflammation. Therefore, I propose considering alternative bioinformatic analyses, such as CellPhoneDB or CellChat, which could potentially provide a more comprehensive understanding of the cellular dynamics and interactions within the liver microenvironment. By examining the dialogue between different cell clusters, these analyses could offer deeper insights into the functional consequences of Ripk1 deficiency in hepatocytes and its impact on hepatic inflammation during fasting.

      Related to Figure 5.

      What additional insights do the data from Figure 5 provide compared to the study published in Nat Comms, which demonstrated that RIPK1 regulates starvation resistance by modulating aspartate catabolism (PMID: 34686667)?

      Related to Figure 6.

      The data presented in Figure 7 are complementary and do not introduce new mechanistic insights.

      Related to Figure 7.

      The data from Figure 7 suggest that RIPK1 in hepatocytes is responsible for the observed damage. However, it has been previously demonstrated that inhibition of RIPK1 activity in macrophages protects against the development of MASLD (PMID: 33208891). One possible explanation for these findings could be that the overreaction of macrophages to fasting, coupled with the absence of RIPK1 in hepatocytes (an indirect effect), contributes to the observed damage. Considering this, complementing hepatocytes with a kinase-dead version of RIPK1 could be a valuable approach to further refine the molecular aspect of the study. This would allow for a more precise investigation into the specific role of RIPK1's scaffolding or kinase function in response to starvation in hepatocytes. Such experiments could provide additional insights into the mechanisms underlying the observed effects and help delineate the contributions of RIPK1 in different cell types to metabolic stress responses.

    3. Reviewer #2 (Public Review):

      Summary:

      Zhang et al. analyzed the functional role of hepatocyte RIPK1 during metabolic stress, particularly its scaffold function rather than kinase function. They show that Ripk1 knockout sensitizes the liver to cell death and inflammation in response to short-term fasting, a condition that would not induce obvious abnormality in wild-type mice.

      Strengths:

      The findings are based on a knockout mouse model and supported by bulk RNA-seq and scRNA-seq. The work consolidates the complex role of RIPK1 in metabolic stress.

      Weaknesses:

      However, the findings are not novel enough because the pro-survival role of RIPK1 scaffold is well-established and several similar pieces of research already exist. Moreover, the mechanism is not very clear and needs additional experiments.

    4. Author response:

      We wish to express our sincere acknowledgement to the reviewers and the editors for the time and the effort spent in reviewing our manuscript. We highly appreciate the positive feedback and the thorough and constructive comments.

      We plan to conduct additional experiments to address the reviewers’ concerns.

      (1) We plan to utilize the RIPK1 kinase dead mice to investigate the role of RIPK1 kinase activity in these metabolic stress responses.

      (2) We plan to conduct flow cytometry analysis to detect the percentage or number of different cell types in fasted liver tissue, to provide more accurate and quantitative assessments of monocyte recruitment.

      (3) We plan to conduct more western blotting to detect the expression of related molecules in the signal transduction pathway, to further clarify the underlying mechanisms.

      (4) Regarding the single-cell RNA sequencing analysis, we plan to conduct CellChat analysis to provide information about the interactions between different cell populations.

      (5) We will fix the issues regarding the data graphs and image resolutions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      This study is very well framed and the writing is very clear. The manuscript is well organized and easy to follow and overall the previous state of the art of the field is taken into account. I only have a couple of minor comments:

      (1) There is a preprint that uses single nuclei RNA-Seq and ST on human MS subcortical white matter lesions doi: https://doi.org/10.1101/2022.11.03.514906. This work needs to be included in the discussion of the results. 

      (1.1) We appreciate the reviewer bringing up this important preprint, and we have referenced it in the Discussion section of our updated manuscript. 

      (2) The discussion should include the overall limitations of the study and how much it can be translated to human MS. Specifically, the current work uses EAE and therefore different disease stages are not captured in this study. This point is also raised by other reviewers. 

      (1.2) We thank the reviewer for raising this important point, and we have included additional discussion about the limitations of EAE and its disease relevance to MS.

      Reviewer #2 (Recommendations For The Authors):

      The authors state that this EAE model is better for studying cortical gradients because previous models "such as directly injecting inflammatory cytokines into the meninges/cortex" cause a traumatic injury. It needs to be discussed that these models have now been superseded by more refined models involving long-term overexpression of pro-inflammatory cytokines in the sub-arachnoid space, thereby avoiding traumatic injury. The current results should be discussed in light of these newer models (James et al, 2020; 2022), which are more similar to MS cortical pathology and do exhibit lymphoid-like structures. 

      (2.1) We thank the reviewer for pointing out these relevant studies, and we agree they describe non-traumatic and more MS-relevant models of leptomeningeal inflammation. We have included discussion of these works in the updated manuscript.  

      • The study will be substantially improved if some of the ST data is validated at least partially with some RNAscope or other in situ hybridization using a subset of probes that capture the take-home message of the paper. 

      (2.2) We agree with the reviewer that validation of transcriptomics results is important to support our conclusions. In the updated manuscript Figure 5 and Supplemental Figure 6 we have added RNAscope results for relevant genes. In agreement with the trends noted in the manuscript, expression of genes related to antigen processing and presentation such as B2m decreases gradually with distance from LMI. We also have included a reference to a newly published manuscript from our group (Gupta et al., 2023, J. Neuroinflammation) that characterizes meningeal inflammation and sub-pial changes in the SJL EAE model. In that manuscript, IHC is used to show accumulation of B cells and T cells in the leptomeningeal space, increased microglial and astrocyte reactivity adjacent to leptomeningeal inflammation, and reduction of neuronal markers adjacent to leptomeningeal inflammation.  

      • The lack of change in signaling pathways involved in B-cell/T-cell interaction and cytokine/chemokine signaling, which would be expected in areas of immune cell aggregation in the meninges, needs discussion. 

      (2.3) While we detected significant upregulation in antigen presentation, complement activation, and humoral immune signaling, areas of meningeal inflammation identified as cluster 11 showed upregulation of numerous other GO gene sets associated with immune cell interaction and cytokine signaling, as described in supplementary table 3. These include T-cell receptor binding, CCR chemokine receptor binding, interleukin 8 production, response to interleukin 1, positive regulation of interleukin-6 production, tumor necrosis factor production, leukocyte cell-cell adhesion. Overall, we believe that the collection of enriched gene sets is consistent with peripheral myeloid and lymphoid infiltration and cytokine production, with the most prominent cytokine / pathways being interferon γ/antigen processing and presentation, complement, and humoral inflammation.

      • Fig 4 subclusters includes T-cell activation, pos regulation of neuronal death, cellular response to IFNg, neg regulation of neuronal projections, Ig mediated immune response, cell killing, pos regulation of programmed cell death, pos regulation of apoptotic process, but none of these are discussed despite their obvious importance. 

      (2.4) We agree with the reviewer that these upregulated genesets warrant additional discussion and have added additional reference to these genesets in the results section. Also, the genesets ‘positive regulation of programmed cell death’, ‘positive regulation of apoptotic process’, and ‘positive regulation of cell death’ were erroneously included in Figure 4F in the initial manuscript, as they are actually downregulated in cluster 1_4. This has been clarified in the text.

      • Subcluster 11 appears spatially to represent the meninges, but what pathways are expressed there? 330 genes/pathways altered independent of other clusters - immune cell regulation? 

      (2.5) We refer the reviewer to Supplementary Table 3, which contains a complete list of GO genesets enriched within cluster 11 spots.

      • The surprising lack of immunoglobulin genes upregulated in the meninges of the mice, considering these are the genes most upregulated in the MS meninges. Should be pointed out and discussed. 

      (2.6) We appreciate the reviewer bringing up immunoglobulin genes, which previous publications have shown are elevated in MS meninges and cortical grey matter lesions. Consistent with this, several immunoglobulin genes are elevated in cluster 11, including genes encoding IgG2b, IgA, and IgM. While these results were available within the original submission in Supplementary Table 2, we have included the graph in the updated Supplementary Figure 3.

      • Meningeal signature may be poorly represented given the individual slices shown in suppl 3A, which suggests that only 3 of the EAE slices had significant meningeal infiltrates, indicated by cluster 11 genes.  

      (2.7) There was heterogeneity in the location and extent of meningeal infiltrate / cluster 11 in the EAE slices, as the reviewer points out. 2 slices had severe inflammation, 2 had moderate inflammation, and 2 had relatively mild inflammation, but all EAE slices were enriched in inflammation relative to naïve as demonstrated not only through clustering, but also through enriched marker analysis between EAE and Naive and Progeny analysis.  

      • The ST is not resolving the meningeal tissue and the immediate underlying grey matter, as demonstrated by a high signal for both CXCL13 and GFAP in cluster 11. 

      (2.8) We agree that the spatial transcriptomics strategy applied here is inadequate to precisely delineate between meningeal inflammation and the underlying brain parenchyma, and that the elevation of markers such as GFAP in cluster 11 indicates some ‘contamination’ of parenchymal cells into cluster 11. We have clarified this in the text and discussed the limitation of the spatial transcriptomics method used.  

      • More information is required concerning how many animals were used in this study, to meet the requirements for complying with the 3Rs. 

      (2.9) A total of 4 mice were used per group. In the naïve group one mouse contributed two slices, for a total of 5 naïve slices. In the EAE group two mice contributed two slices, for a total of 6 EAE slices. We have clarified this in the methods section of the updated manuscript.

      Reviewer #3 (Recommendations For The Authors):

      The authors should provide a more thorough description of the methodology, and there are a few minor concerns about experimental details, data presentation, and description that need to be addressed. In the next few lines, I will highlight a few important aspects that need to be addressed, propose some changes to the main manuscript, and suggest some additional experiments that, if successful, could confirm/support/further strengthen the conclusions that are at this point purely based on transcriptomic data. 

      Major comments/suggestions: 

      • The main gene expression changes between the control and EAE groups obtained via spatial transcriptomics need to be validated with another technique, at least partially. I suggest performing RNAscope or immunofluorescence imaging using brain sections from a new and independent cohort of animals, where cell-specific markers can also be tested. This type of assessment would work as a validation method and could also inform about the cell-specific contribution to the observed transcriptomic changes. 

      (3.1) Please refer to response 2.2 

      • The representative qualitative spatial expression heatmaps for each gene in Fig. 1F should be accompanied by corresponding graphs with quantitative measurements. Similar to what is done regarding the data in Fig. 2B and D. 

      (3.2) We agree with the reviewer that quantitative graphs were missing, and we have included them in the updated Supplementary Figure 1. 

      • A supplementary table discriminating all the DEGs (132 up and 70 downregulated) between cluster 11 and the other clusters has to be provided. What is the contribution of recruited encephalitogenic adaptive immune cells to this cluster 11 gene signature? 

      (3.3) These unfiltered results are provided in Supplementary Table 2, and to view the up and down regulated genes the reader can sort the table based on fold change and adjusted P value. We believe providing the complete table is more useful to the reader, since the fold change and P value thresholds used to determine “significance” are arbitrary. Since the spatial transcriptomics method used in this work does not have single cell resolution, we cannot accurately estimate the contribution of encephalitogenic adaptive immune cells in cluster 11. However, given previously published work of lymphocyte infiltration into the subarachnoid space in SJL EAE (Gupta et al., 2023, J. Neuroinflammation) and the enrichment of Cd3e in cluster 11 (Log2FC 0.31, adjusted P-val 0.005) we assume some contribution of peripheral lymphocytes.
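
      As a rough illustration of the sorting and filtering described in response 3.3, a reader could split an unfiltered differential-expression table into up- and downregulated genes along the lines of the sketch below. The column names (`gene`, `log2FC`, `padj`), the cutoffs, and every row except Cd3e are hypothetical placeholders, not the actual layout or contents of Supplementary Table 2; the cutoffs are arguments precisely because, as the authors note, any threshold is arbitrary.

```python
import csv
import io


def split_degs(rows, lfc_col="log2FC", padj_col="padj",
               min_abs_lfc=0.25, max_padj=0.05):
    """Return (up, down) lists of rows passing both cutoffs, strongest first.

    Column names and default cutoffs are illustrative assumptions only.
    """
    kept = []
    for row in rows:
        lfc = float(row[lfc_col])
        padj = float(row[padj_col])
        if padj <= max_padj and abs(lfc) >= min_abs_lfc:
            kept.append({**row, lfc_col: lfc, padj_col: padj})
    # Upregulated genes: largest positive fold change first.
    up = sorted((r for r in kept if r[lfc_col] > 0), key=lambda r: -r[lfc_col])
    # Downregulated genes: most negative fold change first.
    down = sorted((r for r in kept if r[lfc_col] < 0), key=lambda r: r[lfc_col])
    return up, down


# Toy table for illustration; only the Cd3e values are quoted from the response.
EXAMPLE_CSV = """gene,log2FC,padj
Cd3e,0.31,0.005
GeneA,1.20,0.0001
GeneB,-0.80,0.01
GeneC,0.05,0.9
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
up, down = split_degs(rows)
up_genes = [r["gene"] for r in up]
down_genes = [r["gene"] for r in down]
print("up:", up_genes)
print("down:", down_genes)
```

      Changing `min_abs_lfc` or `max_padj` changes which genes count as "significant", which is the reason given for sharing the complete table rather than a pre-filtered list.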

      • The authors mention that there is grey matter pathology in this relapse model, and this has been shown in a previous publication (Bhargava et al., 2021). However, the regions analyzed in the present study are different from the ones shown in the referenced paper. Is there an overexpression of genes involved in, or gene modules indicative of, neuronal stress and/or death that spatially overlap with clusters 1 and 2? If so, it would be important to provide information about those gene modules in the main figures. It would also be quite relevant to show the levels of cell stress/death proteins and of axonal stress/damage, by APP and/or nonphosphorylated SMI-32 staining, in the deep brain regions (like the thalamus), to corroborate the link between these phenomena and the gene signatures of subclusters 1_3, 1_4, and 2_6. 

      (3.4) We thank the reviewer for this insightful comment. We have recently published a manuscript that histologically analyzes leptomeningeal inflammation in the SJL EAE model, specifically assessing the areas looked at in our submitted manuscript (Gupta et al., 2023, J. Neuroinflammation). In that manuscript, IHC is used to show accumulation of B cells and T cells in the leptomeningeal space, increased microglial and astrocyte reactivity adjacent to leptomeningeal inflammation, and reduction of neuronal markers adjacent to leptomeningeal inflammation. To further describe the gene modules in the inflammatory subclusters 1_3/1_4/2_6, we have now provided heatmaps of the selected genesets and their constituent genes (Supplementary Figure 5). 

      • It would be important to provide heatmaps discriminating the DEGs that make the gene modules that are significantly altered in subclusters 1_3, 1_4, and 2_6. The gene ontology terms are sometimes ambiguous. For instance, it would be very informative to the reader (and to the field) to know which altered genes compose the "lysosome", "immune response", "response to stress", or "B cell mediated immunity" pathways that are altered in the EAE subcluster 1_3 (Fig. 4E). The same applies to the gene modules altered in the other subclusters of interest. Authors should also consider generating a Venn diagram with the DEGs from subclusters 1_3, 1_4, and 2_6, to complement the GO term Venn presented in Fig. 4H. Having these pieces of information readily available, either as main or supplementary figures, would be a great addition. 

      (3.5) We agree with the reviewer on this point and have included these heatmaps in Supplementary Figure 5. 

      • The role of IFN-gamma as well as B cells (and Igs) in myelination/remyelination is mentioned in the discussion. However, there is very little evidence that these cells or their cytokines/Igs are mediating the described transcriptomic signatures at the level of the brain parenchyma of EAE mice undergoing relapse. Do the "antigen processing and presentation, cell killing, interleukin 6 production, and interferon gamma response" GO terms, which better fitted the trajectory analysis, in fact include genes expressed almost exclusively by T and/or B cells? Are there genes that are downstream of IFN type I or II signaling? 

      (3.6) Pathways including antigen processing / presentation, humoral inflammation, complement, among others were enriched in areas of meningeal inflammation and adjacent areas of parenchyma. These signaling pathways are mediated by effector molecules, many of which are produced by lymphocytes, but that can act on cells within the CNS parenchyma. The heatmaps in Supplementary Figure 5 demonstrate the significant role of MHC and complement genes, which could be expressed by leukocytes as well as glia, on many of the pathways.

      • Is the transcriptomic overlap between meningeal and brain parenchymal regions, or the appearance of signatures similar to the parenchymal subclusters 1_3, 1_4, and 2_6, prevented if the mice are treated with the murine versions of natalizumab or rituximab prior to relapse? 

      (3.6) We appreciate the reviewer's suggestion. Our future directions for this work include testing the effects of disease-modifying therapies on spatial and single-cell transcriptomic readouts of disease in SJL EAE.

      • Please clarify what control group was used in this study. Naïve mice are mentioned in the Results section, does this mean that control animals were not injected with CFA? Authors should also elaborate on the descriptive methodology employed for the analysis of the spatial transcriptomics data - especially regarding the trajectory analysis. As is, overall, the methodology description might not favor reproducibility. 

      (3.7) We appreciate the need for clarification here. Our control group in this study was naïve, not having received any CFA or pertussis toxin. While often used as the control in EAE studies focused on mechanisms of autoimmunity, CFA and pertussis toxin independently induce systemic inflammation. Since in this study we were interested in neuroinflammation broadly, we chose to use a naïve comparison group to maximize our ability to find genes enriched in neuroinflammation. We have elaborated our methods section, including methods related to trajectory analysis. 

      Minor comments/suggestions: 

      In Fig. 1D the indication of the rostral to ventral axis needs to be inverted. 

      Addressed.

      In Fig. 1E the authors should also include a representative H&E staining of the same region in a control animal. 

      Addressed.

      There is inconsistency in the number of clusters obtained after UMAP unbiased clustering of the spatial transcriptomic data: 

      • Fig. 3A-E - twelve clusters are shown (cluster 0 to 11). 

      • In the Results section eleven clusters are mentioned - "we performed unbiased UMAP clustering on the spatial transcriptomic dataset and identified 11 distinct clusters".

      The text was incorrect, there were 12 distinct clusters. This has been corrected.

      Considering the mouse strain used was SJL/J, the peptide used to induce EAE should be PLP139-151, as mentioned in the Methods section "Induction of SJL EAE". However, the legend of Fig. 1 mentions "post immunization with MOG 35-55". Please correct this. 

      Corrected.

      In the Methods section it is mentioned "At 12 weeks post-immunization, animals were euthanized", however the Results section mentions that tissues were harvested at 11 weeks post-immunization - "Brain slices were collected from four naïve mice and four EAE mice 11 weeks postimmunization". Please correct this. 

      The Methods were incorrect, this has now been fixed. 

      Please clarify the number of animals used for spatial transcriptomic analysis: 

      • Legend of Fig. 1 mentions "Red arrows indicate MRI time points, black arrow indicates time of tissue harvesting (N = 6)." Whilst in the Results section it states "Brain slices were collected from four naïve mice and four EAE mice". 

      The Figure 1 legend has now been corrected (N = 4). Additionally, we have added clarification about the number of animals / slices used in the Methods section (see response 2.9).

      Please be consistent in the way of representing DEGs in the MA plots: 

      • Fig. 3F shows the upregulated genes (in red) on the right and the downregulated genes (in blue) on the left. 

      • Supplemental Fig. 2K shows the upregulated genes (in red) on the left and the downregulated genes (in blue) on the right. 

      • Supplemental Fig. 4 shows the upregulated genes on the right in blue, while the downregulated genes are in red. 

      This has been fixed.

      The letters attributed to each subcluster in panels E-G of Fig. 4 are different from the respective figure legend. 

      This has been fixed.

      Correct the legend of supplemental figure 2: "(G-H) Representative spatial feature plots of read count (F) and UMI (G) demonstrate expected anatomic variability in transcript amount and diversity.". 

      This has been fixed.

      In Supplemental Fig. 4G there is probably an error with the XX axis, since the significantly up and down-regulated genes are not visible. 

      This has been fixed.

    2. eLife assessment

      Brain inflammation is a hallmark of multiple sclerosis. Using novel spatial transcriptomics methods, the authors provide solid evidence for a gradient of immune genes and inflammatory markers from the meninges toward the adjacent brain parenchyma in a mouse model. This important study advances our understanding of the mechanisms of brain damage in this autoimmune disease. However, the control mouse groups are not well designed to rule out confounding effects, a limitation that needs to be acknowledged and addressed.

    3. Reviewer 1 (Public Review):

      Multiple sclerosis (MS) is a debilitating autoimmune disease that causes loss of myelin in neurons of the central nervous system. MS is characterized by the presence of inflammatory immune cells in several brain regions as well as the brain barriers (meninges). This study aims to understand the local immune hallmarks in regions of the brain parenchyma that are adjacent to the leptomeninges in a mouse model of MS. The leptomeninges are known to be a focus of inflammation in MS and perhaps "bleed" inflammatory cells and molecules to adjacent brain parenchyma regions. To do so, the authors use a novel technology called spatial transcriptomics so that the spatial relationships between the two regions remain intact. The study identifies canonical inflammatory genes and gene sets such as complement and B cells enriched in the parenchyma in close proximity to the leptomeninges in the mouse model of MS but not control. The manuscript is very well written and easy to follow. The results will become a useful resource to others working in the field and can be followed by time series experiments where the same technology can be applied to the different stages of the disease.

      Comments on revised version:

      I agree that the authors successfully addressed most of my comments/critiques.

      However, the fact that the control mice were not injected with CFA is somewhat concerning, because it will be hard to interpret the cause of the transcriptomic readouts described in this study. Some of the described effects might be due to CFA (which was used in the EAE but not the "naive" group), and not necessarily to the relapsing-remitting EAE immune features recapitulated in this mouse model. Moreover, this caveat associated with the "naive" control group is not being clearly stated throughout the manuscript and might go unnoticed to readers.

      The authors should clearly state, in the methods section (in the section "Induction of SJL EAE"), that the naive control group was not injected with CFA.

      Additionally, this potential confounder, of not using a control group injected with the same CFA regimen of the EAE group, should be mentioned in paragraph two of the discussion alongside the other limitations of the study already highlighted by the authors (or in another section of the discussion).

    4. Reviewer 2 (Public Review):

      Accumulating data suggest that the presence of immune cell infiltrates in the meninges of the multiple sclerosis brain contributes to the tissue damage in the underlying cortical grey matter by the release of inflammatory and cytotoxic factors that diffuse into the brain parenchyma. However, little is known about the identity and direct and indirect effects of these mediators at a molecular level. This study addresses the vital link between an adaptive immune response in the CSF space and the molecular mechanisms of tissue damage that drive clinical progression. In this short report the authors use a spatial transcriptomics approach using Visium Gene Expression technology from 10x Genomics, to identify gene expression signatures in the meninges and the underlying brain parenchyma, and their interrelationship, in the PLP-induced EAE model of MS in the SJL mouse. MRI using a high field strength (11.7T) scanner was used to identify areas of meningeal infiltration for further study. They report, as might be expected, the upregulation of genes associated with the complement cascade, immune cell infiltration, antigen presentation, and astrocyte activation. Pathway analysis revealed the presence of TNF, JAK-STAT and NFkB signaling, amongst others, close to sites of meningeal inflammation in the EAE animals, although the spatial resolution is insufficient to indicate whether this is in the meninges, grey matter, or both.

      UMAP clustering illuminated a major distinct cluster of upregulated genes in the meninges and smaller clusters associated with the grey matter parenchyma underlying the infiltrates. The meningeal cluster contained genes associated with immune cell functions and interactions, cytokine production, and action. The parenchymal clusters included genes and pathways related to glial activation, but also adaptive/B-cell mediated immunity and antigen presentation. This again suggests a technical inability to resolve fully between the compartments as immune cells do not penetrate the pial surface in this model or in MS. Finally, a trajectory analysis based on distance from the meningeal gene cluster successfully demonstrated descending and ascending gradients of gene expression, in particular a decline in pathway enrichment for immune processes with distance from the meninges.

      Comments on revised version:

      The authors have addressed all of my comments regarding the lack of spatial resolution between the grey matter and the overlying meninges and also concerning the difficulties in extrapolating from this mouse model to MS itself.

      I am however very concerned about the lack of the correct control group. Immunization of rodents with complete Freund's adjuvant (albeit with pertussis toxin) gives rise to widespread microglial activation, some immune cell infiltration and also structural changes to axons, particularly at nodes of Ranvier (https://doi.org/10.1097/NEN.0b013e3181f3a5b1). This will inevitably make it difficult to interpret the transcriptomics results, depending on whether these changes are reversible or not and the time frame of the reversal. In the C57BL/6 EAE models adjuvant-induced microglial activation becomes chronic, whereas the axonal changes do reverse by 10 weeks. Whether this is the same in the SJL EAE model using CFA alone is not clear.

    1. eLife assessment

      This study provides important insight into the mechanisms of proton-coupled oligopeptide transporters. It uses enhanced-sampling molecular dynamics (MD), backed by cell-based assays, revealing the importance of protonation of selected residues for PepT2 function. The simulation approaches are convincing, using long MD simulations, constant-pH MD and free energy calculations. Overall, the work has led to findings that will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides valuable information on the mechanism of PepT2 through enhanced-sampling molecular dynamics, backed by cell-based assays, highlighting the importance of protonation of selected residues for the function of a proton-coupled oligopeptide transporter (hsPepT2). The molecular dynamics approaches are convincing, but with limitations that could be addressed in the manuscript, including lack of incorporation of a protonation coordinate in the free energy landscape, possibility of protonation of the substrate, errors with the chosen constant pH MD method for membrane proteins, dismissal of hysteresis emerging from the MEMENTO method, and the likelihood of other residues being affected by peptide binding. Some changes to the presentation could be considered, including a better description of pKa calculations and the inclusion of error bars in all PMFs. Overall, the findings will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

      We would like to express our gratitude to the reviewers for providing their feedback on our manuscript, and also for recognising the variety of computational methods employed, the amount of sampling collected and the experimental validation undertaken. Following the individual reviewer comments, as addressed point-by-point below, we have prepared a revised manuscript, but before that we address some of the comments made above in the general assessment:

      • “lack of incorporation of a protonation coordinate in the free energy landscape”.

      We acknowledge that of course it would be highly desirable to treat protonation state changes explicitly and fully coupled to conformational changes. However, at this point in time, evaluating such a free energy landscape is not computationally feasible (especially considering that the non-reactive approach taken here already amounts to almost 1 ms of total sampling time). Previous reports in the literature tend to focus on either simpler systems or a reduced subset of a larger problem. As we were trying to obtain information on the whole transport cycle, we decided to focus here on non-reactive methods.

      • “possibility of protonation of the substrate”.

      The reviewers are correct in pointing out this possibility, which we had not discussed explicitly in our manuscript.  Briefly, while we describe a mechanism in which protonation of only protein residues (with an unprotonated ligand) can account for driving all the necessary conformational changes of the transport cycle, there is some evidence for a further intermediate protonation site in our data (as we commented on in the first version of the manuscript as well), which may or may not be the substrate itself. A future explicit treatment of the proton movements through the transporter, when it will become computationally tractable to do so, will have to include the substrate as a possible protonation site; for the present moment, we have amended our discussion to alert the reader to the possibility that the substrate could be an intermediate to proton transport. This has repercussions for our study of the E56 pKa value, where – if protons reside with a significant population at the substrate C-terminus – our calculated shift in pKa upon substrate binding could be an overestimate, although we would qualitatively expect the direction of shift to be unaffected. However, we also anticipate that treating this potential coupling explicitly would make convergence of any CpHMD calculation impractical to achieve and thus it may be the case that for now only a semi-quantitative conclusion is all that can be obtained.

      • “errors with the chosen constant pH MD method for membrane proteins”.

      We acknowledge that – as reviewer #1 has reminded us – the AMBER implementation of hybrid-solvent CpHMD is not rigorous for membrane proteins, and have therefore added a cautionary note to our paper. We also explain how the use of the ABFE thermodynamic cycle calculations helps to validate the CpHMD results in a completely orthogonal manner (we have promoted this validation, which was in the supplementary figures, into the main text in the revised version). We therefore remain reasonably confident in the results presented with regards to the reported pKa shift of E56 upon substrate binding, and suggest that if the impact of neglecting the membrane in the implicit-solvent stage of CpHMD is significant, then there is likely an error cancellation when considering shifts induced by the incoming substrate.

      • “dismissal of hysteresis emerging from the MEMENTO method”.

      We have shown in our method design paper how the use of the MEMENTO method drastically reduces hysteresis compared to steered MD for path generation, and find this improvement again for PepT2 in this study. We address reviewer #3’s concern about our presentation on this point by revising our introduction of the MEMENTO method, as detailed in the response below.

      • “the likelihood of other residues being affected by peptide binding”.

      In this study, we have investigated in detail the involvement of several residues in proton-coupled di-peptide transport by PepT2. Short of the potential intermediate protonation site mentioned above, the set of residues we investigate form a minimal set of sorts within which the important driving forces of alternating access can be rationalised.  We have not investigated in substantial detail here the residues involved in holding the peptide in the binding site, as they are well studied in the literature and ligand promiscuity is not the problem of interest here. It remains entirely possible that further processes contribute to the mechanism of driving conformational changes by involving other residues not considered in this paper. We have now made our speculation that an ensemble of different processes may be contributing simultaneously more explicit in our revision, but do not believe any of our conclusions would be affected by this.

      As for the additional suggested changes in presentation, we provide the requested details on the CpHMD analysis. Furthermore, we use the convergence data presented separately in figures S12 and S16 to include error bars on our 1D-reprojections of the 2D-PMFs in figures 3, 4 and 5. (Note that we have opted to not do so in figures S10 and S15 which collate all 1D PMF reprojections for the OCC ↔ OF and OCC ↔ IF transitions in single reference plots, respectively, to avoid overcrowding those necessarily busy figures). We have also changed the colours schemes of these plots in our revision to improve accessibility. We have additionally taken the opportunity to fix some typos and further clarified some other statements throughout the manuscript, besides the requests from the reviewers.

      Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      We would like to note here that it may be slightly misleading to the reader to state that “The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays.” The cell-based transport assays confirmed the importance of the extracellular gating trigger residues H87, S321 and D342 (as mentioned in the preceding sentence), not of the substrate-protonation link as this line might be understood to suggest.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 are closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well-designed and conducted. But I do have two concerns.

      a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the other describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342-  →  OCC/H87HD342H →  OF/H87HD342H as displayed in Figure 3.

      We concur with the reviewer that the ideal way of describing the processes studied in our paper would be as higher-dimensional free energy landscapes obtained from a simulation method that can explicitly model proton-transfer processes. Indeed, it would have been particularly interesting and potentially informative with regards to the movement of protons down into the transporter in the OF → OCC → IF sequence of transitions. As we note in our discussion on the H87→E56 proton transfer:

      “This could be investigated using reactive MD or QM/MM simulations (both approaches have been employed for other protonation steps of prokaryotic peptide transporters, see Parker et al. (2017) and Li et al. (2022)).  However, the putative path is very long (≈ 1.7 nm between H87 and E56) and may or may not involve a large number of intermediate protonatable residues, in addition to binding site water. While such an investigation is possible in principle, it is beyond the scope of the present study.” 

      Given that even sampling the proton transfer step itself in an essentially static protein conformation would push the boundaries of what has been achieved in the field, we believe that, with the current state of the art, a fully coupled investigation of large-scale conformational changes and proton-transfer reactions is not yet feasible in a practical time frame. We also note this limitation already when we say that:

      “The question of whether proton binding happens in OCC or OF warrants further investigation, and indeed the co-existence of several mechanisms may be plausible here”. 

      Nonetheless, we are actively exploring approaches to treat uptake and movement of protons explicitly for future work.

      In our revision, we have expanded on our discussion of the reasoning behind employing a non-reactive approach and the limitations that imposes on what questions can be answered in this study.

      Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC→ OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      The reviewer is correct in their assessment that the aligning of PMFs in these figures is arbitrary; no relative free energies of the PMFs to each other can be estimated without explicit free energy calculations at least of protonation events at the end state basins. The PMFs in our figures are merely superimposed for illustrating the differences in shape between the obtained profiles in each condition, as discussed in the text, and we now make this clear in the appropriate figure captions.

      b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not uptake/release proton(s).

      This is indeed an assumption inherent in the current work. While we do “speculate that the proton movement processes may happen as an ensemble of different mechanisms, and potentially occur contemporaneously with the conformational change” we do not in the previous version indicate explicitly that this may involve the substrate. We make clear the assumption and this possibility in the revised version of our paper. Indeed, as we discuss, there is some evidence in our PMFs of an additional protonation site not considered thus far, which may or may not be the substrate. We now make note of this point in the revised manuscript.

      As for what information can be drawn from the given experimental stoichiometries, we note in our paper that “a 2:1 stoichiometry was reported for the neutral di-peptide D-Phe-L-Ala and 3:1 for anionic D-Phe-L-Glu. (Chen et al., 1999) Alternatively, Fei et al. (1999) have found 1:1 stoichiometries for either of D-Phe-L-Gln (neutral), D-Phe-L-Glu (anionic), and D-Phe-L-Lys (cationic).” 

      We do not assume that it is our place to arbitrate among the apparent discrepancies in the experimental data here, although we believe that our assumed 2:1 stoichiometry is additionally “motivated also by our computational results that indicate distinct and additive roles played by two protons in the conformational cycle mechanism”.

      (2) I have more serious concerns about the CpHMD employed in the study.

      a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitations of their method.

      We discuss the limitations of the AMBER CpHMD implementation in the revised version. However, despite that, we believe we have in fact provided sufficient grounds for our conclusion that substrate binding affects ExxER motif protonation in the following way.

      In addition to CpHMD simulations, we establish the same effect via ABFE calculations, where the substrate affinity is different at the E56 deprotonated vs protonated protein. This was figure S20 before, though in the revised version we have moved this piece of validation into a new panel of figure 6 in the main text, since it becomes more important with the CpHMD membrane problem in mind. Since the ABFE calculations are conducted with an all-atom representation of the lipids and the thermodynamic cycle closes well, it would appear that if the chosen CpHMD method has a systematic error of significant magnitude for this particular membrane protein system, there may be the benefit of error cancellation. While the calculated absolute pKa values may not be reliable, the difference made by substrate binding appears to be so, as judged by the orthogonal ABFE technique.
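
The thermodynamic-cycle argument above can be sketched numerically. Closing the cycle apo/deprotonated → holo/deprotonated → holo/protonated against apo/deprotonated → apo/protonated → holo/protonated gives ΔpKa = (ΔG_bind,deprot − ΔG_bind,prot) / (RT ln 10). The function name, the kcal/mol units, the 310 K default and the example free energies below are all assumptions made for illustration; they are not taken from the manuscript.

```python
import math

# Gas constant in kcal/(mol*K); assumes free energies are supplied in kcal/mol.
R_KCAL_PER_MOL_K = 0.0019872041


def pka_shift_from_binding(dg_bind_deprot, dg_bind_prot, temperature_k=310.0):
    """Return pKa(holo) - pKa(apo) implied by two binding free energies.

    dg_bind_deprot: substrate binding free energy with the residue deprotonated.
    dg_bind_prot:   substrate binding free energy with the residue protonated.
    A positive result means substrate binding raises the residue's pKa.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    rt_ln10 = R_KCAL_PER_MOL_K * temperature_k * math.log(10.0)
    return (dg_bind_deprot - dg_bind_prot) / rt_ln10


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT values from the manuscript.
    shift = pka_shift_from_binding(dg_bind_deprot=-5.0, dg_bind_prot=-6.5)
    print(f"estimated pKa shift on binding: {shift:+.2f} units")
```

Because only the difference of the two binding free energies enters, systematic errors common to both ABFE legs cancel, which is the sense in which this estimate is orthogonal to the CpHMD titration.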

      Although the reviewer does “not expect the authors to redo their CpHMD simulations”, we consider that it may be helpful to the reader to share in this response some results from trials using the continuous, all-atom constant pH implementation that has recently become available in GROMACS (Aho et al 2022, https://pubs.acs.org/doi/10.1021/acs.jctc.2c00516) and can be used rigorously with membrane proteins, given its all-atom lipid representation.

      Unfortunately, when trying to titrate E56 in this CpHMD implementation, we found few protonation-state transitions taking place, and the system often got stuck in protonation state–local conformation coupled minima (which need to interconvert through rearrangements of the salt bridge network involving slow side-chain dihedral rotations in E53, E56 and R57). Author response image 1 shows this for the apo OF state; Author response image 2 shows how noisy attempts at pKa estimation from this data turn out to be, necessitating the use of a hybrid-solvent method.

      Author response image 1.

      All-atom CpHMD simulations of apo-OF PepT2. Red indicates protonated E56, blue is deprotonated.

      Author response image 2.

      Difficulty in calculating the E56 pKa value from the noisy all-atom CpHMD data shown in Author response image 1.

      b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation.

      In this study, we implicitly assumed from the outset that the substrate does not get protonated, which, as noted in our response to the comment above, we now acknowledge explicitly. This potential limitation for the available mechanisms for proton transfer also applies to our investigation of the ExxER protonation states. In particular, a semi-grand canonical ensemble that takes into account the possibility of substrate C-terminus protonation may also sample states in which the substrate is protonated and oriented away from R57, thus leaving the ExxER salt bridge network in an apo-like state. The consequence would be that while the direction of shift in E56 pKa value will be the same, our CpHMD may overestimate its magnitude. It would thus be interesting to make the C-terminus protonatable for obtaining better quantitative estimates of the E56 pKa shift (as is indeed true in general for any other protein protonatable residue, though the effects are usually assumed to be negligible). We do note, however, that convergence of the CpHMD simulations would be much harder if the slow degree of freedom of substrate reorientation (which in our experience takes tens to hundreds of nanoseconds in this binding pocket) needs to be implicitly equilibrated upon protonation state transitions. We discuss such considerations in the revised paper.

      Reviewer #2 (Public Review):

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). They examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions by mutating two of the residues that they predict to be critical for transport in cell-based transport assays. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ.

      Strengths: 

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1 ms of sampling. It gives insight into the transport mechanism, investigating how the protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data.

      Weaknesses: 

      Interestingly, they suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing but they have not identified this residue.

      We have indeed suggested that there may be an additional protonation site involved in the conformational cycle that we have not been able to capture, which – as we discuss in our paper – might be indicated by the shapes of the OCC ↔ IF PMFs given in Figure S15. One possibility is for this to be the substrate itself (see the response to reviewer #1 above) though within the scope of this study the precise pathway by which protons move down the transporter and the exact ordering of conformational change and proton transfer reactions remains a (partially) open question. We acknowledge this, denote it with question marks in the mechanistic overview we give in Figure 8 and also “speculate that the proton movement processes may happen as an ensemble of different mechanisms, and potentially occur contemporaneously with the conformational change”.

      Some things are a little unclear. For instance, where does the state that they have defined as occluded sit on the diagram in Figure 1a? - is it truly the occluded state as shown on the diagram or does it tend to inward- or outward-facing?

      Figure 1a is a simple schematic overview intended to show which structures of PepT2 homologues are available to use in simulations. This was not meant to be a quantitative classification of states. Nonetheless, we can note that the OCC state we derived has extra- and intracellular gate opening distances (as measured by the simple CVs defined in the methods and illustrated in Figure 2a) that indicate full gate closure at both sides. In particular, although it was derived from the IF state via biased sampling, the intracellular gate opening distance in the OCC state used for our conformational change enhanced sampling was comparable to that of the OF state (i.e., full closure of the gate), see Figure S2b and the grey bars therein. Therefore, we would schematically classify the OCC state as lying at the center of the diagram in Figure 1a. Furthermore, it is largely stable over triplicates of 1 μs-long unbiased MD, where in 2/3 replicates the gates remain stable, and in the remaining replicate there is partial opening of the intracellular gate (as shown in Figure 2 b/c under the “apo standard” condition). We comment on this in the main text by saying that “The intracellular gate, by contrast, is more flexible than the extracellular gate even in the apo, standard protonation state”, and link it to the lower barrier for transition to IF than to OF. We did this by saying that “As for the OCC↔OF transitions, these results explain the behaviour we had previously observed in the unbiased MD of Figure 2c.” We acknowledge this was not sufficiently clear and have added details to the latter sentence to help clarify better the nature of the occluded state.

      The pKa calculations and their interpretation are a bit unclear. Firstly, it is unclear whether they are using all the data in the calculations of the histograms, or just selected data and if so on what basis was this selection done. Secondly, they dismiss the pKa calculations of E53 in the outward-facing form as not being affected by peptide binding but say that E56 is when there seems to be a similar change in profile in the histograms.

      In our manuscript, we have provided two distinct analyses of the raw CpHMD data. Firstly, we analysed the data by the replicates in which our simulations were conducted (Figure 6, shown as bar plots with mean from triplicates +/- standard deviation), where we found that only the effect on E56 protonation was distinct as lying beyond the combined error bars. This analysis uses the full amount of sampling conducted for each replicate. However, since we found that the range of pKa values estimated from 10ns/window chunks was larger than the error bars obtained from the replicate analysis (Figures S17 and S18), we sought to verify our conclusion by pooling all chunk estimates and plotting histograms (Figure S19). We recover from those the effect of substrate binding on the E56 protonation state on both the OF and OCC states. However, as the reviewer has pointed out (something we did not discuss in our original manuscript), there is a shift in the pKa of E53 of the OF state only. In fact, the trend is also apparent in the replicate-based analysis of Figure 6, though here the larger error bars overlap. In our revision, we added more details of these analyses for clarity (including more detailed figure captions regarding the data used in Figure 6) as well as a discussion of the partial effect on the E53 pKa value. 
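
The two analyses described above (a pKa per replicate, then pooling of per-chunk estimates) both rest on fitting a titration curve to protonation fractions. A minimal sketch follows, assuming a linearised Hill/Henderson-Hasselbalch model, log10(f/(1−f)) = n·(pH − pKa), fitted by ordinary least squares. The function names and input layout are hypothetical; this is not the authors' analysis script, and the manuscript may use a different fitting procedure.

```python
import math
import statistics


def fit_pka(ph_values, deprot_fractions):
    """Estimate (pKa, Hill coefficient n) from deprotonated fractions per pH.

    Uses the linearised form log10(f / (1 - f)) = n * (pH - pKa).
    pH windows where f is exactly 0 or 1 are skipped (logit undefined).
    """
    xs, ys = [], []
    for ph, f in zip(ph_values, deprot_fractions):
        if 0.0 < f < 1.0:
            xs.append(ph)
            ys.append(math.log10(f / (1.0 - f)))
    if len(xs) < 2:
        raise ValueError("need at least two pH windows with 0 < f < 1")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or sxy == 0.0:
        raise ValueError("titration data do not define a slope")
    n = sxy / sxx               # Hill coefficient (slope)
    pka = mean_x - mean_y / n   # pH at which the fitted line crosses zero
    return pka, n


def summarise(pka_estimates):
    """Mean and sample standard deviation over replicates or chunks."""
    return statistics.fmean(pka_estimates), statistics.stdev(pka_estimates)
```

Applied once per replicate, `summarise` yields the mean ± standard deviation style of comparison; applied to every short chunk, the list of estimates can instead be histogrammed to show the spread that the replicate-level error bars hide.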

      We do not believe, however, that our key conclusions are negatively affected. If anything, a further effect on the E53 pKa which we had not previously commented on (since we saw the evidence as weaker, pertaining to only one conformational state) would strengthen the case for an involvement of the ExxER motif in ligand coupling.

      Reviewer #3 (Public Review):

      Summary: 

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the most well-studied mammalian POT transporters that provides a good model with enough insight and structural information to be studied computationally using advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions.

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family. 

      Some of the key results include:

      (1) Protonation of H87 and D342 initiates the occluded (Occ) to the outward-facing (OF) state transition.

      (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down. 

      (3) E622 is not only essential for peptide recognition but also its protonation facilitates substrate release and contributes to the intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity as expected from simulations. 

      Strengths: 

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD, and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses: 

      (1) Some of the statements made in the manuscript are not convincing and do not abide by the standards that are mostly followed in the manuscript. For instance, on page 4, it is stated that "the K64-D317 interaction is formed in only ≈ 70% of MD frames and therefore is unlikely to contribute much to extracellular gate stability." I do not agree that 70% is negligible. Particularly, Figure S3 does not include the time series so it is not clear whether the 30% of the time where the salt bridge is broken is in the beginning or the end of simulations. For instance, it is likely that the salt bridge is not initially present and then it forms very strongly. Of course, this is just one possible scenario but the point is that Figure S3 does not rule out the possibility of a significant role for the K64-D317 salt bridge. 

      The reviewer is right to point out that the statement and Figure S3 as they were do not adequately support our decision to exclude the K64-D317 salt-bridge in our further investigations. The violin plot shown in Figure S3, visualised as pooled data from unbiased 1 μs triplicates, did indeed not rule out a scenario where the salt bridge only formed late in our simulations (or only in some replicates), but then is stable. Therefore, in our revision, we include the appropriate time-series of the salt bridge distances, showing how K64-D317 is initially stable but then falls apart in replicate 1, and is transiently formed and disengaged across the trajectories in replicates 2 and 3. We have also remade the data for this plot as we discovered a bug in the relevant analysis script that meant the D170-K642 distance was not calculated accurately. The results are however almost identical, and our conclusions remain.
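
      The difference between a pooled occupancy and a time-resolved one can be illustrated with a minimal sketch. It uses synthetic distance traces, not the simulation data, and the 4.0 Å cutoff, the function names, and the window count are assumptions chosen only for illustration. Two traces with the same pooled occupancy of 70% behave very differently over time:

```python
import numpy as np

def occupancy(distances, cutoff=4.0):
    """Fraction of frames with the distance at or below the cutoff (assumed 4.0 A)."""
    d = np.asarray(distances, dtype=float)
    return float(np.mean(d <= cutoff))

def windowed_occupancy(distances, n_windows=4, cutoff=4.0):
    """Occupancy in consecutive, roughly equal time windows of one trajectory."""
    d = np.asarray(distances, dtype=float)
    return [occupancy(w, cutoff) for w in np.array_split(d, n_windows)]

# Synthetic traces (hypothetical values): 3 A = formed, 8 A = broken.
formed, broken = 3.0, 8.0
late_forming = np.array([broken] * 30 + [formed] * 70)      # forms late, then stays
flickering = np.array(([formed] * 7 + [broken] * 3) * 10)   # forms and breaks repeatedly

for name, trace in [("late_forming", late_forming), ("flickering", flickering)]:
    print(name, occupancy(trace), windowed_occupancy(trace))
```

      A pooled histogram or violin plot cannot separate these two cases, whereas the per-window (or full time-series) view can.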

      (2) Similarly, on page 4, it is stated that "whether by protonation or mutation - the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." I do not agree with this assessment. The authors need to be aware of the limitations of this approach. Consider "WT H87-prot" and "D342A H87-prot": when D342 residue is mutated, in one out of 3 simulations, we see the opening of the gate within 1 us. When D342 residue is not mutated we do not see the opening in any of the 3 simulations within 1 us. It is quite likely that if rather than 3 we have 10 simulations or rather than 1 us we have 10 us simulations, the 0/3 to 1/3 changes significantly. I do not find this argument and conclusion compelling at all.

      If the conclusions were based on that alone, then we would agree.  However, this section of work covers merely the observations of the initial unbiased simulations which we go on to test/explore with enhanced sampling in the rest of the paper, and which then lead us to the eventual conclusions.

      Figure S5 shows the results from triplicate 1 μs-long trajectories as violin-plot histograms of the extracellular gate opening distance, also indicating the first and final frames of the trajectories as connected by an arrow for orientation – a format we chose for intuitively comparing 48 trajectories in one plot. The reviewer reads the plot correctly when they analyse the “WT H87-prot” vs “D342A H87-prot” conditions. In the former case, no spontaneous opening in unbiased MD is taking place, whereas when D342 is mutated to alanine in addition to H87 protonation, we see spontaneous transition in 1 out of 3 replicates. However, the reviewer does not seem to interpret the statement in question in our paper (“the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed”) in the way we intended it to be understood. We merely want to note here a correlation in the unbiased dataset we collected at this stage, and indeed the one spontaneous opening in the case comparison picked out by the reviewer is in the condition where both the H87 interaction network and D342-R206 are perturbed. In noting this we do not intend to make statistically significant statements from the limited dataset. Instead, we write that “these simulations show a large amount of stochasticity and drawing clean conclusions from the data is difficult”. We do however stand by our assessment that from this limited data we can “already appreciate a possible mechanism where protons move down the transporter pore” – a hypothesis we investigate more rigorously with enhanced sampling in the rest of the paper. We have revised the section in question to make clearer that the unbiased MD is only meant to give an initial hypothesis here to be investigated in more detail in the following sections. In doing so, we also incorporate, as we had not done before, the case (not picked out by the reviewer here but concerning the same figure) of S321A & H87 prot. In the third replicate, this shows partial gate opening towards the end of the unbiased trajectory (despite D342 not being affected), highlighting further the stochastic nature that makes even clear correlative conclusions difficult to draw.
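
      How little statistical weight a 1/3 versus 0/3 comparison carries can be made concrete with a small sketch. It applies a two-sided Fisher exact test to the opening counts quoted above; the test is our illustrative choice and is not an analysis from the manuscript, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Rows: condition; columns: (gate opened, gate did not open).
p_one_of_three = fisher_exact_two_sided(1, 2, 0, 3)    # "D342A H87-prot" vs "WT H87-prot"
p_three_of_three = fisher_exact_two_sided(3, 0, 0, 3)  # hypothetical best case
print(p_one_of_three, p_three_of_three)
```

      Even the hypothetical best case of 3/3 versus 0/3 openings does not reach conventional significance with triplicates, which is consistent with treating the unbiased runs as hypothesis-generating only.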

      (3) While the MEMENTO methodology is novel and interesting, the method is presented as flawless in the manuscript, which is not true at all. It is stated on Page 5 with regards to the path generated by MEMENTO that "These paths are then by definition non-hysteretic." I think this is too big of a claim to say the paths generated by MEMENTO are non-hysteretic by definition. This claim is not even mentioned in the original MEMENTO paper. What is mentioned is that linear interpolation generates a hysteresis-free path by definition. There are two important problems here: (a) MEMENTO uses the linear interpolation as an initial step but modifies the intermediates significantly later so they are no longer linearly interpolated structures and thus the path is no longer hysteresis-free; (b) a more serious problem is the attribution of by-definition hysteresis-free features to the linearly interpolated states. This is based on conflating the hysteresis-free and unique concepts. The hysteresis in MD-based enhanced sampling is related to the presence of barriers in orthogonal space. For instance, one may use a non-linear interpolation of any type and get a unique pathway, which could be substantially different from the one coming from the linear interpolation. None of these paths will be hysteresis-free necessarily once subjected to MD-based enhanced sampling techniques.

      We certainly do not intend to claim that the MEMENTO method is flawless. The concern the reviewer raises around the statement "These paths are then by definition non-hysteretic" is perhaps best addressed by a clarification of the language used and considering how MEMENTO is applied in this work. 

      Hysteresis in the most general sense denotes the dependence of a system on its history, or – more specifically – the lagging behind of the system state with regards to some physical driver (for example the external field in magnetism, whence the term originates). In the context of biased MD and enhanced sampling, hysteresis commonly denotes the phenomenon where a path created by a biased dynamics method along a certain collective variable lags behind in phase space in slow orthogonal degrees of freedom (see Figure 1 in Lichtinger and Biggin 2023, https://doi.org/10.1021/acs.jctc.3c00140). When used to generate free energy profiles, this can manifest as starting state bias, where the conformational state that was used to seed the biased dynamics appears lower in free energy than alternative states. Figure S6 shows this effect on the PepT2 system for both steered MD (heavy atom RMSD CV) + umbrella sampling (tip CV) and metadynamics (tip CV). There is, in essence, a coupled problem: without an appropriate CV (which we did not have to start with here), path generation that is required for enhanced sampling displays hysteresis, but the refinement of CVs is only feasible when paths connecting the true phase space basins of the two conformations are available. MEMENTO helps solve this issue by reconstructing protein conformations along morphing paths which perform much better than steered MD paths with respect to giving consistent free energy profiles (see Figure S7 and the validation cases in the MEMENTO paper), even if the same CV is used in umbrella sampling. 

      There are still differences between replicates in those PMFs, indicating slow conformational flexibility propagated from end-state sampling through MEMENTO. We use this to refine the CVs further with dimensionality reduction (see the Methods section and Figure S8), before moving to 2D-umbrella sampling (Figure 3). Here, we think, the reviewer’s point seems to bear. The MEMENTO paths are ‘non-hysteretic by definition’ with respect to given end states in the sense that they connect (by definition) the correct conformations at both end-states (unlike steered MD), which in enhanced sampling manifests as the absence of the strong starting-state bias we had previously observed (Figure S7 vs S6). They are not, however, hysteresis-free with regards to how representative of the end-state conformational flexibility the structures given to MEMENTO really were, which is where the iterative CV design and combination of several MEMENTO paths in 2D-PMFs comes in.

      We also cannot make a direct claim about whether in the transition region the MEMENTO paths might be separated from the true (lower free energy) transition paths by slow orthogonal degrees of freedom, which may conceivably result in overestimated barrier heights separating two free energy basins. We cannot guarantee that this is not the case, but neither in our MEMENTO validation examples nor in this work have we encountered any indications of a problem here.

      We hope that the reviewer will be satisfied by our revision, where we replace the wording in question by a statement that the MEMENTO paths do not suffer from hysteresis that is otherwise incurred as a consequence of not reaching the correct target state in the biased run (in some orthogonal degrees of freedom).

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors): 

      Figure S1: it would be useful to label the panels.

      We have now done this.

      At the bottom of page 4, it is written that "the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." But it is hard to interpret that from the figure.  

      See also our response to reviewer #3. We have revised the wording of this statement, and also highlight in Figure S5 the crucial runs we are referring to, in order to make them easier to discern.

      At the bottom of page 5, and top of page 6, there is a lot of "other" information shown, which is inserted for the record - this is a bit glossed over and hard to follow.

      The “other” information refers to further conditions we had calculated PMFs for and that gave some insight, but which were secondary for drawing our key conclusions. We thank the reviewer for their feedback that this section needs clarification. We have revised this paragraph to make it easier to follow and highlight better the conclusions we draw from the data.

      In Figure 7 it looks as though the asterisks have shifted.

      We are indebted to the reviewer for spotting this error: the asterisks are indeed shifted one bar to the right of their intended position. The revised version fixes this issue.

      Reviewer #3 (Recommendations For The Authors):

      Minor points: In Figure 1a, The 7PMY label and arrow are slightly misplaced.

      Figure 1a is a schematic diagram to show the available structures of PepT2 homologues (see also the response to reviewer #2 above). The 7PMY label placement is intentional to indicate a partially occluded inwards-facing state. As we write in the figure caption: “Intermediate positions between states indicate partial gate opening”.

    3. Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 is closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well designed and conducted. But I do have two concerns.

      (a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the others describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342- ↔ OCC/H87HD342H ↔ OF/H87HD342H as displayed in Figure 3. Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC↔OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      (b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not take up/release proton(s).
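
      The remark in point (1a) that OF states with different charge states are not equally stable can be made explicit with the textbook relation for a single titratable site. This is a generic sketch rather than a formula from the manuscript; the symbol names are ours, and the pKa is understood as that of the site in the given conformational state:

```latex
\begin{equation}
  \Delta G_{\mathrm{prot}}(\mathrm{pH})
    = G_{\mathrm{protonated}} - G_{\mathrm{deprotonated}}
    = -RT \ln(10)\,\bigl(\mathrm{p}K_a - \mathrm{pH}\bigr)
\end{equation}
```

      Assuming a temperature of 310 K, RT ln(10) is about 1.4 kcal/mol, so each pH unit separating the pKa from the ambient pH shifts the relative free energy of the two charge states by roughly that amount. PMFs computed separately for different charge states therefore share no common zero unless such an offset is added.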

      (2) I have more serious concerns about the CpHMD employed in the study.

      (a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitation of their method.

      (b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo-simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion-pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation.

    4. Reviewer #2 (Public Review):

      Summary:

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). They examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions by mutating two of the residues that they predict to be critical for transport in cell-based transport assays. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ.

      Strengths:

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1 ms of sampling. It gives insight into the transport mechanism, investigating how protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data.

      Weaknesses:

      Interestingly, they suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing (clear from Figure 8) but as the authors comment they have not identified this residue(s).

    5. Reviewer #3 (Public Review):

      Summary:

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the most well-studied mammalian POT transporters that provides a good model with enough insight and structural information to be studied computationally using advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions.

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family.

      Some of the key results include (1) Protonation of H87 and D342 initiates the occluded (Occ) to the outward-facing (OF) state transition; (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down; (3) E622 is not only essential for peptide recognition but also its protonation facilitates substrate release and contributes to the intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity as expected from simulations.

      Strengths:

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD, and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses:

      There are inherent limitations with the methodology used such as the MEMENTO and constant pH MD that have been briefly noted in the manuscript.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript by Su et al., the authors present a massively parallel reporter assay (MPRA) measuring the stability of in vitro transcribed mRNAs carrying wild-type or mutant 5' or 3' UTRs transfected into two different human cell lines. The goal presented at the beginning of the manuscript was to screen for effects of disease-associated point mutations on the stability of the reporter RNAs carrying partial human 5' or 3' UTRs. However, the majority of the manuscript is dedicated to identifying sequence components underlying the differential stability of reporter constructs. This shows that TA dinucleotides are the most predictive feature of RNA stability in both cell lines and both UTRs.

      The effect of AU-rich elements (AREs) on RNA stability is well established in multiple systems, and the present study confirms this general trend but points out variability in the consequence of seemingly similar motifs on RNA stability. For example, the authors report that a long stretch of Us has extreme opposite effects on RNA stability depending on whether it is preceded by an A (strongly destabilizing) or followed by an A (strongly stabilizing). While the authors' interpretation of a context-dependence of the effect is certainly well-founded, it seems counterintuitive that the preceding or following A would be the (only) determining factor. This points to a generally reductionist approach taken by the authors in the analysis of the data and in their attempt to dissect the contribution of "AU rich sequences" to RNA stability, with a general tendency to reduce the size and complexity of the features (e.g. to dinucleotides). While this certainly increases the statistical power of the analysis due to the number of occurrences of these motifs, it limits the interpretability of the results. How do TA dinucleotides per se contribute to destabilizing the RNA, both in 5' and 3' UTRs, but (according to limited data presented) not in coding sequences? What is the mechanism? RBPs binding to TA dinucleotide containing sequences are suggested to "mask" the destabilizing effect, thereby leading to a more stable RNA. Gain of TA dinucleotides is reported to have a destabilizing effect, but again no hypothesis is provided as to the underlying molecular mechanism. In addition to reducing the motif length to dinucleotides, the notion of "context dependence" is used in a very narrow sense; especially when focusing on simple and short motifs, a more extensive analysis of the interdependence of these features (beyond the existing analysis of the relationship between TA-diNTs and GC content) could potentially reveal more of the context dependence underlying the seemingly opposite behavior of very similar motifs.

      The contribution of coding region sequence to RNA stability has been extensively discussed (For example: doi.org/10.1016/j.molcel.2022.03.032; doi.org/10.1186/s13059-020-02251-5; doi.org/10.15252/embr.201948220; doi.org/10.1371/journal.pone.0228730; doi.org/10.7554/eLife.45396). While TA content at the third codon position (wobble position) has been implicated as a pro-degradation signal, codon optimality has emerged as the most prominent determinant for RNA stability. This indicates that the role of coding regions in RNA stability differs from that of UTRs due to the involvement of translation elongation. We did not intend to suggest that TA-dinucleotides in UTRs and coding regions have the same effect.

      We hypothesize that TA dinucleotides may recruit endonucleases of the RNase A family, whose catalytic pockets exhibit a strong bias for the TA dinucleotide (doi.org/10.1016/j.febslet.2010.04.018). Structures or protein binding that blocks this recognition might stabilize RNAs. To gain further insight into the motif interactions, we plan to investigate the interactions between TA and the other 15 dinucleotides through more detailed analyses.

      The present MPRA measures the effect of UTR sequences in one specific reporter context and using one experimental approach (following the decay of in vitro transcribed and transfected RNAs). While this approach certainly has its merits compared to other approaches, it also comes with some caveats: RNA is delivered naked, without bound RBPs and no nuclear history, e.g. of splicing (no EJCs), editing and modifications. One way to assess the generalizability of the results as well as the context dependence of the effects is to perform the same analysis on existing datasets of RNA stability measurements obtained through other methods (e.g. transcription inhibition). Are TA dinucleotides universally the most predictive feature of RNA half-lives?

      Our system studies the stability control of RNA synthesized in vitro and delivered into human cells. While we did not intend to generalize our conclusions to endogenous RNAs, our approach contributes to the understanding of in vitro synthesized RNA used for cellular expression, such as in vaccines. It is known that endogenous RNAs undergo very different regulation. The most prominent factors controlling endogenous RNA stability are the density of splice junctions and the length of UTRs (doi.org/10.1186/s13059-022-02811-x; doi.org/10.1186/s12915-021-00949-x). To decipher the sequence regulation, these factors are controlled in our experiments. Therefore we do not expect the dinucleotide features found by our approach to be generalized as the most predictive feature of RNA half-life in vivo.

      The authors conclude their study with a meta-analysis of genes with increased TA dinucleotides in 5' and 3'UTRs, showing that specific functional groups are overrepresented among these genes. In addition, they provide evidence for an effect of disease-associated UTR mutations on endogenous RNA stability. While these elements link back to the original motivation of the study (screening for effects of point mutations in 5' and 3' UTRs), they provide only a limited amount of additional insights.

      We utilized the Taiwan Biobank to investigate whether mutations significantly affecting RNA stability also impact human biochemical measurements. Our findings indicate that these mutations indeed have a significant effect on various biochemical indices. This highlights the importance of our study, as it bridges basic science with potential applications in precision medicine. By linking specific UTR mutations with measurable changes in biochemical indices, our research underscores the potential for these findings to inform targeted medical interventions in the future.

      In summary, this manuscript presents an interesting addition to the long-standing attempts at dissecting the sequence basis of RNA stability in human cells. The analysis is in general very comprehensive and sound; however, at times the goal of the authors to find novelty and specificity in the data overshadows some analyses. One example is the case where the authors try to show that TA-dinucleotides and GC content are decoupled and not merely two sides of the same coin. They claim that the effect of TA dinucleotides is different between high- and low-GC content contexts but do not control for the fact that low GC-content regions naturally will contain more TA dinucleotides and therefore the effect sizes and the resulting correlation between TA-diNT rate and stability will be stronger (Fig. 5A). A more thorough analysis and greater caution in some of the claims could further improve the credibility of the conclusions.

      Low GC content implies a higher TA content but does not directly equate to a high TA-diNT rate. For instance, the sequence ATTGAACCTT has a lower GC content (0.3) compared to TATAGGCCGC (0.6), yet it also has a lower TA-diNT rate (0 vs. 0.22). To address this concern more rigorously, we performed a stratified analysis based on TA-diNT rate. As shown in our Fig. S7C, even after stratifying by TA-diNT rate (upper panel high TA-diNT rate / lower panel low TA-diNT rate), we still observe that the destabilizing effect of TA is stronger in the low GC content group.
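
      The two example sequences above can be checked with a short sketch. The function names are hypothetical, and the TA-diNT rate is assumed here to be the number of TA dinucleotides divided by the number of overlapping dinucleotide positions (sequence length minus one), which is the definition that reproduces the quoted value of 0.22:

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def ta_dint_rate(seq):
    """Fraction of overlapping dinucleotide positions that read 'TA' (assumed definition)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return pairs.count("TA") / len(pairs)

for s in ("ATTGAACCTT", "TATAGGCCGC"):
    print(s, gc_content(s), round(ta_dint_rate(s), 2))
```

      The first sequence has the lower GC content and contains no TA dinucleotide at all, so the two quantities can indeed move independently.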

      Reviewer #2 (Public Review):

      Summary of goals:

      Untranslated regions are key cis-regulatory elements that control mRNA stability, translation, and translocation. Through interactions with small RNAs and RNA binding proteins, UTRs form complex transcriptional circuitry that allows cells to fine-tune gene expression. Functional annotation of UTR variants has been very limited, and improvements could offer insights into disease relevant regulatory mechanisms. The goals were to advance our understanding of the determinants of UTR regulatory elements and characterize the effects of a set of "disease-relevant" UTR variants.

      Strengths:

      The use of a massively parallel reporter assay allowed for analysis of a substantial set (6,555 pairs) of 5' and 3' UTR fragments compiled from known disease associated variants. Two cell types were used.

      The findings confirm previous work about the importance of AREs, which helps show validity and adds some detailed comparisons of specific AU-rich motif effects in these two cell types.

      Using a Lasso regression, TA-dinucleotide content is identified as a strong regulator of RNA stability in a context dependent manner based on GC content and presence of RNA binding protein binding motifs. The findings have potential importance, drawing attention to a UTR feature that is not well characterized.

      The use of complementary datasets, including from half-life analyses of RNAs and from random sequence library MPRAs, is a useful addition and supports several important findings. The finding that TA dinucleotides have explanatory power separate from (and in some cases interacting with) GC content is valuable.

      The functional enrichment analysis suggests some new ideas about how UTRs may contribute to regulation of certain classes of genes.

      Weaknesses:

      It is difficult to understand how the calculations for half-life were performed. The sequencing approach measures the relative frequency of each sequence at each time point (less stable sequences become relatively less frequent after time 0, whereas more stable sequences become relatively more frequent after time 0). Since there is no discussion of whether the abundance of the transfected RNA population is referenced to some external standard (e.g., housekeeping RNAs), it is not clear how absolute (rather than relative) half-lives were determined.

      We estimated the decay constant λ and half-life (t1/2) by the following equations:

      $$\ln\frac{C_i(t)}{C_i(t=0)} = -\lambda t, \qquad t_{1/2} = \frac{\ln 2}{\lambda}$$

      where Ci(t) and Ci(t=0) are read count values of the ith replicate at time points t and t = 0 (see also Methods). The absolute abundance was not required for the half-life calculation.

      Fig. S1A and B are used to assess reproducibility. They show that read counts at a given time point correlate well across replicate experiments. However, this is not a good way to assess how reproducible or accurate the measurements of t1/2 are. (The major source of variability in read counts in these plots - especially at early time points - is likely the starting abundance of each RNA sequence, not stability.) This creates concerns about how well the method is measuring t1/2. Also creating concern is the observation that many RNAs are associated with half-lives that are much longer than the time points analyzed in the study. For example, if Figure S1 and Table S1 are read correctly, the median t1/2 for the 5' UTR library in HEK cells appears to be >700 minutes. Given that RNA was collected at 30, 75, and 120 minutes, accurate measurements of RNAs with such long half-lives would seem to be very difficult.

      We estimated the half-life based on the following equations:

      $$\ln\frac{C_i(t)}{C_i(t=0)} = -\lambda t, \qquad t_{1/2} = \frac{\ln 2}{\lambda}$$

      where Ci(t) and Ci(t=0) are read count values of the ith replicate at time points t and t = 0 (see also Methods). The calculation of the half-life involves first determining the decay constant λ, which represents a constant rate of decay. Since λ is a constant, it is possible to accurately calculate it without needing data over the entire decay range. Our experimental design considers this by selecting appropriate time points to ensure a reliable estimation of λ, and thus, the half-life. To determine the most suitable time points, we conducted preliminary experiments using RT-PCR. These experiments indicated that 30, 75, and 120 minutes provided an effective range for capturing the decay dynamics of the transcripts.

      There is no direct comparison of t1/2 between the two cell types studied for the full set of sequences studied. This would be helpful in understanding whether the regulatory effects of UTRs are generally similar across cell lines (as has been shown in some previous studies) or whether there are fundamental differences. The distribution of t1/2's is clearly quite different in the two cell lines, but it is important to know if this reflects generally slow RNA turnover in HEK cells or whether there are a large number of sequence-specific effects on stability between cell lines. A related issue is that it is not clear whether the relatively small number of significant variant effects detected in HEK cells versus SH-SY5Y cells is attributable to real biological differences between cell types or to technical issues (many fewer read counts and much longer half lives in HEK cells).

      For both cell lines, we selected oligonucleotides with R² > 0.5 and mean squared error (MSE) < 1 for analysis when estimating the decay constant (λ), and hence the half-life, by linear regression. This selection criterion was implemented to minimize the effect of experimental noise. Additionally, we will further analyze the MSE distribution to determine if the two cell lines exhibit significantly different levels of experimental noise. We will also provide a direct comparison of half-lives between the two cell lines to assess the similarity in stability regulation.
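The fitting and filtering described in this response can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `fit_half_life` and `passes_filter`, the free intercept in the regression, and the synthetic read counts in the usage note are all hypothetical. Only the first-order decay model and the R² > 0.5 and MSE < 1 thresholds come from the text.

```python
import numpy as np

def fit_half_life(times_min, counts):
    """Fit ln C(t) = ln C(0) - lambda * t by ordinary least squares.

    times_min: sampling times in minutes.
    counts: read counts for one oligonucleotide at those times (all > 0).
    """
    t = np.asarray(times_min, dtype=float)
    y = np.log(np.asarray(counts, dtype=float))
    slope, intercept = np.polyfit(t, y, 1)      # straight line in log space
    resid = y - (slope * t + intercept)
    mse = float(np.mean(resid ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - float(np.sum(resid ** 2)) / ss_tot if ss_tot > 0 else 0.0
    lam = -slope                                 # decay constant per minute
    half_life = np.log(2) / lam if lam > 0 else float("inf")
    return {"lambda": lam, "half_life": half_life, "r2": r2, "mse": mse}

def passes_filter(fit, min_r2=0.5, max_mse=1.0):
    """Keep only fits that meet the thresholds quoted in the response."""
    return fit["r2"] > min_r2 and fit["mse"] < max_mse
```

For instance, calling `fit_half_life([0, 30, 75, 120], counts)` on counts generated from a known decay constant should recover that constant. Note that when counts are relative frequencies rather than absolute abundances, the fitted λ is relative to the pool average, which is the concern the reviewer raises.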

      The general assertion is made in many places that TA dinucleotides are the most prominent destabilizing element in UTRs (e.g., in the title, the abstract, Fig. 4 legend, and on p. 12). This appears to be true for only one of the two cell lines tested based on Fig. 3.

      TA-dinucleotides and other TA-rich sequences exhibit similar effects on RNA stability, as illustrated in Fig. S5A-C. In both cell lines, TA-dinucleotide and WWWWWW sequences were representatives of the same stability-affecting cluster. While the impact of TA-dinucleotides can be generalized, we will rephrase some statements to avoid any potential misunderstanding.

      Appraisal and impact:

      The work adds to existing studies that previously identified sequence features, including AREs and other RNA binding protein motifs, that regulate stability and puts a new emphasis on the role of "TA" (better "UA") dinucleotides. It is not clear how potential problems with the RNA stability measurements discussed above might influence the overall conclusions, which may limit the impact unless these can be addressed.

      It is difficult to understand whether the importance of TA dinucleotides is best explained by their occurrence in a related set of longer RBP binding motifs (see Fig 5J, these motifs may be encompassed by the "WWWWWW cluster") or whether some other explanation applies. Further discussion of this would be helpful. Does the LASSO method tend to collapse a more diverse set of longer motifs that are each relatively rare compared to the dinucleotide? It remains unclear whether TA dinucleotides are associated with less stability independent of the presence of the known larger WWWWWWW motif. As noted above, the importance of TA dinucleotides in the HEK experiments appears to be less than is implied in the text.

      To ensure the representativeness of the features entered into the LASSO model, we pre-selected those with an occurrence greater than 10% among all UTRs. There is no evidence to support a preference for dinucleotides by LASSO. To address whether the destabilizing effect of TA dinucleotides is part of the broader WWWWWW motif, we will divide TA dinucleotides into two groups: those within the WWWWWW motif and those outside of it. We will then examine whether TA dinucleotides in these two groups exhibit the same destabilizing effect.
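The proposed split of TA dinucleotides into those inside and outside the WWWWWW motif could be sketched as below. This is an illustrative assumption about how such a partition might be computed, not the authors' procedure: the function name `split_ta_counts` is hypothetical, W is taken to mean A or T (U is converted to T), and a TA is counted as "inside" when both of its bases lie in a run of at least six consecutive W bases.

```python
import re

def split_ta_counts(seq, motif_len=6):
    """Return (inside, outside) counts of TA dinucleotides.

    "inside" means both bases of the TA fall within a stretch of at least
    `motif_len` consecutive W (A/T) bases; everything else is "outside".
    """
    seq = seq.upper().replace("U", "T")
    covered = [False] * len(seq)
    # A zero-width lookahead finds every (possibly overlapping) W-window start.
    for m in re.finditer(r"(?=[AT]{%d})" % motif_len, seq):
        for i in range(m.start(), m.start() + motif_len):
            covered[i] = True
    inside = outside = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "TA":
            if covered[i] and covered[i + 1]:
                inside += 1
            else:
                outside += 1
    return inside, outside
```

The two counts could then be entered as separate features in the regression to test whether TA dinucleotides outside W-rich stretches retain a destabilizing effect.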

      The inclusion of more than a single cell type is an acknowledgement of the importance of evaluating cell type-specific effects. The work suggests a number of cell type-specific differences, but due to technical issues (especially with the HEK data, as outlined above) and the use of only two cell lines, it is difficult to understand cell type effects from the work.

      The inclusion of both 3' and 5' UTR sequences distinguishes this work from most prior studies in the field. Contrasting the effects of these regions on stability is of interest, although the role of these UTRs (especially the 5' UTR) in translational regulation is not assessed here.

      We examined the role of UTR and UTR variants in translation regulation using polysome profiling. By both univariate analysis and an elastic regression model, we identified motifs of short repeated sequences, including SRSF2 binding sites, as mutation hotspots that lead to aberrant translation. Furthermore, these polysome-shifting mutations had a considerable impact on RNA secondary structures, particularly in upstream AUG-containing 5’ UTRs. Integrating these features, our model achieved high accuracy (AUROC > 0.8) in predicting polysome-shifting mutations in the test dataset. Additionally, metagene analysis indicated that pathogenic variants were enriched at the upstream open reading frame (uORF) translation start site, suggesting changes in uORF usage underlie the translation deficiencies caused by these mutations. Illustrating this, we demonstrated that a pathogenic mutation in the IRF6 5’ UTR suppresses translation of the primary open reading frame by creating a uORF. Remarkably, site-directed ADAR editing of the mutant mRNA rescued this translation deficiency. Because the regulation of translation and stability does not converge, we illustrate these two mechanisms in two separate manuscripts (this one and doi.org/10.1101/2024.04.11.589132).

      Reviewer #3 (Public Review):

      Summary:

      In their manuscript titled "Multiplexed Assays of Human Disease‐relevant Mutations Reveal UTR Dinucleotide Composition as a Major Determinant of RNA Stability" the authors aim to investigate the effect of sequence variations in 3'UTR and 5'UTRs on the stability of mRNAs in two different human cell lines.

      To do so, the authors use a massively parallel reporter assay (MPRA). They transfect cells with a set of mRNA reporters that contain sequence variants in their 3' or 5' UTRs, which were previously reported in human diseases. They follow their clearance from cells over time relative to the matching non-variant sequence. To analyze their results, they define a set of factors (RBP and miRNA binding sites, sequence features, secondary structure etc.) and test their association with differences in mRNA stability. For features with a significant association, they use clustering to select a subset of factors for LASSO regression and identify factors that affect mRNA stability.

      They conclude that the TA dinucleotide content of UTRs is the strongest destabilizing sequence feature. Within that context, elevated GC content and protein binding can protect susceptible mRNAs from degradation. They also show that TA dinucleotide content of UTRs affects native mRNA stability, and that it is associated with specific functional groups. Finally, they link disease associated sequence variants with differences in mRNA stability of reporters.

      Strengths:

      (1) This work introduces a different MPRA approach to analyze the effect of genetic variants. While previous works in tissue culture use DNA transfections that require normalization for transcription efficiency, here the mRNA is directly introduced into cells at fixed amounts, allowing a more direct view of the mRNA regulation.

      (2) The authors also introduce a unique analysis approach, which takes into account multiple factors that might affect mRNA stability. This approach allows them to identify general sequence features that affect mRNA stability beyond specific genetic variants, and reach important insights on mRNA stability regulation. Indeed, while the conclusions regarding the genetic variants identified in this work are interesting, the main strength of the work involves the general effect of sequence features rather than specific variants.

      (3) The authors provide adequate support for their claims, and validate their analysis using both their reporter data and native genes. For the main feature identified, TA di-nucleotides, they perform follow-up experiments with modified reporters that further strengthen their claims, and also validate the effect on native cellular transcripts (beyond reporters), demonstrating its validity also within native scenarios.

      (4) The work provides a broad analysis of mRNA stability, across two mRNA regulatory segments (3'UTR and 5'UTR) and is performed in two separate cell-types. Comparison between two different cell-types is adequate, and the results demonstrate, as expected, the dependence of mRNA stability on the cellular context. Analysis of 3'UTR and 5'UTR regulatory effects also shows interesting differences and similarities between these two regulatory regions.

      Weaknesses:

      (1) The authors fail to acknowledge several possible confounding factors of their MPRA approach in the discussion.

      First, while transfection of mRNA directly into cells avoids the need to normalize for differences in transcription, naked mRNA molecules differ from native cellular mRNAs, and their introduction could create biases due to differences in mRNA modifications, protein associations etc. that may occur co-transcriptionally.

      Second, along those lines, the authors also use in-vitro polyadenylation. The length of the polyA tail of the transfected transcripts could potentially be very different than that of native mRNAs and also affect stability.

      The transcripts used in our study were polyadenylated in vitro with approximately 100 nucleotides (Fig. S1C), similar to the polyA tail lengths typically observed in vivo (dx.doi.org/10.1016/j.molcel.2014.02.007). Additionally, these transcripts were capped to emulate essential mRNA characteristics and to minimize immune responses in recipient cells. This design allows us to study RNA decay for in vitro-synthesized RNA delivered into human cells, akin to RNA vaccines, but it does not necessarily extend to endogenous RNAs. As mentioned, endogenous RNAs undergo nuclear processing and are decorated by numerous trans factors, resulting in distinct regulatory mechanisms. We will provide a more in-depth discussion on these differences and their implications in the revised manuscript.

      (2) The analysis approach used in this work for identifying regulatory features in UTRs was not previously used. As such, the lack of in-depth details of the methodology, and possibly also of more general validation of the approach, is a drawback in convincing the reader of the validity of this approach and its results.

      In particular, a main point that is not addressed is how the authors decide on the set of "factors" used in their analysis, as choosing different sets of factors might affect the results of the analysis.

      In our study, we employed the calculation of the Variance Inflation Factor (VIF) as a basis for selecting variables. This well-established method is widely used to detect variables with high collinearity, thus ensuring the robustness and reliability of our analysis. By identifying and excluding highly collinear variables, we aimed to minimize multicollinearity and improve the accuracy of our regression models. For more detailed information on the use of VIF in regression analysis, please refer to Akinwande, M., Dikko, H., and Samson, A. (2015). Variance Inflation Factor: As a Condition for the Inclusion of Suppressor Variable(s) in Regression Analysis. Open Journal of Statistics, 5, 754-767. doi: 10.4236/ojs.2015.57075. We will include the method details in the revised manuscript.
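A variance inflation factor screen of the kind described here can be sketched as follows. This is a generic textbook computation offered as an illustration, not the authors' implementation: the function name `variance_inflation_factors` is hypothetical, the feature matrix is assumed to have no constant columns, and the response does not state which VIF cutoff was applied, so none is hard-coded.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 is the coefficient of determination from regressing column j
    on all remaining columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + other features
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(np.sum((y - y.mean()) ** 2))
        r2 = 1.0 - ss_res / ss_tot
        vifs[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return vifs
```

Features whose VIF exceeds a chosen cutoff would be dropped before fitting the regression; the cutoff itself is a modelling choice that would need to be reported alongside the results.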

      For example, the choice to use 7-mer sequences within the factors set is not explained, particularly when almost all motifs that are eventually identified (Figure 3B-E) are shorter.

      The known RBP motifs are primarily 6-mer. To explore the possibility of discovering novel motifs that could significantly impact our model, we started with 7-mer sequences. However, our analysis revealed that including these additional variables did not improve the explanatory power of the model; instead, it reduced it. Consequently, our final model focuses on motifs shorter than 7-mer. We will explain the motif selections in the revised manuscript.

      In addition, the authors do not perform validations to demonstrate the validity of their approach on simulated data or well-established control datasets. Such analysis would be helpful to further convince the reader of the usefulness and robustness of the analysis.

      We acknowledge the importance of validating our approach on simulated data or well-established control datasets to demonstrate its robustness and reliability. However, to the best of our knowledge, there are currently no well-established control datasets available that perfectly correspond to our specific study context. Despite this, we will continue to search for any relevant datasets that could be utilized for this purpose in future work. This effort will help to further reinforce the confidence in our methodology and its findings.

      (3) The analysis and regression models built in this work are not thoroughly investigated relative to native genes within cells. The effect of sequence "factors" on native cellular transcripts' stability is not investigated beyond TA di-nucleotides, and it is unclear to what degree other predicted factors also affect native transcripts.

      Our system studies the stability control of RNA synthesized in vitro and delivered into human cells. While we validated the UTR TA-dinucleotide effect in vivo, we did not intend to conclude that this is the most influential regulation for endogenous RNAs. It is known that endogenous RNAs undergo very different regulation. The most prominent factors controlling endogenous RNA stability are the density of splice junctions and the length of UTRs (doi.org/10.1186/s13059-022-02811-x; doi.org/10.1186/s12915-021-00949-x). To decipher the sequence regulation, we controlled for these factors in our experiments. Therefore, we acknowledge that several endogenous features, which were excluded by our approach, may serve as predictive features of RNA half-life in vivo.

    2. eLife assessment

      This valuable study combines massively parallel reporter assays and regression analysis to identify sequence features in untranslated regions that contribute to mRNA stability. The strength of evidence presented is generally solid, but providing more details about how half lives are calculated and explaining some aspects of the subsequent choices made for analysis would clarify and strengthen the overall approach. Taken together, this study will be of interest to researchers broadly studying post-transcriptional gene regulation and also to scientists using massively parallel reporter assays.

    3. Reviewer #1 (Public Review):

      In the manuscript by Su et al., the authors present a massively parallel reporter assay (MPRA) measuring the stability of in vitro transcribed mRNAs carrying wild-type or mutant 5' or 3' UTRs transfected into two different human cell lines. The goal presented at the beginning of the manuscript was to screen for effects of disease-associated point mutations on the stability of the reporter RNAs carrying partial human 5' or 3' UTRs. However, the majority of the manuscript is dedicated to identifying sequence components underlying the differential stability of reporter constructs. This shows that TA dinucleotides are the most predictive feature of RNA stability in both cell lines and both UTRs.

      The effect of AU rich elements (AREs) on RNA stability is well established in multiple systems, and the present study confirms this general trend but points out variability in the consequence of seemingly similar motifs on RNA stability. For example, the authors report that a long stretch of Us has extreme opposite effects on RNA stability depending on whether it is preceded by an A (strongly destabilizing) or followed by an A (strongly stabilizing). While the authors' interpretation of a context-dependence of the effect is certainly well-founded, it seems counterintuitive that the preceding or following A would be the (only) determining factor. This points to a generally reductionist approach taken by the authors in the analysis of the data and in their attempt to dissect the contribution of "AU rich sequences" to RNA stability, with a general tendency to reduce the size and complexity of the features (e.g. to dinucleotides). While this certainly increases the statistical power of the analysis due to the number of occurrences of these motifs, it limits the interpretability of the results. How do TA dinucleotides per se contribute to destabilizing the RNA, both in 5' and 3' UTRs, but (according to limited data presented) not in coding sequences? What is the mechanism? RBPs binding to TA dinucleotide containing sequences are suggested to "mask" the destabilizing effect, thereby leading to a more stable RNA. Gain of TA dinucleotides is reported to have a destabilizing effect, but again no hypothesis is provided as to the underlying molecular mechanism. In addition to reducing the motif length to dinucleotides, the notion of "context dependence" is used in a very narrow sense; especially when focusing on simple and short motifs, a more extensive analysis of the interdependence of these features (beyond the existing analysis of the relationship between TA-diNTs and GC content) could potentially reveal more of the context dependence underlying the seemingly opposite behavior of very similar motifs.

      The present MPRA measures the effect of UTR sequences in one specific reporter context and using one experimental approach (following the decay of in vitro transcribed and transfected RNAs). While this approach certainly has its merits compared to other approaches, it also comes with some caveats: RNA is delivered naked, without bound RBPs and with no nuclear history, e.g. of splicing (no EJCs), editing and modifications. One way to assess the generalizability of the results as well as the context dependence of the effects is to perform the same analysis on existing datasets of RNA stability measurements obtained through other methods (e.g. transcription inhibition). Are TA dinucleotides universally the most predictive feature of RNA half-lives?

      The authors conclude their study with a meta-analysis of genes with increased TA dinucleotides in 5' and 3'UTRs, showing that specific functional groups are overrepresented among these genes. In addition, they provide evidence for an effect of disease-associated UTR mutations on endogenous RNA stability. While these elements link back to the original motivation of the study (screening for effects of point mutations in 5' and 3' UTRs), they provide only a limited amount of additional insights.

      In summary, this manuscript presents an interesting addition to the long-standing attempts at dissecting the sequence basis of RNA stability in human cells. The analysis is in general very comprehensive and sound; however, at times the goal of the authors to find novelty and specificity in the data overshadows some analyses. One example is the case where the authors try to show that TA-dinucleotides and GC content are decoupled and not merely two sides of the same coin. They claim that the effect of TA dinucleotides is different between high- and low-GC content contexts but do not control for the fact that low GC-content regions naturally will contain more TA dinucleotides and therefore the effect sizes and the resulting correlation between TA-diNT rate and stability will be stronger (Fig. 5A). A more thorough analysis and greater caution in some of the claims could further improve the credibility of the conclusions.

    4. Reviewer #2 (Public Review):

      Summary of goals:

      Untranslated regions are key cis-regulatory elements that control mRNA stability, translation, and translocation. Through interactions with small RNAs and RNA binding proteins, UTRs form complex transcriptional circuitry that allows cells to fine-tune gene expression. Functional annotation of UTR variants has been very limited, and improvements could offer insights into disease relevant regulatory mechanisms. The goals were to advance our understanding of the determinants of UTR regulatory elements and characterize the effects of a set of "disease-relevant" UTR variants.

      Strengths:

      The use of a massively parallel reporter assay allowed for analysis of a substantial set (6,555 pairs) of 5' and 3' UTR fragments compiled from known disease associated variants. Two cell types were used.

      The findings confirm previous work about the importance of AREs, which helps show validity and adds some detailed comparisons of specific AU-rich motif effects in these two cell types.

      Using a Lasso regression, TA-dinucleotide content is identified as a strong regulator of RNA stability in a context dependent manner based on GC content and presence of RNA binding protein binding motifs. The findings have potential importance, drawing attention to a UTR feature that is not well characterized.

      The use of complementary datasets, including from half-life analyses of RNAs and from random sequence library MPRAs, is a useful addition and supports several important findings. The finding that TA dinucleotides have explanatory power separate from (and in some cases interacting with) GC content is valuable.

      The functional enrichment analysis suggests some new ideas about how UTRs may contribute to regulation of certain classes of genes.

      Weaknesses:

      It is difficult to understand how the calculations for half-life were performed. The sequencing approach measures the relative frequency of each sequence at each time point (less stable sequences become relatively less frequent after time 0, whereas more stable sequences become relatively more frequent after time 0). Since there is no discussion of whether the abundance of the transfected RNA population is referenced to some external standard (e.g., housekeeping RNAs), it is not clear how absolute (rather than relative) half-lives were determined.

      Fig. S1A and B are used to assess reproducibility. They show that read counts at a given time point correlate well across replicate experiments. However, this is not a good way to assess how reproducible or accurate the measurements of t1/2 are. (The major source of variability in read counts in these plots - especially at early time points - is likely the starting abundance of each RNA sequence, not stability.) This creates concerns about how well the method is measuring t1/2. Also creating concern is the observation that many RNAs are associated with half-lives that are much longer than the time points analyzed in the study. For example, if Figure S1 and Table S1 are read correctly, the median t1/2 for the 5' UTR library in HEK cells appears to be >700 minutes. Given that RNA was collected at 30, 75, and 120 minutes, accurate measurements of RNAs with such long half-lives would seem to be very difficult.

      There is no direct comparison of t1/2 between the two cell types studied for the full set of sequences studied. This would be helpful in understanding whether the regulatory effects of UTRs are generally similar across cell lines (as has been shown in some previous studies) or whether there are fundamental differences. The distribution of t1/2's is clearly quite different in the two cell lines, but it is important to know if this reflects generally slow RNA turnover in HEK cells or whether there are a large number of sequence-specific effects on stability between cell lines. A related issue is that it is not clear whether the relatively small number of significant variant effects detected in HEK cells versus SH-SY5Y cells is attributable to real biological differences between cell types or to technical issues (many fewer read counts and much longer half lives in HEK cells).

      The general assertion is made in many places that TA dinucleotides are the most prominent destabilizing element in UTRs (e.g., in the title, the abstract, Fig. 4 legend, and on p. 12). This appears to be true for only one of the two cell lines tested based on Fig. 3.

      Appraisal and impact:

      The work adds to existing studies that previously identified sequence features, including AREs and other RNA binding protein motifs, that regulate stability and puts a new emphasis on the role of "TA" (better "UA") dinucleotides. It is not clear how potential problems with the RNA stability measurements discussed above might influence the overall conclusions, which may limit the impact unless these can be addressed.

      It is difficult to understand whether the importance of TA dinucleotides is best explained by their occurrence in a related set of longer RBP binding motifs (see Fig 5J, these motifs may be encompassed by the "WWWWWW cluster") or whether some other explanation applies. Further discussion of this would be helpful. Does the LASSO method tend to collapse a more diverse set of longer motifs that are each relatively rare compared to the dinucleotide? It remains unclear whether TA dinucleotides are associated with less stability independent of the presence of the known larger WWWWWWW motif. As noted above, the importance of TA dinucleotides in the HEK experiments appears to be less than is implied in the text.

      The inclusion of more than a single cell type is an acknowledgement of the importance of evaluating cell type-specific effects. The work suggests a number of cell type-specific differences, but due to technical issues (especially with the HEK data, as outlined above) and the use of only two cell lines, it is difficult to understand cell type effects from the work.

      The inclusion of both 3' and 5' UTR sequences distinguishes this work from most prior studies in the field. Contrasting the effects of these regions on stability is of interest, although the role of these UTRs (especially the 5' UTR) in translational regulation is not assessed here.

    5. Reviewer #3 (Public Review):

      Summary:

      In their manuscript titled "Multiplexed Assays of Human Disease‐relevant Mutations Reveal UTR Dinucleotide Composition as a Major Determinant of RNA Stability" the authors aim to investigate the effect of sequence variations in 3'UTR and 5'UTRs on the stability of mRNAs in two different human cell lines.

      To do so, the authors use a massively parallel reporter assay (MPRA). They transfect cells with a set of mRNA reporters that contain sequence variants in their 3' or 5' UTRs, which were previously reported in human diseases. They follow their clearance from cells over time relative to the matching non-variant sequence. To analyze their results, they define a set of factors (RBP and miRNA binding sites, sequence features, secondary structure etc.) and test their association with differences in mRNA stability. For features with a significant association, they use clustering to select a subset of factors for LASSO regression and identify factors that affect mRNA stability.

      They conclude that the TA dinucleotide content of UTRs is the strongest destabilizing sequence feature. Within that context, elevated GC content and protein binding can protect susceptible mRNAs from degradation. They also show that TA dinucleotide content of UTRs affects native mRNA stability, and that it is associated with specific functional groups. Finally, they link disease associated sequence variants with differences in mRNA stability of reporters.

      Strengths:

      (1) This work introduces a different MPRA approach to analyze the effect of genetic variants. While previous works in tissue culture use DNA transfections that require normalization for transcription efficiency, here the mRNA is directly introduced into cells at fixed amounts, allowing a more direct view of the mRNA regulation.

      (2) The authors also introduce a unique analysis approach, which takes into account multiple factors that might affect mRNA stability. This approach allows them to identify general sequence features that affect mRNA stability beyond specific genetic variants, and reach important insights on mRNA stability regulation. Indeed, while the conclusions regarding the genetic variants identified in this work are interesting, the main strength of the work involves the general effect of sequence features rather than specific variants.

      (3) The authors provide adequate support for their claims, and validate their analysis using both their reporter data and native genes. For the main feature identified, TA di-nucleotides, they perform follow-up experiments with modified reporters that further strengthen their claims, and also validate the effect on native cellular transcripts (beyond reporters), demonstrating its validity also within native scenarios.

      (4) The work provides a broad analysis of mRNA stability, across two mRNA regulatory segments (3'UTR and 5'UTR) and is performed in two separate cell-types. Comparison between two different cell-types is adequate, and the results demonstrate, as expected, the dependence of mRNA stability on the cellular context. Analysis of 3'UTR and 5'UTR regulatory effects also shows interesting differences and similarities between these two regulatory regions.

      Weaknesses:

      (1) The authors fail to acknowledge several possible confounding factors of their MPRA approach in the discussion.

      First, while transfection of mRNA directly into cells avoids the need to normalize for differences in transcription, naked mRNA molecules differ from native cellular mRNAs, and their introduction could create biases due to differences in mRNA modifications, protein associations etc. that may occur co-transcriptionally.

      Second, along those lines, the authors also use in-vitro polyadenylation. The length of the polyA tail of the transfected transcripts could potentially be very different than that of native mRNAs and also affect stability.

      (2) The analysis approach used in this work for identifying regulatory features in UTRs was not previously used. As such, the lack of in-depth details of the methodology, and possibly also of a more general validation of the approach, is a drawback in convincing the reader of the validity of this approach and its results.

      In particular, a main point that is not addressed is how the authors decide on the set of "factors" used in their analysis, as choosing different sets of factors might affect the results of the analysis. For example, the choice to use 7-mer sequences within the factors set is not explained, particularly when almost all motifs that are eventually identified (Figure 3B-E) are shorter.

      In addition, the authors do not perform validations to demonstrate the validity of their approach on simulated data or well-established control datasets. Such analysis would be helpful to further convince the reader of the usefulness and robustness of the analysis.

      (3) The analysis and regression models built in this work are not thoroughly investigated relative to native genes within cells. The effect of sequence "factors" on native cellular transcripts' stability is not investigated beyond TA di-nucleotides, and it is unclear to what degree other predicted factors also affect native transcripts.

    1. eLife assessment

      This fundamental study investigates the transcriptional changes in neurons that underlie loss of learning and memory with age in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. The presented evidence is compelling, utilizing a cutting-edge method to isolate neurons from worms for genomics that is clearly conveyed with a rigorous experimental approach. Overall, this study supports that older daf-2 worms maintain cognitive function via mechanisms that are unique from younger wild type worms, which will be of great interest to neuroscientists and researchers studying ageing.

    1. eLife assessment

      This important study reports a novel mechanism linking DHODH inhibition and subsequent pyrimidine nucleotide depletion with upregulation of cell surface MHC I in cancer cells. The in vitro mechanistic data are compelling, with rigorous methodology and validation across multiple cell lines. The authors also provide in vivo evidence for additive effects of DHODH inhibitors and immune checkpoint blockade. However, the in vivo assessments of the functional relevance of this mechanism remain incomplete, requiring additional analyses to fully substantiate the conclusions made.

    1. Reviewer #1 (Public Review):

      Summary:

      This study offers a new perspective: ACTL7A and ACTL7B play roles in epigenetic regulation in spermiogenesis. Actin-like 7 A (ACTL7A) is essential for acrosome formation, fertilization, and early embryo development. ACTL7A variants cause acrosome detachment responsible for male infertility and early embryonic arrest. It has been reported that ACTL7A is localized on the acrosome in mouse sperm (Boëda et al., 2011). Previous studies have identified ACTL7A mutations (c.1118G>A:p.R373H; c.1204G>A:p.G402S; c.1117C>T:p.R373C). All these variants were located in the actin domain and were predicted to be pathogenic, affecting the number of hydrogen bonds or the arrangement of nearby protein structures (Wang et al., 2023; Xin et al., 2020; Zhao et al., 2023; Zhou et al., 2023). This work used AI to model the role of ACTL7A/B in the nucleosome remodeling complex and proposed a testis-specific conformation of the SRCAP complex. This is different from previous studies.

      Strengths:

      This study provides a new perspective to reveal the additional roles of these proteins.

      Weaknesses:

      The results section contains a substantial background description. However, the results and discussion sections require streamlining. There is a lack of mutual support for data between the sections, and direct data to support the authors' conclusions are missing.

    2. eLife assessment

      This valuable study reports that actin-related proteins may be involved in transcriptional regulation during spermatogenesis. The supporting data remain incomplete, and more extensive disentanglement from the canonical role of these actin-related proteins and the experimental validation of in silico predictions are required. This work will be of interest to reproductive biologists and other researchers working on non-canonical roles of actin and actin-related proteins.

    3. Reviewer #2 (Public Review):

      Summary:

      How the dynamics of gene expression accompany cell fate and cellular morphological changes is important for our understanding of the molecular mechanisms that govern development and disease. The phenomenon is particularly prominent during spermatogenesis, the process by which spermatogonial stem cells develop into sperm through a series of steps of cell division, differentiation, meiosis, and cellular morphogenesis. The intricacy of various aspects of cellular processes and gene expression during spermatogenesis remains to be fully understood. In this study, the authors found that testis-specific actin-related proteins (which usually participate in modifying cells' cytoskeletal systems) ACTL7A and ACTL7B were expressed and localized in the nuclei of mouse spermatocytes and spermatids. Based on this observation, the authors analyzed protein sequence conservation of ACTL7B across dozens of species and identified a putative nuclear localization sequence (NLS), a type of sequence that is often responsible for the nuclear import of proteins that carry it. Using molecular biology experiments in a heterologous cell system, the authors verified the potential role of this internal NLS and found it indeed could facilitate the nuclear localization of marker proteins when expressed in cells. Using gene-deleted mouse models they generated previously, the authors showed that deletion of Actl7b caused changes in gene expression and mis-localization of nucleosomal histone H3 and chromatin regulator histone deacetylase HDAC1 and 2, supporting their proposed roles of ACTL7B in regulating gene expression. The authors further used AlphaFold 2 to model the potential protein complexes that could be formed between the ARPs (ACTL7A and ACTL7B) and known chromatin modifiers, such as INO80 and SWI/SNF complexes, and found that, consistent with previous findings, it is likely that ACTL7A and ACTL7B interact with the chromatin-modifying complexes through binding to their alpha-helical HSA domain cooperatively.

      These results suggest that ACTL7B possesses novel functions in regulating chromatin structure and thus gene expression beyond conventional roles of cytoskeleton regulation, providing alternative pathways for understanding how gene expression is regulated during spermatogenesis and the etiology of relevant infertility diseases.

      Strengths:

      The authors provided sufficient background to the study and discussions of the results. Building on their previous research, this study utilized numerous methods, including the protein complex structural modeling method AlphaFold 2 Multimer, to further investigate the functional roles of ACTL7B. The results presented here are in general of good quality. The identification of a potential internal NLS in ACTL7B is mostly convincing, in line with the phenotypes presented in the gene deletion model.

      Weaknesses:

      While the study offered an interesting new look at the functions of ARP proteins during spermatogenesis, parts of the study are mainly theoretical speculation, including the protein complex formation. Some of the results may need further experimental verification, for example, the differentially expressed genes that were found in potentially spermatogenic cells at different developmental stages, in order to support the conclusions and avoid undermining the significance of the study.

    4. Reviewer #3 (Public Review):

      In this manuscript, Pierre Ferrer and colleagues explore the exciting possibility that, in the male germ line, the composition and function of deeply conserved chromatin remodeling complexes is fine-tuned by the addition of testis-specific actin-related proteins (ARPs). In this regard, the Authors aim to extend previously reported non-canonical (transcriptional) roles of ARPs in somatic cells to the unique developmental context of the germ line. The manuscript is focused on the potential regulatory role in post-meiotic transcription of two ARPs: ACTL7A and ACTL7B (particularly the latter). The canonical function of both testis-specific ARPs in spermatogenesis is well established, as they have been previously shown to be required for the extensive cellular morphogenesis program driving post-meiotic development (spermiogenesis). Disentangling the actual functions of ACTL7A and ACTL7B as transcriptional regulators from their canonical role in the profound morphological reshaping of post-meiotic cells (a process that also deeply impacts nuclear architecture and regulation) represents a key challenge in terms of interpreting the reported findings (see below).

      The authors begin by documenting, via fluorescence microscopy, the intranuclear localization of ACTL7B. This ARP is convincingly shown to accumulate in the nucleus of spermatocytes and spermatids. Using a series of elegant reporter-based experiments in a somatic cell line, the authors map the driver of this nuclear accumulation to a potential NLS sequence in the ACTL7B actin-like body domain. Ferrer and colleagues then performed a testicular RNA-seq analysis in ACTL7B KO mice to define the putative role of ACTL7B in male germ cell transcription. They report substantial changes to the testicular transcriptome - particularly the upregulation of several classes of genes - in ACTL7B KO mice. However, wild-type testes were used as controls for this experiment, thus introducing a clear confounding effect to the analysis (ACTL7B KO testes have extensive post-meiotic defects due to the canonical role of ACTL7B in spermatid development). Then, the authors employ cutting-edge AI-driven approaches to predict that both ACTL7A and ACTL7B are likely to bind to four key chromatin remodeling complexes. Although these predictions are based on a robust methodology, they would certainly benefit from experimental validation. Finally, the authors associate the loss of ACTL7B with decreased lysine acetylation and lower levels of the HDAC1 and HDAC3 chromatin remodelers in the nucleus of developing spermatids.

      Globally, these data may provide important insight into the unique processes male germ cells employ to sustain their extraordinarily complex transcriptional program. Furthermore, the concept that (comparably younger) testis-specific proteins can be incorporated into ancient chromatin remodeling complexes to modulate their function in the germ line is timely and exciting.

      It is my opinion that the manuscript would benefit from additional experimental validation to better support the authors' conclusions. In particular, I believe that addressing two critical points would substantially strengthen the message of the manuscript:

      (1) The proposed role of ACTL7B in post-meiotic transcriptional regulation temporally overlaps with the protein's previously reported canonical functions in spermiogenesis (PMID: 36617158 and 37800308). Indeed, the canonical functions of ACTL7B have been shown to have a profound effect at the level of spermatid morphology and to impact nuclear organization. This potentially renders the observed transcriptional deregulation in ACTL7B KO testes an indirect consequence of spermatid morphology defects. I acknowledge that it is experimentally difficult to disentangle the proposed intranuclear roles of ACTL7B from the protein's well-documented cytoplasmic function. Perhaps the generation of an NLS-scrambled ACTL7B variant could offer some insight. In light of the substantial investment this approach would represent, I would suggest, as an alternative, that instead of using wild-type testes as controls for the transcriptome and chromatin localization assays, the authors consider using testicular tissue from a mutant with similarly abnormal spermiogenesis that is due to transcription-independent defects. This would, in my opinion, offer a more suitable baseline to compare ACTL7B KO testes with.

      (2) The manuscript would greatly benefit if experimental validation of the AI-driven predictions were to be provided (in terms of the binding capacity of ACTL7A and ACTL7B to key chromatin remodeling complexes), all the more so since the authors seem to have the technical expertise / available mass spectrometry data required for this purpose (lines 664-665). Still on this topic, given the predicted interactions of ACTL7A and ACTL7B with the SRCAP, EP400, SMARCA2 and SMARCA4 complexes (Figure 7), it is rather counter-intuitive that the Authors chose for their immunofluorescence assays, in ACTL7B KO testes, to determine the chromatin localization of HDAC1 and HDAC3, rather than that of any of the above four complexes.

    1. eLife assessment

      This valuable study presents the design of a new device to use high-density electrophysiological probes ("Neuropixels") in freely moving rodents. The evidence showing that the system is versatile and capable of recording high-quality extracellular data in both mice and rats is compelling. This study will be of interest to neuroscientists performing chronic electrophysiological recordings.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Bimbard et al., a new method to perform stable recordings over long periods of time with Neuropixels, as well as the technical details on how the electrodes can be explanted for follow-up reuse, is provided. I think the description of all parts of the method is very clear, and the validation analyses (n of units per day over time, RMS over recording days...) are very convincing. I however missed a stronger emphasis on why this could provide a big impact on the ephys community, by enabling new analyses, new behavior correlation studies, or neurophysiological mechanisms across temporal scales that were previously inaccessible with high temporal resolution (i.e. not with imaging).

      Strengths:

      Open source method. Validation across laboratories. Across species (mice and rats) demonstration of its use and in different behavioral conditions (head-fixed and freely moving).

      Weaknesses:

      Weak emphasis on what can be enabled with this new method that didn't exist before.

    3. Reviewer #2 (Public Review):

      Summary:

      This work by Bimbard et al. introduces a new implant for Neuropixels probes. While Neuropixels probes have critically improved and extended our ability to record the activity of a large number of neurons with high temporal resolution, the use of these expensive devices in chronic experiments has so far been hampered by the difficulty of safely implanting them and, importantly, of explanting and reusing them after conclusion of the experiment. The authors present a newly designed two-part implant, consisting of a docking and a payload module, that allows for secure implantation and straightforward recovery of the probes. The implant is lightweight, making it amenable for use in mice and rats, and customizable. The authors provide schematics and files for printing of the implants, which can be easily modified and adapted to custom experiments by researchers with little to no design experience. Importantly, the authors demonstrate the successful use of this implant across multiple use cases, in head-fixed and freely moving experiments, in mice and rats, with different versions of Neuropixels probes, and across 8 different labs. Taken together, the presented implants promise to make chronic Neuropixels recordings and long-term studies of neuronal activity significantly easier and attainable for both current and future Neuropixels users.

      Strengths:

      - The implants have been successfully tested across 8 different laboratories, in mice and rats, in head-fixed and freely moving conditions, and have been adapted in multiple ways for a number of distinct experiments.

      - Implants are easily customizable and the authors provide a straightforward approach for customization across multiple design dimensions even for researchers not experienced in design.

      - The authors provide clear and straightforward descriptions of the construction, implantation, and explant of the described implants.

      - The split of the implant into a docking and payload module makes reuse even in different experiments (using different docking modules) easy.

      - The authors demonstrate that implants can be re-used multiple times and still allow for high-quality recordings.

      - The authors show that the chronic implantations allow for the tracking of individual neurons across days and weeks (using additional software tracking solutions), which is critical for a large number of experiments requiring the description of neuronal activity, e.g. throughout learning processes.

      - The authors show that implanted animals can even perform complex behavioral tasks, with no apparent reduction in their performance.

      Weaknesses:

      - While implanted animals can still perform complex behavioral tasks, the authors describe that the implants may reduce the animals' mobility, as measured by prolonged reaction times. However, the presented data does not allow us to judge whether this effect is specifically due to the presented implant or whether any implant or just tethering of the animals per se would have the same effects.

      - While the authors make certain comparisons to other, previously published approaches for chronic implantation and re-use of Neuropixels probes, it is hard to make conclusive comparisons and judge the advantages of the current implant. For example, while the authors emphasize that the lower weight of their implant allows them to perform recordings in mice (and is surely advantageous), the previously described, heavier implants they mention (Steinmetz et al., 2021; van Daal et al., 2021), have also been used in mice. Whether the weight difference makes a difference in practice therefore remains somewhat unclear.

      - The non-permanent integration of the headstages into the implant, while allowing for the use of the same headstage for multiple animals in parallel, requires repeated connections and does not provide strong protection for the implant. This may especially be an issue for the use in rats, requiring additional protective components as in the presented rat experiments.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Bimbard and colleagues describe a new implant apparatus called the "Apollo Implant", which should facilitate recording in freely moving rodents (mice and rats) using Neuropixels probes. The authors collected data from both mice and rats and used 3 different versions of Neuropixels; multiple labs have already adopted this method, which is impressive. They openly share their CAD designs and surgery protocol to further facilitate the adaptation of their method.

      Strengths:

      Overall, the "Apollo Implant" is easy to use and adapt, as it has been used in other laboratories successfully and custom modifications are already available. The device is reproducible using common 3D printing services and can be easily modified thanks to its CAD design (the video explaining this is extremely helpful). The weight and price are amazing compared to other systems for rigid silicon probes allowing a wide range of use of the "Apollo Implant".

      Weaknesses:

      The "Apollo Implant" can only handle Neuropixels probes. It cannot hold other widely used and commercially available silicon probes. Certain angles and distances are not possible in its current form (distance between probes 1.8 to 4 mm, implantation depth 2-6.5 mm, or angle of insertion up to 20 degrees).

    5. Author response:

      Reviewer 1:

      Summary:

      In this manuscript by Bimbard et al., a new method to perform stable recordings over long periods of time with Neuropixels, as well as the technical details on how the electrodes can be explanted for follow-up reuse, is provided. I think the description of all parts of the method is very clear, and the validation analyses (n of units per day over time, RMS over recording days...) are very convincing. I however missed a stronger emphasis on why this could provide a big impact on the ephys community, by enabling new analyses, new behavior correlation studies, or neurophysiological mechanisms across temporal scales that were previously inaccessible with high temporal resolution (i.e. not with imaging).

      Strengths:

      Open source method. Validation across laboratories. Across species (mice and rats) demonstration of its use and in different behavioral conditions (head-fixed and freely moving).

      Weaknesses:

      Weak emphasis on what can be enabled with this new method that didn't exist before.

      We thank the reviewer for highlighting the limited discussion around scientific impact. Our implant has several advantages which combine to make it much more accessible than previous solutions. This enables a variety of recording configurations that would not have been possible with previous designs, facilitating recordings from a wider range of brain regions, animals, and experimental setups. In short, there are three key advances:

      (1) Adaptability: The CAD files can be readily adapted to a wide range of configurations (implantation depth, angle, position of headstage, etc.). Labs have already modified the design to optimise it for their needs, and re-shared it with the community.

      (2) Weight: Because of the lightweight design, experimenters can i) perform complex and demanding freely moving tasks, as we exemplify in the manuscript, and ii) implant female and water-restricted mice while respecting animal welfare weight limitations.

      (3) Cost: At ~$10, our implant is significantly cheaper than published alternatives, which makes it affordable to more labs and means that testing modifications is cost-effective.

      We will make these features clearer in the manuscript.

      Reviewer 2:

      Summary:

      This work by Bimbard et al. introduces a new implant for Neuropixels probes. While Neuropixels probes have critically improved and extended our ability to record the activity of a large number of neurons with high temporal resolution, the use of these expensive devices in chronic experiments has so far been hampered by the difficulty of safely implanting them and, importantly, of explanting and reusing them after conclusion of the experiment. The authors present a newly designed two-part implant, consisting of a docking and a payload module, that allows for secure implantation and straightforward recovery of the probes. The implant is lightweight, making it amenable for use in mice and rats, and customizable. The authors provide schematics and files for printing of the implants, which can be easily modified and adapted to custom experiments by researchers with little to no design experience. Importantly, the authors demonstrate the successful use of this implant across multiple use cases, in head-fixed and freely moving experiments, in mice and rats, with different versions of Neuropixels probes, and across 8 different labs. Taken together, the presented implants promise to make chronic Neuropixels recordings and long-term studies of neuronal activity significantly easier and attainable for both current and future Neuropixels users.

      Strengths:

      - The implants have been successfully tested across 8 different laboratories, in mice and rats, in head-fixed and freely moving conditions, and have been adapted in multiple ways for a number of distinct experiments.

      - Implants are easily customizable and the authors provide a straightforward approach for customization across multiple design dimensions even for researchers not experienced in design.

      - The authors provide clear and straightforward descriptions of the construction, implantation, and explant of the described implants.

      - The split of the implant into a docking and payload module makes reuse even in different experiments (using different docking modules) easy.

      - The authors demonstrate that implants can be re-used multiple times and still allow for high-quality recordings.

      - The authors show that the chronic implantations allow for the tracking of individual neurons across days and weeks (using additional software tracking solutions), which is critical for a large number of experiments requiring the description of neuronal activity, e.g. throughout learning processes.

      - The authors show that implanted animals can even perform complex behavioral tasks, with no apparent reduction in their performance.

      Weaknesses:

      - While implanted animals can still perform complex behavioral tasks, the authors describe that the implants may reduce the animals' mobility, as measured by prolonged reaction times. However, the presented data does not allow us to judge whether this effect is specifically due to the presented implant or whether any implant or just tethering of the animals per se would have the same effects.

      The reviewer is correct: some of the differences in mouse reaction time could be due to the tether rather than the implant. As these experiments were also performed in water-restricted female mice with the heavier Neuropixels 1.0 implant, our data represent the maximal impact of the implant, and we will highlight this in the revision.

      - While the authors make certain comparisons to other, previously published approaches for chronic implantation and re-use of Neuropixels probes, it is hard to make conclusive comparisons and judge the advantages of the current implant. For example, while the authors emphasize that the lower weight of their implant allows them to perform recordings in mice (and is surely advantageous), the previously described, heavier implants they mention (Steinmetz et al., 2021; van Daal et al., 2021), have also been used in mice. Whether the weight difference makes a difference in practice therefore remains somewhat unclear.

      The reviewer is correct: without a direct comparison, we cannot be certain that our smaller, lighter implant improves behavioural results (although this is supported by the literature, e.g. Newman et al., 2023). However, the reduced weight of our implant is critical for several laboratories represented in this manuscript due to animal welfare requirements. Indeed, van Daal et al. (2021) “recommend a [mouse] weight of >25 g for implanting Neuropixels 1.0 probes.” This limit precludes using (the vast majority of) female mice, or water-restricted animals. Conversely, our implant can be routinely used with lighter, water-restricted male and female mice. We will emphasise this point in the revision.

      - The non-permanent integration of the headstages into the implant, while allowing for the use of the same headstage for multiple animals in parallel, requires repeated connections and does not provide strong protection for the implant. This may especially be an issue for the use in rats, requiring additional protective components as in the presented rat experiments.

      We apologise for not clarifying the various headstage options in the manuscript and we will address this in the revision. Our repository has headplate holder designs (in the XtraModifications/Mouse_FreelyMoving folder). These allow the headstage to be left on the implant, thus minimising the number of connections (albeit increasing the weight for the mouse). Indeed, mice recorded while performing the task described in our manuscript had the headstage semi-permanently integrated into the implant, and we will highlight this in the revision.

      Reviewer 3:

      Summary:

      In this manuscript, Bimbard and colleagues describe a new implant apparatus called the "Apollo Implant", which should facilitate recording in freely moving rodents (mice and rats) using Neuropixels probes. The authors collected data from both mice and rats and used 3 different versions of Neuropixels; multiple labs have already adopted this method, which is impressive. They openly share their CAD designs and surgery protocol to further facilitate the adaptation of their method.

      Strengths:

      Overall, the "Apollo Implant" is easy to use and adapt, as it has been used in other laboratories successfully and custom modifications are already available. The device is reproducible using common 3D printing services and can be easily modified thanks to its CAD design (the video explaining this is extremely helpful). The weight and price are amazing compared to other systems for rigid silicon probes allowing a wide range of use of the "Apollo Implant".

      Weaknesses:

      The "Apollo Implant" can only handle Neuropixels probes. It cannot hold other widely used and commercially available silicon probes. Certain angles and distances are not possible in its current form (distance between probes 1.8 to 4 mm, implantation depth 2-6.5 mm, or angle of insertion up to 20 degrees).

      We appreciate the reviewer’s points, but as we will discuss in the revised manuscript, a single implant accommodating the diversity of existing probes is beyond the scope of this project. However, because the design is adaptable, groups should be able to modify the current version of the implant to fit their electrodes’ size and format (and can highlight any issues in the GitHub “Discussions” area).

      With Neuropixels, the current range of depths covers practically all trajectories in the mouse brain. In rats, where deeper penetrations may be useful, the experimenter can attach the probe at a lower point in the payload module to increase the length of exposed shank. We now specify this in the GitHub repository.

      We have now extended the range of inter-probe distances from a maximum of 4 mm to 6.5 mm, and this will be reflected in the revised manuscript. Distances beyond this may be better served by 2 implants, and smaller distances could be achieved by attaching two probes on the same side of the docking module. In the next revision, we will add these points to the discussion.

    1. Author response:

      eLife assessment

      This study is a detailed investigation of how chromatin structure influences replication origin function in yeast ribosomal DNA, with focus on the role of the histone deacetylase Sir2 and the chromatin remodeler Fun30. Convincing evidence shows that Sir2 does not affect origin licensing but rather affects local transcription and nucleosome positioning which correlates with increased origin firing. However, the evidence remains incomplete as the methods employed do not rigorously establish a key aspect of the mechanism, fully address some alternative models, or sufficiently relate to prior results. Overall, this is a valuable advance for the field that could be improved to establish a more robust paradigm.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and as shown here, displacement by transcription and/or chromatin remodeling.

      While it is true that both transcription and passive replication can cause the signal of MCM-ChEC to disappear, neither can cause selective disappearance of the displaced complex without affecting the non-displaced complex.  Indeed, in the case of transcription, RNA polymerase transcribing C-pro would have to first dislodge the normally positioned MCM complex before even reaching the displaced complex.  Furthermore, deletion of FUN30 leads to both more C-pro transcription and less disappearance of the displaced MCM complex.  It is important to keep in mind that this cannot somehow reflect continuous replenishment of displaced MCMs with newly loaded MCMs, since the cells are in S phase and licensing is restricted to G1. 

      Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results.

      Copy number reduction of the magnitude caused by deletion of SIR2 and FUN30 does not suppress the sir2∆ effect (i.e. early replication of the rDNA), but rather exacerbates it. In particular, deletion of SIR2 and FUN30 causes the rDNA to shrink to approximately 35 copies. Kwan et al., 2023 (PMID: 36842087) have shown that reduction of rDNA copy number to 35 causes a dramatic acceleration of rDNA replication in a SIR2 strain. Thus, the effect of rDNA size on replication timing reinforces our conclusion that deletion of FUN30 suppresses rDNA replication.

      However, to address this concern directly, in the revision we will include 2D gels in fob1 strains with equal numbers of repeats, which allow us to conclude that the effect of FUN30 deletion in suppressing rDNA origin firing is independent of either rDNA size or FOB1. The figure showing the critical 2D gels appears below in the reply to Reviewer 2.

      Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model.

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases. 

      The two potential initiation sites that one would monitor (non-displaced and displaced) are separated by less than 150 base pairs, and other techniques simply do not have the resolution necessary to distinguish such differences.  Furthermore, as we suggest in the manuscript, our results are consistent with a model in which it is only the displaced MCM complex that is activated, whether in sir2 or WT.  If no genotype-dependent difference in initiation sites is even expected, it would be hard to interpret even the most precise replication-based assays.  However, the reviewer is correct that this is a novel technique and that confirmation with a well-established technique is comforting, therefore we are performing ChIP experiments to corroborate, to the extent possible, the conclusions that we reached with ChEC. 

      We appreciate the reviewer pointing out that some statistical analyses were lacking, and we will correct this in a revised manuscript.

      Additional background and discussion for public review:

      This paper broadly addresses the mechanism(s) that regulate replication origin firing in different chromatin contexts. The rDNA origin is present in each of ~180 tandem repeats of the rDNA sequence, representing a high potential origin density per length of DNA (9.1kb repeat unit). However, the average origin efficiency of rDNA origins is relatively low (~20% in wild-type cells), which reduces the replication load on the overall genome by reducing competition with origins throughout the genome for limiting replication initiation factors. Deletion of histone deacetylase SIR2, which silences PolII transcription within the rDNA, results in increased early activation of the rDNA origins (and reduced rate of overall genome replication). Previous work by the authors showed that MCM complexes loaded onto the rDNA origins (origin licensing) were laterally displaced (sliding) along the rDNA, away from a well-positioned nucleosome on one side. The authors' major hypothesis throughout this work is that the new MCM location(s) are intrinsically more efficient configurations for origin firing. The authors identify a chromatin remodeling enzyme, FUN30, whose deletion appears to suppress the earlier activation of rDNA origins in sir2∆ cells. Indeed, it appears that the reduction of rDNA origin activity in sir2∆ fun30∆ cells is severe enough to result in a substantial reduction in the rDNA array repeat length (number of repeats); the reduced rDNA length presumably facilitates its more stable replication and maintenance.

      Analysis of replication by 2D gels is marginally convincing; using 2D gels for this purpose is very challenging and tricky to quantify. The more quantitative analysis by EdU incorporation is more convincing of the suppression of the earlier replication caused by SIR2 deletion.

      To address the mechanism of suppression, they analyze MCM positioning using ChEC, which in G1 cells shows partial displacement of MCM from normal position A to positions B and C in sir2∆ cells and similar but more complete displacement away from A to positions B and C in sir2fun30 cells. During S-phase in the presence of hydroxyurea, which slows replication progression considerably (and blocks later origin firing) MCM signals redistribute, which is interpreted to represent origin firing and bidirectional movement of MCMs (only one direction is shown), some of which accumulate near the replication fork barrier, consistent with their interpretation. They observe that MCMs displaced (in G1) to sites B or C in sir2∆ cells, disappear more rapidly during S-phase, whereas a similar dynamic is not observed in sir2∆fun30∆. This is the main basis for their conclusion that the B and C sites are more permissive than A. While this may be the simplest interpretation, there are limitations with this assay that undermine a rigorous conclusion (additional points below). The main problem is that we know the MCM complexes are mobile so disappearance may reflect displacement by other means including transcription which is high in the sir2∆ background. Indeed, the double mutant has a greater level of transcription per repeat unit which might explain more displacement from A in G1. Thus, displacement might not always represent origin firing. Because the sir2 background profoundly changes transcription, and the double mutant has a much smaller array length associated with higher transcription, how can we rule out greater accessibility at site A, for example in sir2∆, leading to more firing, which is suppressed in sir2 fun30 due to greater MCM displacement away from A?

      I think the critical missing data to solidly support their conclusions is a definitive determination of the site(s) of initiation using a more direct method, such as strand specific sequencing of EdU or nascent strand analysis. More direct comparisons of the strains with lower copy number are also needed to rule out this facet. As discussed in detail below, copy number reduction is known to suppress at least part of the sir2∆ effect so this looms over the interpretations. I think they are probably correct in their overall model based on the simplest interpretation of the data but I think it remains to be rigorously established. I think they should soften their conclusions in this respect.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their previous work showing that in the absence of the Sir2 deacetylase the MCM replicative helicase at the rDNA spacer region is repositioned to a region of low nucleosome occupancy. Here they show that the repositioned displaced MCMs have increased firing propensity relative to non-displaced MCMs. In addition, they show that activation of the repositioned MCMs and low nucleosome occupancy in the adjacent region depend on the chromatin remodeling activity of Fun30.

      Strengths:

      The paper provides new information on the role of a conserved chromatin remodeling protein in the regulation of origin firing and in addition provides evidence that not all loaded MCMs fire and that origin firing is regulated at a step downstream of MCM loading.

      Weaknesses:

      The relationship between the authors' results and prior work on the role of Sir2 (and Fob1) in regulation of rDNA recombination and copy number maintenance is not explored, making it difficult to place the results in a broader context. Sir2 has previously been shown to be recruited by Fob1, which is also required for DSB formation and recombination-mediated changes in rDNA copy number. Are the changes that the authors observe specifically in fun30 sir2 cells related to this pathway? Is Fob1 required for the reduced rDNA copy number in fun30 sir2 double mutant cells?

      Strains lacking SIR2 have unstable rDNA size, and FOB1 deletion stabilizes rDNA size in the sir2 background. Likewise, FOB1 deletion influences the kinetics of rDNA size reduction in sir2 fun30 cells. However, the main effect of Fun30 in sir2 cells that we were interested in, suppression of rDNA replication, is preserved in the fob1 background, arguing that the observed effect is independent of Fob1 (see figure below). Given that the main focus of the paper is regulation of rDNA origin activity and that these changes were independent of Fob1, we had elected not to include these results in the original manuscript but will gladly include them in the revision.

      Besides refuting the possible role of Fob1 in the FUN30-mediated activation of rDNA origin firing in sir2 cells, the use of the fob1 background enabled us to compare the activation of rDNA origins in the sir2 and sir2 fun30 strains with equally short rDNA size. The 2D gels demonstrate a dramatic suppression of rDNA origin activity upon deletion of FUN30 in the sir2 fob1 strains with 35 rDNA copies.

      Author response image 1.

      The deletion of FUN30 diminishes the replication bubble signal in a fob1 sir2 strain with 35 rDNA copies by more than tenfold. The single rARS signal, marked with the arrow, originates from the rightmost rDNA repeat. This specific rightmost rDNA NheI fragment is approximately 25 kb in size, distinctly larger than the 4.7 kb NheI 1N rARS-containing fragments that originate from the internal rDNA repeats.

      Reviewer #3 (Public Review):

      Summary:

      Heterochromatin is characterized by low transcription activity and late replication timing, both dependent on the NAD-dependent protein deacetylase Sir2, the founding member of the sirtuins. This manuscript addresses the mechanism by which Sir2 delays replication timing at the rDNA in budding yeast. Previous work from the same laboratory (Foss et al. PLoS Genetics 15, e1008138) showed that Sir2 represses transcription-dependent displacement of the Mcm helicase in the rDNA. In this manuscript, the authors show convincingly that the repositioned Mcms fire earlier and that this early firing partly depends on the ATPase activity of the nucleosome remodeler Fun30. Using read-depth analysis of sorted G1/S cells, fun30 was the only chromatin remodeler mutant that somewhat delayed replication timing in sir2 mutants, while nhp10, chd1, isw1, htl1, swr1, isw2, and irc5 had no effect. The conclusion was corroborated with orthogonal assays including two-dimensional gel electrophoresis and analysis of EdU incorporation at early origins. Using an insightful analysis with an Mcm-MNase fusion (Mcm-ChEC), the authors show that the repositioned Mcms in sir2 mutants fire earlier than the Mcm at the normal position in wild type. This early firing at the repositioned Mcms is partially suppressed by Fun30. In addition, the authors show Fun30 affects nucleosome occupancy at the sites of the repositioned Mcm, providing a plausible mechanism for the effect of Fun30 on Mcm firing at that position. However, the results from the MNase-seq and ChEC-seq assays are not fully congruent for the fun30 single mutant. Overall, the results support the conclusions, providing a much better mechanistic understanding of how Sir2 affects replication timing at rDNA.

      The reason that the results for the fun30 single mutant appear incongruent, with a larger signal of the +2 nucleosome in the MNase-seq plot but a negligible signal in the ChEC-seq plot, is the paucity of displaced Mcm in the fun30 single mutant. Given the relative absence of displaced MCMs, the MCM-MNase fusion protein can't "light up" the +2 nucleosome. We will add a comment in the revision to clarify this.

      Strengths

      (1) The data clearly show that the repositioned Mcm helicase fires earlier than the Mcm in the wild type position.

      (2) The study identifies a specific role for Fun30 in replication timing and an effect on nucleosome occupancy around the newly positioned Mcm helicase in sir2 cells.

      Weaknesses

      (1) It is unclear which strains were used in each experiment.

      (2) The relevance of the fun30 phospho-site mutant (S20AS28A) is unclear.

      (3) For some experiments (Figs. 3, 4, 6) it is unclear whether the data are reproducible and the differences significant. Information about the number of independent experiments and quantitation is lacking. This affects the interpretation, as fun30 seems to affect the +3 nucleosome much more than let on in the description.

      We appreciate the reviewer pointing out places in which our manuscript omitted key pieces of information (items 1 and 3), and we will fix these oversights in our revision. 

      With regard to point 2, we had written: 

      “Fun30 is also known to play a role in the DNA damage response; specifically, phosphorylation of Fun30 on S20 and S28 by CDK1 targets Fun30 to sites of DNA damage, where it promotes DNA resection (Chen et al. 2016; Bantele et al. 2017). To determine whether the replication phenotype that we observed might be a consequence of Fun30's role in the DNA damage response, we tested non-phosphorylatable mutants for the ability to suppress early replication of the rDNA in sir2; these mutations had no effect on the replication phenotype (Figure 2B), arguing against a primary role for Fun30 in DNA damage repair that somehow manifests itself in replication.”

      We will expand on this to clarify our point in the revision.

    2. eLife assessment

      This study is a detailed investigation of how chromatin structure influences replication origin function in yeast ribosomal DNA, with focus on the role of the histone deacetylase Sir2 and the chromatin remodeler Fun30. Convincing evidence shows that Sir2 does not affect origin licensing but rather affects local transcription and nucleosome positioning which correlates with increased origin firing. However, the evidence remains incomplete as the methods employed do not rigorously establish a key aspect of the mechanism, fully address some alternative models, or sufficiently relate to prior results. Overall, this is a valuable advance for the field that could be improved to establish a more robust paradigm.

    3. Reviewer #1 (Public Review):

      Summary:

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and, as shown here, displacement by transcription and/or chromatin remodeling. Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results. Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model.

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases.

      Additional background and discussion for public review:

      This paper broadly addresses the mechanism(s) that regulate replication origin firing in different chromatin contexts. The rDNA origin is present in each of ~180 tandem repeats of the rDNA sequence, representing a high potential origin density per length of DNA (9.1kb repeat unit). However, the average origin efficiency of rDNA origins is relatively low (~20% in wild-type cells), which reduces the replication load on the overall genome by reducing competition with origins throughout the genome for limiting replication initiation factors. Deletion of histone deacetylase SIR2, which silences PolII transcription within the rDNA, results in increased early activation of the rDNA origins (and reduced rate of overall genome replication). Previous work by the authors showed that MCM complexes loaded onto the rDNA origins (origin licensing) were laterally displaced (sliding) along the rDNA, away from a well-positioned nucleosome on one side. The authors' major hypothesis throughout this work is that the new MCM location(s) are intrinsically more efficient configurations for origin firing. The authors identify a chromatin remodeling enzyme, FUN30, whose deletion appears to suppress the earlier activation of rDNA origins in sir2∆ cells. Indeed, it appears that the reduction of rDNA origin activity in sir2∆ fun30∆ cells is severe enough to result in a substantial reduction in the rDNA array repeat length (number of repeats); the reduced rDNA length presumably facilitates its more stable replication and maintenance.

      Analysis of replication by 2D gels is marginally convincing; using 2D gels for this purpose is very challenging and tricky to quantify. The more quantitative analysis by EdU incorporation is more convincing of the suppression of the earlier replication caused by SIR2 deletion.

      To address the mechanism of suppression, they analyze MCM positioning using ChEC, which in G1 cells shows partial displacement of MCM from normal position A to positions B and C in sir2∆ cells and similar but more complete displacement away from A to positions B and C in sir2fun30 cells. During S-phase in the presence of hydroxyurea, which slows replication progression considerably (and blocks later origin firing) MCM signals redistribute, which is interpreted to represent origin firing and bidirectional movement of MCMs (only one direction is shown), some of which accumulate near the replication fork barrier, consistent with their interpretation. They observe that MCMs displaced (in G1) to sites B or C in sir2∆ cells, disappear more rapidly during S-phase, whereas a similar dynamic is not observed in sir2∆fun30∆. This is the main basis for their conclusion that the B and C sites are more permissive than A. While this may be the simplest interpretation, there are limitations with this assay that undermine a rigorous conclusion (additional points below). The main problem is that we know the MCM complexes are mobile so disappearance may reflect displacement by other means including transcription which is high in the sir2∆ background. Indeed, the double mutant has a greater level of transcription per repeat unit which might explain more displacement from A in G1. Thus, displacement might not always represent origin firing. Because the sir2 background profoundly changes transcription, and the double mutant has a much smaller array length associated with higher transcription, how can we rule out greater accessibility at site A, for example in sir2∆, leading to more firing, which is suppressed in sir2 fun30 due to greater MCM displacement away from A?

      I think the critical missing data to solidly support their conclusions is a definitive determination of the site(s) of initiation using a more direct method, such as strand specific sequencing of EdU or nascent strand analysis. More direct comparisons of the strains with lower copy number are also needed to rule out this facet. As discussed in detail below, copy number reduction is known to suppress at least part of the sir2∆ effect so this looms over the interpretations. I think they are probably correct in their overall model based on the simplest interpretation of the data but I think it remains to be rigorously established. I think they should soften their conclusions in this respect.

    4. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their previous work showing that in the absence of the Sir2 deacetylase the MCM replicative helicase at the rDNA spacer region is repositioned to a region of low nucleosome occupancy. Here they show that the repositioned displaced MCMs have increased firing propensity relative to non-displaced MCMs. In addition, they show that activation of the repositioned MCMs and low nucleosome occupancy in the adjacent region depend on the chromatin remodeling activity of Fun30.

      Strengths:

      The paper provides new information on the role of a conserved chromatin remodeling protein in the regulation of origin firing and in addition provides evidence that not all loaded MCMs fire and that origin firing is regulated at a step downstream of MCM loading.

      Weaknesses:

      The relationship between the authors' results and prior work on the role of Sir2 (and Fob1) in regulation of rDNA recombination and copy number maintenance is not explored, making it difficult to place the results in a broader context. Sir2 has previously been shown to be recruited by Fob1, which is also required for DSB formation and recombination-mediated changes in rDNA copy number. Are the changes that the authors observe specifically in fun30 sir2 cells related to this pathway? Is Fob1 required for the reduced rDNA copy number in fun30 sir2 double mutant cells?

    5. Reviewer #3 (Public Review):

      Summary:

      Heterochromatin is characterized by low transcription activity and late replication timing, both dependent on the NAD-dependent protein deacetylase Sir2, the founding member of the sirtuins. This manuscript addresses the mechanism by which Sir2 delays replication timing at the rDNA in budding yeast. Previous work from the same laboratory (Foss et al. PLoS Genetics 15, e1008138) showed that Sir2 represses transcription-dependent displacement of the Mcm helicase in the rDNA. In this manuscript, the authors show convincingly that the repositioned Mcms fire earlier and that this early firing partly depends on the ATPase activity of the nucleosome remodeler Fun30. Using read-depth analysis of sorted G1/S cells, fun30 was the only chromatin remodeler mutant that somewhat delayed replication timing in sir2 mutants, while nhp10, chd1, isw1, htl1, swr1, isw2, and irc5 had no effect. The conclusion was corroborated with orthogonal assays including two-dimensional gel electrophoresis and analysis of EdU incorporation at early origins. Using an insightful analysis with an Mcm-MNase fusion (Mcm-ChEC), the authors show that the repositioned Mcms in sir2 mutants fire earlier than the Mcm at the normal position in wild type. This early firing at the repositioned Mcms is partially suppressed by Fun30. In addition, the authors show Fun30 affects nucleosome occupancy at the sites of the repositioned Mcm, providing a plausible mechanism for the effect of Fun30 on Mcm firing at that position. However, the results from the MNase-seq and ChEC-seq assays are not fully congruent for the fun30 single mutant. Overall, the results support the conclusions, providing a much better mechanistic understanding of how Sir2 affects replication timing at rDNA.

      Strengths

      (1) The data clearly show that the repositioned Mcm helicase fires earlier than the Mcm in the wild type position.

      (2) The study identifies a specific role for Fun30 in replication timing and an effect on nucleosome occupancy around the newly positioned Mcm helicase in sir2 cells.

      Weaknesses

      (1) It is unclear which strains were used in each experiment.

      (2) The relevance of the fun30 phospho-site mutant (S20AS28A) is unclear.

      (3) For some experiments (Figs. 3, 4, 6) it is unclear whether the data are reproducible and the differences significant. Information about the number of independent experiments and quantitation is lacking. This affects the interpretation, as fun30 seems to affect the +3 nucleosome much more than let on in the description.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors report that optogenetic inhibition of hippocampal axon terminals in retrosplenial cortex impairs the performance of a delayed non-match to place task. The significance of findings elucidating the role of hippocampal projections to the retrosplenial cortex in memory and decision-making behaviors is important. However, the strength of evidence for the paper's claims is currently incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a study on the role of the retrosplenial cortex (RSC) and the hippocampus in working memory. Working memory is a critical cognitive function that allows temporary retention of information for task execution. The RSC, which is functionally and anatomically connected to both primary sensory (especially visual) and higher cognitive areas, plays a key role in integrating spatial-temporal context and in goal-directed behaviors. However, the specific contributions of the RSC and the hippocampus in working memory-guided behaviors are not fully understood due to a lack of studies that experimentally disrupt the connection between these two regions during such behaviors.

      In this study, researchers employed eArch3.0 to silence hippocampal axon terminals in the RSC, aiming to explore the roles of these brain regions in working memory. Experiments were conducted where animals with silenced hippocampal axon terminals in the RSC performed a delayed non-match to place (DNMP) task. The results indicated that this manipulation impaired memory retrieval, leading to decreased performance and quicker decision-making in the animals. Notably, the authors observed that the effects of this impairment persisted beyond the light-activation period of the opsin, affecting up to three subsequent trials. They suggest that disrupting the hippocampal-RSC connection has a significant and lasting impact on working memory performance.

      Strengths:

      They conducted a study exploring the impact of direct hippocampal inputs into the RSC, a region involved in encoding spatial-temporal context and transferring contextual information, on spatial working memory tasks. Utilizing eArch3.0 expressed in hippocampal neurons via the viral vector AAV5-hSyn1-eArch3.0, they aimed to bilaterally silence hippocampal terminals located at the RSC in rats pre-trained in a DNMP task. They discovered that silencing hippocampal terminals in the RSC significantly decreased working memory performance in eArch+ animals, especially during task interleaving sessions (TI) that alternated between trials with and without light delivery. This effect persisted even in non-illuminated trials, indicating a lasting impact beyond the periods of direct manipulation. Additionally, they observed a decreased likelihood of correct responses following TI trials and an increased error rate in eArch+ animals, even after incorrect responses, suggesting an impairment in error-corrective behavior. This contrasted with baseline sessions where no light was delivered, and both eArch+ and control animals showed low error rates.

      Weaknesses:

      While I agree with the authors that the role of hippocampal inputs to the RSC in spatial working memory is understudied and merits further investigation, I find that the optogenetic experiment, a core part of this manuscript that includes viral injections, could be improved. The effects were rather subtle, rendering some of the results barely significant and possibly too weak to support major conclusions.

      We thank Reviewer #1 for carefully and critically reading our manuscript, and for the valuable comments provided. The judged “subtlety” of the effects stems from a perspective according to which a quantitatively lower effect bears less biological significance for cognition. We disagree with this perspective and find it rather reductive for several reasons.

      Once seen in the context of the animal’s ecology, subtle impairments can be life-threatening precisely because of their subtlety, leading the animal to confidently rely on a defective capacity, for such events as remembering the habitual location of a predator, or food source.

      Also, studies in animal cognition often undertake complete, rather than graded, suppression of a given mechanism (in the same sense as that of “knocking out” a gene that is relevant for behaviour), leading to a gravely, rather than gradually, impaired model system, to the point of not allowing a hypothetical causal link to be mechanistically revealed beyond its mere presence. This often hinders a thorough interpretation of the perturbed factor’s role. If a caricatural analogy is allowed, it would be as if we were to study the role of an animal’s legs by chopping them both off and observing the resulting behaviour.

      In our study we conclude that silencing HIPP inputs in RSC perturbs cognition enough to impair behaviour while not disabling the animal entirely, as such allowing for behaviour to proceed, and for our observation of graded, decreased (not absent), proficiency under optogenetic silencing. So rather than weak, we would say the results are statistically significant, and biologically realistic.

      Additionally, no mechanistic investigation was conducted beyond referencing previous reports to interpret the core behavioral phenotypes.

      We fully agree with this being a weakness, as we wish we could have done more mechanistic studies to find out exactly what Arch activation is doing to HIPP-RSC transmission, which neurons are being affected, and perhaps in the future dissect its circuit determinants. We have all these goals very present and hope we can address them soon.

      Reviewer #2 (Public Review):

      The authors examine the impact of optogenetic inhibition of hippocampal axon terminals in the retrosplenial cortex (RSP) during the performance of a working memory T-maze task. Performance on a delayed non-match-to-place task was impaired by such inhibition. The authors also report that inhibition is associated with faster decision-making and that the effects of inhibition can be observed over several subsequent trials. The work seems reasonably well done and the role of hippocampal projections to retrosplenial cortex in memory and decision-making is very relevant to multiple fields. However, the work should be expanded in several ways before one can make firm conclusions on the role of this projection in memory and behavior.

      We thank Reviewer#2 for carefully and critically reading our manuscript, and for the valuable comments provided.

      (1) The work is very singular in its message and the experimentation. Further, the impact of the inhibition on behaviour is very moderate. In this sense, the results do not support the conclusion that the hippocampal projection to retrosplenial cortex is key to working memory in a navigational setting.

      As we have mentioned in response to Reviewer#1, the judged “very moderate” effect stems from a perspective according to which a quantitatively lower effect bears less biological significance for cognition, precluding its consideration as “key” for behaviour. We disagree with this perspective and find it rather reductive for several reasons. Once seen in the context of the animal’s ecology, quantitatively lower impairments in working memory are no less key for this cognitive capacity, and can be life-threatening precisely because of their subtlety, leading the animal to confidently rely on a defective capacity, for such events as remembering the habitual location of a predator, or food source. Furthermore, studies in animal cognition often undertake complete, rather than graded, suppression of a given mechanism (in the same sense as “knocking out” a gene that is relevant for behaviour), leading to a gravely, rather than gradually, impaired model system, to the point of not allowing a hypothetical causal link to be mechanistically revealed beyond its mere presence. This often hinders a thorough interpretation of its role.

      In our study we conclude that silencing HIPP inputs in RSC perturbs cognition enough to impair behaviour while not disabling the animal entirely, as such allowing for behaviour to proceed, and for our observation of graded, decreased (not absent), proficiency under optogenetic silencing. So rather than weak, we would say the results are statistically significant, and biologically realistic.

      (2) There are no experiments examining other types of behavior or working memory. Given that the animals used in the studies could be put through a large number of different tasks, this is surprising. There is no control navigational task. There is no working memory test that is non-spatial. Such results should be presented in order to put the main finding in context.

      It is hard to gainsay this point. The more thorough and complete a behavioural characterization is, the more informative is the study, from every angle you look at it. While we agree that other forms of WM would be quite interesting in this context, we also cannot ignore the fact that DNMP is widely tested as a WM task, one that is biologically plausible, sensitive to perturbations of neural circuitry known to be at play therein, and fully accepted in the field. Faced with the impossibility of running further studies, for lack of additional funding and human resources, we chose to run this task.

      A control navigational task would, in our understanding, be used to assess whether silencing HIPP projections to RSC would affect (spatial?) navigation, rather than WM, thus explaining the observed impairment. To this we have the following to say: Spatial Navigation is a very basic cognitive function, one that relies on body orientation relative to spatial context, on keeping an updated representation of such spatial context, (“alas”, as memory), and on guiding behaviour according to acquired knowledge about spatial context. Some of these functions are integral to spatial working memory, as such, they might indeed be affected.

      Dissecting the determinants of spatial WM is indeed an ongoing effort, one that was not the intention of the current study, but also one that we have very present, in hope we can address in the future.

      A non-spatial WM task would indeed vastly solidify our claims beyond spatial WM, onto WM. We have, for this reason, changed the title of the manuscript which now reads “spatial working memory”.

      (3) The actual impact of the inhibition on activity in RSP is not provided. While this may not be strictly necessary, it is relevant that the hippocampal projection to RSP includes, and is perhaps dominated by inhibitory inputs. I wonder why the authors chose to manipulate hippocampal inputs to RSP when the subiculum stands as a much stronger source of afferents to RSP and has been shown to exhibit spatial and directional tuning of activity. The points here are that we cannot be sure what the manipulation is really accomplishing in terms of inhibiting RSP activity (perhaps this explains the moderate impact on behavior) and that the effect of inhibiting hippocampal inputs is not an effective means by which to study how RSP is responsive to inputs that reflect environmental locations.

      We fully agree that neural recordings addressing the effect of silencing on RSC neural activity are relevant. We do wish we could have provided more mechanistic studies, to find out exactly what Arch activation is doing to HIPP-RSC transmission, which neurons are being affected, and thus dissect its circuit determinants. We have all these goals very present and hope we can address them soon. Subiculum, which we mention in the Introduction, is indeed a key player in this complex circuitry, one whose hypothetical influence is the subject of experimental studies which will certainly reveal many other key elements.

      (4) The impact of inhibition on trials subsequent to the trial during which optical stimulation was actually supplied seems trivial. The authors themselves point to evidence that activation of the hyperpolarizing proton pump is rather long-lasting in its action. Further, each sample-test trial pairing is independent of the prior or subsequent trials. This finding is presented as a major finding of the work, but would normally be relegated to supplemental data as an expected outcome given the dynamics of the pump when activated.

      We disagree that this finding is “trivial”, and object to the considerations of “normalcy”, which we are left wondering about.

      In lack of neurophysiological experiments (for the reasons stated above) to address this interesting finding, we chose to interpret it in light of (the few) published observations, such being the logical course of action in scientific reporting, given the present circumstances.

      Evidence for such a prolonged effect in the context of behaviour is scarce (to our knowledge only the one we cite in the manuscript). As such, it is highly relevant to report it, and give it the relevance we do in our manuscript, rather than “relegating it to supplementary data”, as the reviewer considers being “normal”.

      In the DNMP task the consecutive sample-test pairs are explicitly not independent, as they are part of the same behavioural session. This is illustrated by the simple phenomenon of learning, namely the intra-session learning curves, and the well-known behavioral trial-history effects. The brain does not simply erase such information during the ITI.

      (5) In the middle of the first paragraph of the discussion, the authors make reference to work showing RSP responses to "contextual information in egocentric and allocentric reference frames". The citations here are clearly deficient. How is the Nitzan 2020 paper at all relevant here?

      Nitzan 2020 reports the propagation of information from HIPP to CTX via SUB and RSC, thus providing a conduit for mnemonic information between the two structures, alternative to the one we target, thus providing thorough information concerning the HIPP-RSC circuitry at play during behaviour.

      Alexander and Nitz 2015 precisely cite the encoding, and conjunction, of two types of contextual information, internal (ego-) and external (allocentric).

      The subsequent reference is indeed superfluous here.

      We thank Reviewer#2 for calling our attention to the fact that references for this information are inadequate and lacking. We have now cited (Gill et al., 2011; Miller et al., 2019; Vedder et al., 2017) and refer readers to the review (Alexander et al., 2023) for the purpose of illustrating the encoding of information in the two reference frames. In addition, we have substantially edited the Introduction and Discussion sections, and suppressed unnecessary passages.

      (6) The manuscript is deficient in referencing and discussing data from the Smith laboratory that is similar. The discussion reads mainly like a repeat of the results section.

      Please see above. We thank Reviewer#2 for this comment; we have now re-written the Discussion such that it is less of a summary of the Results and more focused on their implications and future directions.

      Response to recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major

      Line 101: Even with the tapered lambda fibre optic stub, if the fibre optics were longitudinally staggered by 2 millimetres, they would deliver light to diagonal regions in the horizontal plane rather than covering the full length of the RSC. Is this staggering pattern randomized or fixed? Additionally, Figure 1C is a bit misleading, as the light distribution pattern from the tapered fibre optic is likely to be more concentrated near the surface of the fibre, rather than spreading widely in a large spherical pattern.

      The staggering is fixed. The elliptical (not spherical) contour in Fig 1C is not meant to convey any quantitative information, but rather to visually orient the reader towards the directions into which light will likely propagate, the effects of which we do not attempt to estimate here. We have made this contour smaller.

      Line 119: The authors demonstrate the viral expression pattern of a representative animal and the overall expression patterns of all other animals in Figure 1 and the Supplementary Figures. However, numerous cases in the Supplementary Figures exhibit viral leakages and strong expressions in adjacent cortical and thalamic areas. Although there is a magnified view of the RSC's expression pattern in Figure 1, authors should show the same way in the supplemental data as well. Additionally, the degree of viral expression in the hippocampal subregions varies substantially across animals. This variation is concerning and impacts the interpretation of the results.

      The viral construct was injected in the HIPP at coordinates based on our previous work (Ferreira-Fernandes et al., 2019) wherein injections of a similar vector in mid-dorsal HIPP resulted in widespread expression throughout the medial mesocortex AP extent, RSC through CG, as well as other areas in which HIPP establishes synapses. These were studied in detail then, by estimating the density of axon terminals. In the present work we did not acquire high-mag images of all slices, since they were too expensive, and we had this information from the study above. Still, we have now added further examples of high-mag images taken from eArch and CTRL animals.

      We believe it is important here to mention the fact that the virus we use, AAV5, only travels anterograde and is static (i.e. it does not travel transsynaptically).

      Variations in viral expression are to be expected even if injections happen in the exact same way. It is crucial then, that fibre positioning is constant across animals, to guarantee that its relationship with viral expression is thence consistent, and to render irrelevant whatever off-target expression of the viral construct. We have ascertained this condition post-mortem in all our animals.

      Line 124: Another point regarding the viral expressions and optical fibre implants used to inhibit the HIPP-RSC pathway is that the RSC and HIPP extend substantially along the anterior-posterior axis. The authors should demonstrate how the viral expression is distributed along this axis and indicate where the tip of the tapered optical fibre ended by marking it in the histological images. This information is crucial to confirm the authors' claim that the hippocampal projection terminals were indeed modulated by optical light. Also, the manuscript would benefit from details about the power/duration and/or modulation of the light used.

      In both Figures 1 and S1 panels we can clearly see the tracks formed by the fibres. This provides examples of such dual angle placement vis a vis the expression of the construct, demonstrating that the former is fully targeted towards the latter. We have added markers to highlight these tracks and an example of a “full” track in figure S1. We did not have animals deviating from this relative positioning to any significant extent. The methods section mentions illumination power as 240mA, and we have now added estimated illumination time as well.

      Line 141: The authors should include data on task performance during learning and baseline sessions for each animal, to demonstrate that they fully grasped the task rules and that achieving a 75% performance ratio was sufficient.

      DNMP is a standard WM task used for many decades, in which animals reach performances above 75% in 4-8 sessions. We have used it extensively, and never saw any deviations from this learning rate and curve. We ran daily sessions until animals reached 75%, and thereafter until they maintained this performance, or above, for three consecutive sessions (the data points we show). We saw no deviations from what is published, nor from what is our own extensive experience, and thence are fully confident that all animals included in this manuscript grasped task rules.

      Line 146: While the study focused on inhibiting inputs during the test run (retrieval phase), it would be beneficial to also inhibit inputs during the sample run (encoding phase) and the delay period. This would help confirm whether the silencing affects only working memory retrieval, or if it also impacts encoding and maintenance.

      We agree, it would be very interesting to determine if there are any effects of silencing HIPP RSC terminals during Sample. However, since there is a limit to the number of trials per session, and to the total number of sessions, we could not run the three manipulations within each session of our experimental design, as that would lower the number of trials per condition to an extent that would affect statistical power. Silencing HIPP RSC terminals during Sample would best be a separate experiment, asking a different question, and perhaps within an experimental design distinct from the one envisioned.

      A very important point here relates to the fact that the effects of optogenetic manipulation do not limit themselves to the illumination epoch, in fact they extend far beyond onto the 3rd trial post-illumination. The insertion of Sample-illuminated trials interleaved in the same session would fundamentally affect the interpretation of experimental results, as we could not attribute lower performances to the effects in either or both manipulated epochs.

      Line 225: Figure 5 illustrates that silencing the inputs results in an extended impairment of working memory performance. However, it's unclear if there are any behavioural changes during the sample run. The inhibition could potentially affect encoding in the subsequent sample run, considering the inter-trial interval (ITI) is only 20 seconds.

      From the observation of behaviour and the analysis of our data, we saw no overt “behavioural changes during the sample run”, as latencies and speeds were essentially unchanged.

      If what is meant by your comment is the effect of optogenetic manipulation being protracted from the Test towards the Sample epoch, we find this unlikely. Conservatively, we estimate the peak of our optogenetic manipulation to occur around the time light is delivered, the Test phase, rather than 20-30 secs later.

      In theory, any effect of optogenetic silencing of HIPP terminals in RSC can cause disturbances in encoding or Sample, the ITI itself, and the epoch in which mnemonic information retrieved from the Sample epoch is confronted with the contextual information present during Test, leading to a decision. This is regardless of the illumination epoch, and even if the effect of optogenetic manipulation is not prolonged in time. 

      Since in our experiments we specifically target the Test epoch, and there is, in all likelihood, a decaying magnitude of neurophysiological effects, manifest in the reported decaying nature of the manipulation mechanism, and in our observed decrease of behavioural proficiency from subsequent trials 1:4, we are convinced that a conservative interpretation is that our major effect is concentrated in the epoch in which we deliver light - the Test epoch, the consequences of which (possibly related to short term plasticity events taking place within the HIPP-RSC neural circuit) extending further in time.

      Line 410: The methods section on the surgical procedure could be clearer, particularly regarding the coordinates for microinjection and fibre implantation. A more precise description would aid reader comprehension.

      The now-reported injection and implantation coordinates include the numbers corresponding to the distances, in mm, from Bregma to the targets, in the three stereotaxic dimensions considered: antero-posterior, medial-lateral left and right, and dorso-ventral, as well as the angle at which the fibres were positioned. We have added labels to the figures to highlight the fibreoptic track locations. We will be happy to provide further details as deemed necessary.

      Line 461: It would be helpful to know if each animal displayed a preference for the left or right side. Including a description or figure showing that the performance ratio exceeded 75% in both left and right trials would provide a more comprehensive understanding of the animals' behaviour.

      In the DNMP, an extensively used and documented WM task, it is an absolute pre-condition that no animals are biased to either side. As such, we did not use any animal that showed such bias. We have not observed this to be the case in any of our candidate animals, nor would we use any animal exhibiting such a preference.

      Minor

      Line 25: In the INTRODUCTION section, the authors introduce ego-centric and allocentric variables in the RSC. However, if they intend to discuss this feature, there is no supporting data for ego-centric or allocentric variables in the Results section.

      We agree. The extent of the discussion of ego vs allo-centric variables in our manuscript might venture a bit out of the main subject. It was included to provide wider context to our reporting of the data, considering that spatial working memory is indeed one instance in which egocentric- and allocentric-referenced cognitive mechanisms confront each other, and one in which silencing the HIPP input to a cortical region thence involved would likely disturb ensuing computations. We have now substantially edited the manuscript’s Introduction and Discussion sections, notably toning down this aspect.

      Line 125: In the section title, DNMT -> DNMP obviously.

      We have corrected this passage.

      Figures: The quality of the figure panels does not meet the expected standards. For example, scale bars are missing in many panels (e.g., Figure 1A bottom, 1B, 1C, S1), figure labels are misaligned (as seen in Figure 3A-B compared to 3C, same with Figure 5), and there is inconsistency in color schemes (e.g., Figure 3C versus Figure 6, where 'Error' versus 'Correct' is depicted using green versus blue, respectively).

      We have now corrected these inconsistencies and mistakes.

    2. eLife assessment

      The authors report that optogenetic inhibition of hippocampal axon terminals in retrosplenial cortex impairs the performance of a delayed non-match to place task. Elucidating the role of hippocampal projections to the retrosplenial cortex in memory and decision-making behaviors is important. However, the strength of evidence for the paper's claims is incomplete.

    3. Reviewer #2 (Public Review):

      The authors examine the impact of optogenetic inhibition of hippocampal axon terminals in the retrosplenial cortex (RSP) during the performance of a working memory T-maze task. Performance on a delayed non-match-to-place task was impaired by such inhibition. The authors also report that inhibition is associated with faster decision-making and that the effects of inhibition can be observed over several subsequent trials. The work seems reasonably well done and the role of hippocampal projections to retrosplenial cortex in memory and decision-making is very relevant to multiple fields. However, the work should be expanded in several ways before one can make firm conclusions on the role of this projection in memory and behavior.

      Comments on revised version:

      The authors have provided their comments on the concerns voiced in my first review. I remain of the opinion that the experiments do not extend beyond determining whether disruption of hippocampal to retrosplenial cortex connections impacts spatial working memory. Given the restricted level of inquiry and the very moderate effect of the manipulation on memory, the work, in my opinion, does not provide significant insight into the processes of spatial working memory nor the function of the hippocampal to retrosplenial cortex connection.

    1. eLife assessment

      The paper reports the important discovery that the mouse dorsal inferior colliculus, an auditory midbrain area, encodes sound location. The evidence supporting the claims is solid, although how the encoding of sound source position in this area relates to localization behaviors in engaged mice remains unclear. The observations described should be of interest to auditory researchers studying the neural mechanisms of sound localization.

    2. Reviewer #1 (Public Review):

      Summary: In this study, the authors address whether the dorsal nucleus of the inferior colliculus (DCIC) in mice encodes sound source location within the front horizontal plane (i.e., azimuth). They do this using volumetric two-photon Ca2+ imaging and high-density silicon probes (Neuropixels) to collect single-unit data. Such recordings are beneficial because they allow large populations of simultaneous neural data to be collected. Their main results and the claims about those results are the following:

      1) DCIC single-unit responses have high trial-to-trial variability (i.e., neural noise);

      2) approximately 32% to 40% of DCIC single units have responses that are sensitive to sound source azimuth;

      3) single-trial population responses (i.e., the joint response across all sampled single units in an animal) encode sound source azimuth "effectively" (as stated in title) in that localization decoding error matches average mouse discrimination thresholds;

      4) DCIC can encode sound source azimuth in a similar format to that in the central nucleus of the inferior colliculus (as stated in Abstract);

      5) evidence of noise correlation between pairs of neurons exists;

      and 6) noise correlations between responses of neurons help reduce population decoding error.

      While simultaneous recordings are not necessary to demonstrate results #1, #2, and #4, they are necessary to demonstrate results #3, #5, and #6.

      Strengths:

      - Important research question to all researchers interested in sensory coding in the nervous system.
      - State-of-the-art data collection: volumetric two-photon Ca2+ imaging and extracellular recording using high-density probes. Large neuronal data sets.
      - Confirmation of imaging results (lower temporal resolution) with more traditional microelectrode results (higher temporal resolution).
      - Clear and appropriate explanation of surgical and electrophysiological methods. I cannot comment on the appropriateness of the imaging methods.

      Strength of evidence for claims of the study:

      1) DCIC single-unit responses have high trial-to-trial variability -
      The authors' data clearly show this.

      2) Approximately 32% to 40% of DCIC single units have responses that are sensitive to sound source azimuth -
      The sensitivity of each neuron's response to sound source azimuth was tested with a Kruskal-Wallis test, which is appropriate since response distributions were not normal. Using this statistical test, only 8% of neurons (median for imaging data) were found to be sensitive to azimuth, and the authors noted this was not significantly different than the false positive rate. The Kruskal-Wallis test was not performed on electrophysiological data. The authors suggested that low numbers of azimuth-sensitive units resulting from the statistical analysis may be due to the combination of high neural noise and relatively low number of trials, which would reduce statistical power of the test. This may be true, but if single-unit responses were moderately or strongly sensitive to azimuth, one would expect them to pass the test even with relatively low statistical power. At best, if their statistical test missed some azimuth-sensitive units, they were likely only weakly sensitive to azimuth. The authors went on to perform a second test of azimuth sensitivity (a chi-squared test) and found 32% (imaging) and 40% (e-phys) of single units to have statistically significant sensitivity. This feels a bit like fishing for a lower p-value. The Kruskal-Wallis test should have been left as the only analysis. Moreover, the use of a chi-squared test is questionable because it is meant to be used between two categorical variables, and neural response had to be binned before applying the test.
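      To make the rank-based test discussed in point 2 concrete, the following is a minimal sketch of a Kruskal-Wallis H statistic with a permutation p-value for one unit's responses grouped by azimuth. It is not the authors' analysis code: the function names, the permutation approach (used here instead of the chi-squared approximation of the H distribution), and the toy data are all assumptions made for illustration, and the tie correction factor is omitted.

```python
import random
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic (tie correction factor omitted)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        total += rank_sum ** 2 / len(g)
        pos += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def permutation_p_value(groups: Sequence[Sequence[float]],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """Estimate p as the share of label shuffles with H >= the observed H."""
    rng = random.Random(seed)
    observed = kruskal_h(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, pos = [], 0
        for s in sizes:
            shuffled.append(pooled[pos:pos + s])
            pos += s
        if kruskal_h(shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy responses of one hypothetical unit at three azimuths (not real data).
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_value = kruskal_h(toy_groups)
p_value = permutation_p_value(toy_groups)
```

      Because the test operates on ranks of the raw responses, it needs no binning step, which is the contrast with the chi-squared analysis raised above.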

      3) Single-trial population responses encode sound source azimuth "effectively" in that localization decoding error matches average mouse discrimination thresholds -
      If only one neuron in a population had responses that were sensitive to azimuth, we would expect that decoding azimuth from observation of that one neuron's response would perform better than chance. By observing the responses of more than one neuron (if more than one were sensitive to azimuth), we would expect performance to increase. The authors found that decoding from the whole population response was no better than chance. They argue (reasonably) that this is because of overfitting of the decoder model (too few trials used to fit too many parameters) and provide evidence from decoding combined with principal components analysis which suggests that overfitting is occurring. What is troubling is the performance of the decoder when using only a handful of "top-ranked" neurons (in terms of azimuth sensitivity) (Fig. 4F and G). Decoder performance seems to increase when going from one to two neurons, then decreases when going from two to three neurons, and doesn't get much better for more neurons than for one neuron alone. It seems likely there is more information about azimuth in the population response, but decoder performance is not able to capture it because spike count distributions in the decoder model are not being accurately estimated due to too few stimulus trials (14, on average). In other words, it seems likely that decoder performance is underestimating the ability of the DCIC population to encode sound source azimuth.
      To get a sense of how effective a neural population is at coding a particular stimulus parameter, it is useful to compare population decoder performance to psychophysical performance. Unfortunately, mouse behavioral localization data do not exist. Therefore, the authors compare decoder error to mouse left-right discrimination thresholds published previously by a different lab. However, this comparison is inappropriate because the decoder and the mice were performing different perceptual tasks. The decoder is classifying sound sources to 1 of 13 locations from left to right, whereas the mice were discriminating between left or right sources centered around zero degrees. The errors in these two tasks represent different things. The two data sets may potentially be more accurately compared by extracting information from the confusion matrices of population decoder performance. For example, when the stimulus was at -30 deg, how often did the decoder classify the stimulus to a lefthand azimuth? Likewise, when the stimulus was +30 deg, how often did the decoder classify the stimulus to a righthand azimuth?
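      The confusion-matrix comparison suggested at the end of point 3 could be sketched as follows. This is an illustrative assumption rather than anything taken from the manuscript: the function name, the sign convention (negative azimuth means left), the choice to ignore trials decoded to the midline, and the three-azimuth toy matrix (the study used 13 locations) are all hypothetical.

```python
from typing import Sequence


def same_side_fraction(confusion: Sequence[Sequence[int]],
                       azimuths: Sequence[float],
                       stimulus_azimuth: float) -> float:
    """Fraction of trials at one azimuth that were decoded to the same side.

    confusion[i][j] counts trials with true azimuth azimuths[i] that the
    decoder assigned to azimuths[j]. Trials decoded to 0 deg are ignored,
    because the midline is neither left nor right (an assumption).
    """
    if stimulus_azimuth == 0:
        raise ValueError("a midline stimulus has no left/right side")
    row = confusion[list(azimuths).index(stimulus_azimuth)]
    # A decoded azimuth is on the same side when its sign matches the stimulus.
    same = sum(c for a, c in zip(azimuths, row) if a * stimulus_azimuth > 0)
    opposite = sum(c for a, c in zip(azimuths, row) if a * stimulus_azimuth < 0)
    if same + opposite == 0:
        raise ValueError("no trials were decoded to either side")
    return same / (same + opposite)


# Toy 3-location confusion matrix (rows: true azimuth, columns: decoded).
toy_azimuths = [-30, 0, 30]
toy_confusion = [
    [6, 2, 2],  # true -30 deg
    [3, 4, 3],  # true 0 deg
    [1, 1, 8],  # true +30 deg
]
left_correct = same_side_fraction(toy_confusion, toy_azimuths, -30)
right_correct = same_side_fraction(toy_confusion, toy_azimuths, 30)
```

      Collapsing the decoder output in this way yields a left/right proportion correct per azimuth, which is closer in kind to a behavioural discrimination measure than the mean absolute decoding error.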

      4) DCIC can encode sound source azimuth in a similar format to that in the central nucleus of the inferior colliculus -
      It is unclear what exactly the authors mean by this statement in the Abstract. There are major differences in the encoding of azimuth between the two neighboring brain areas: a large majority of neurons in the CNIC are sensitive to azimuth (and strongly so), whereas the present study shows a minority of azimuth-sensitive neurons in the DCIC. Furthermore, CNIC neurons fire reliably to sound stimuli (low neural noise), whereas the present study shows that DCIC neurons fire more erratically (high neural noise).

      5) Evidence of noise correlation between pairs of neurons exists -
      The authors' data and analyses seem appropriate and sufficient to justify this claim.

      6) Noise correlations between responses of neurons help reduce population decoding error -
      The authors show convincing analysis that performance of their decoder was higher when simultaneously measured responses were tested (which include noise correlation) than when scrambled-trial responses were tested (eliminating noise correlation). This makes it seem likely that noise correlation in the responses improved decoder performance. The authors mention that the naïve Bayesian classifier was used as their decoder for computational efficiency, presumably because it assumes no noise correlation and, therefore, assumes responses of individual neurons are independent of each other across trials to the same stimulus. The use of a decoder that assumes independence seems key here in testing the hypothesis that noise correlation contains information about sound source azimuth. The logic of using this decoder could be more clearly spelled out to the reader. For example, if the null hypothesis is that noise correlations do not carry azimuth information, then a decoder that assumes independence should perform the same whether population responses are simultaneous or scrambled. The authors' analysis showing a difference in performance between these two cases provides evidence against this null hypothesis.
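      The scrambled-trial control described in point 6 can be sketched as below. It is a hedged illustration, not the authors' implementation: the data layout (a dictionary from azimuth to a list of trials, each trial holding one response per neuron) and the function name are assumptions. For every stimulus, each neuron's responses are permuted across trials independently of the other neurons, which keeps every neuron's response distribution per stimulus intact while breaking the trial-by-trial pairing that carries noise correlation.

```python
import random
from typing import Dict, List

Trial = List[float]  # one response value per neuron


def scramble_trials(responses: Dict[float, List[Trial]],
                    seed: int = 0) -> Dict[float, List[Trial]]:
    """Shuffle trial order separately for each neuron within each stimulus."""
    rng = random.Random(seed)
    scrambled: Dict[float, List[Trial]] = {}
    for stimulus, trials in responses.items():
        n_neurons = len(trials[0])
        columns = []
        for neuron in range(n_neurons):
            column = [trial[neuron] for trial in trials]
            rng.shuffle(column)  # independent permutation for this neuron
            columns.append(column)
        # Reassemble pseudo-trials from the independently shuffled columns.
        scrambled[stimulus] = [
            [columns[neuron][k] for neuron in range(n_neurons)]
            for k in range(len(trials))
        ]
    return scrambled


# Toy data: two azimuths, four trials each, two perfectly co-varying neurons.
toy_responses = {
    -30.0: [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]],
    30.0: [[5.0, 50.0], [6.0, 60.0], [7.0, 70.0], [8.0, 80.0]],
}
toy_scrambled = scramble_trials(toy_responses)
```

      A decoder would then be evaluated once on the simultaneous responses and once on the scrambled ones; under the null hypothesis stated above, the two scores should not differ.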

      Minor weakness:<br /> - Most studies of neural encoding of sound source azimuth are done in a noise-free environment, but the experimental setup in the present study had substantial background noise. This complicates comparison of the azimuth tuning results in this study to those of other studies. One is left wondering if azimuth sensitivity would have been greater in the absence of background noise, particularly for the imaging data where the signal was only about 12 dB above the noise. The description of the noise level and signal + noise level in the Methods should be made clearer. Mice hear from about 2.5 - 80 kHz, so it is important to know the noise level within this band as well as specifically within the band overlapping with the signal.

    3. Reviewer #2 (Public Review):

      In the present study, Boffi et al. investigate the manner in which the dorsal cortex of the inferior colliculus (DCIC), an auditory midbrain area, encodes sound location azimuth in awake, passively listening mice. By employing volumetric calcium imaging (scanned temporal focusing or s-TeFo), complemented with high-density electrode electrophysiological recordings (Neuropixels probes), they show that sound-evoked responses are exquisitely noisy, with only a small portion of neurons (units) exhibiting spatial sensitivity. Nevertheless, a naïve Bayesian classifier was able to predict the presented azimuth based on the responses from small populations of these spatially sensitive units. A portion of the spatial information was provided by correlated trial-to-trial response variability between individual units (noise correlations). The study presents a novel characterization of spatial auditory coding in a non-canonical structure, representing a noteworthy contribution specifically to the auditory field and generally to systems neuroscience, due to its implementation of state-of-the-art techniques in an experimentally challenging brain region. However, nuances in the calcium imaging dataset and the naïve Bayesian classifier warrant caution when interpreting some of the results.

      Strengths:<br /> The primary strength of the study lies in its methodological achievements, which allowed the authors to collect a comprehensive and novel dataset. While the DCIC is a dorsal structure, it extends up to a millimetre in depth, making it optically challenging to access in its entirety. It is also more highly myelinated and vascularised compared to e.g., the cerebral cortex, compounding the problem. The authors successfully overcame these challenges and present an impressive volumetric calcium imaging dataset. Furthermore, they corroborated this dataset with electrophysiological recordings, which produced overlapping results. This methodological combination ameliorates the natural concerns that arise from inferring neuronal activity from calcium signals alone, which are in essence an indirect measurement thereof.

      Another strength of the study is its interdisciplinary relevance. For the auditory field, it represents a significant contribution to the question of how auditory space is represented in the mammalian brain. "Space" per se is not mapped onto the basilar membrane of the cochlea and must be computed entirely within the brain. For azimuth, this requires the comparison of minuscule differences in the timing and intensity of sounds arriving at each ear. It is now generally thought that azimuth is initially encoded in two, opposing hemispheric channels, but the extent to which this initial arrangement is maintained throughout the auditory system remains an open question. The authors observe only a slight contralateral bias in their data, suggesting that sound source azimuth in the DCIC is encoded in a more nuanced manner compared to earlier processing stages of the auditory hindbrain. This is interesting, because the DCIC is also known to be an auditory structure that receives more descending inputs from the cortex.

      Systems neuroscience continues to strive for the perfection of imaging novel, less accessible brain regions. Volumetric calcium imaging is a promising emerging technique, allowing the simultaneous measurement of large populations of neurons in three dimensions. But this necessitates corroboration with other methods, such as electrophysiological recordings, which the authors achieve. The dataset moreover highlights the distinctive characteristics of neuronal auditory representations in the brain. Its signals can be exceptionally sparse and noisy, which provides an additional layer of complexity in the processing and analysis of such datasets. This will undoubtedly be useful for future studies of other less accessible structures with sparse responsiveness.

      Weaknesses:<br /> Although the primary finding that small populations of neurons carry enough spatial information for a naïve Bayesian classifier to reasonably decode the presented stimulus is not called into question, certain idiosyncrasies, in particular of the calcium imaging dataset and model, complicate specific interpretations of the model output, and the readership is urged to interpret these aspects of the study's conclusions with caution.

      I remain in favour of volumetric calcium imaging as a suitable technique for the study, but the presently constrained spatial resolution is insufficient to unequivocally identify regions of interest as cell bodies (and are instead referred to as "units" akin to those of electrophysiological recordings). It remains possible that the imaging set is inadvertently influenced by non-somatic structures (including neuropil), which could report neuronal activity differently than cell bodies. Due to the lack of a comprehensive ground-truth comparison in this regard (which to my knowledge is impossible to achieve with current technology), it is difficult to imagine how many such informative units might have been missed because their signals were influenced by spurious, non-somatic signals, which could have subsequently misled the models. The authors reference the original Nature Methods article (Prevedel et al., 2016) throughout the manuscript, presumably in order to avoid having to repeat previously published experimental metrics. But the DCIC is neither the cortex nor hippocampus (for which the method was originally developed) and may not have the same light scattering properties (not to mention neuronal noise levels). Although the corroborative electrophysiology data largely alleviates these concerns for this particular study, the readership should be cognisant of such caveats, in particular those who are interested in implementing the technique for their own research.

      A related technical limitation of the calcium imaging dataset is the relatively low number of trials (14) given the inherently high level of noise (both neuronal and imaging). Volumetric calcium imaging, while offering a uniquely expansive field of view, requires relatively high average excitation laser power (in this case nearly 200 mW), a level of exposure the authors may have wanted to minimise by maintaining a low number of repetitions, but I yield to them to explain. Calcium imaging is also inherently slow, requiring relatively long inter-stimulus intervals (in this case 5 s). This unfortunately renders any model designed to predict a stimulus (in this case sound azimuth) from particularly noisy population neuronal data like these as highly prone to overfitting, to which the authors correctly admit after a model trained on the entire raw dataset failed to perform significantly above chance level. This prompted them to feed the model only with data from neurons with the highest spatial sensitivity. This ultimately produced reasonable performance (and was implemented throughout the rest of the study), but it remains possible that if the model had been fed with more repetitions of imaging data, its performance would have been more stable across the number of units used to train it. (All models trained with imaging data eventually failed to converge.) However, I also see these limitations as an opportunity to improve the technology further, which I reiterate will be generally important for volume imaging of other sparse or noisy calcium signals in the brain.

      Transitioning to the naïve Bayesian classifier itself, I first openly ask the authors to justify their choice of this specific model. There are countless types of classifiers for these data, each with their own pros and cons. Did they actually try other models (such as support vector machines), which ultimately failed? If so, these negative results (even if mentioned en passant) would be extremely valuable to the community, in my view. I ask this specifically because different methods assume correspondingly different statistical properties of the input data, and to my knowledge naïve Bayesian classifiers assume that predictors (neuronal responses) are independent within a class (azimuth). As the authors show that noise correlations are informative in predicting azimuth, I wonder why they chose a model that doesn't take advantage of these statistical regularities. It could be because of technical considerations (they mention computing efficiency), but I am left generally uncertain about the specific logic that was used to guide the authors through their analytical journey.

      That aside, there remain other peculiarities in model performance that warrant further investigation. For example, what spurious features (or lack of informative features) in these additional units prevented the models of imaging data from converging? In an orthogonal question, did the most spatially sensitive units share any detectable tuning features? A different model trained with electrophysiology data in contrast did not collapse in the range of top-ranked units plotted. Did this model collapse at some point after adding enough units, and how well did that correlate with the model for the imaging data? How well did the form (and diversity) of the spatial tuning functions as recorded with electrophysiology resemble their calcium imaging counterparts? These fundamental questions could be addressed with more basic, but transparent analyses of the data (e.g., the diversity of spatial tuning functions of their recorded units across the population). Even if the model extracts features that are not obvious to the human eye in traditional visualisations, I would still find this interesting.

      Finally, the readership is encouraged to interpret certain statements by the authors in the current version conservatively. How the brain ultimately extracts spatial neuronal data for perception is anyone's guess, but it is important to remember that this study only shows that a naïve Bayesian classifier could decode this information, and it remains entirely unclear whether the brain does this as well. For example, the model is able to achieve a prediction error that corresponds to the psychophysical threshold in mice performing a discrimination task (~30°). Although this is an interesting coincidental observation, it does not mean that the two metrics are necessarily related. The authors correctly do not explicitly claim this, but the manner in which the prose flows may lead a non-expert into drawing that conclusion. Moreover, the concept of redundancy (of spatial information carried by units throughout the DCIC) is difficult for me to disentangle. One interpretation of this formulation could be that there are non-overlapping populations of neurons distributed across the DCIC that each could predict azimuth independently of each other, which is unlikely what the authors meant. If the authors meant generally that multiple neurons in the DCIC carry sufficient spatial information, then a single neuron would have been able to predict sound source azimuth, which was not the case. I have the feeling that they actually mean "complementary", but I leave it to the authors to clarify my confusion, should they wish.

      In summary, the present study represents a significant body of work that contributes substantially to the field of spatial auditory coding and systems neuroscience. However, limitations of the imaging dataset and model as applied in the study muddle concrete conclusions about how the DCIC precisely encodes sound source azimuth and, even more so, how this relates to sound localisation in a behaving animal. Nevertheless, it presents a novel and unique dataset, which, regardless of secondary interpretation, corroborates the general notion that auditory space is encoded in an extraordinarily complex manner in the mammalian brain.

    4. Reviewer #3 (Public Review):

      Summary: Boffi and colleagues sought to quantify the single-trial, azimuthal information in the dorsal cortex of the inferior colliculus (DCIC), a relatively understudied subnucleus of the auditory midbrain. They used two complementary recording methods while mice passively listened to sounds at different locations: a large volume but slow sampling calcium-imaging method, and a smaller volume but temporally precise electrophysiology method. They found that neurons in the DCIC were variable in their activity, unreliably responding to sound presentation and responding during inter-sound intervals. Boffi and colleagues used a naïve Bayesian decoder to determine if the DCIC population encoded sound location on a single trial. The decoder failed to classify sound location better than chance when using the raw single-trial population response but performed significantly better than chance when using intermediate principal components of the population response. In line with this, when the most azimuth-dependent neurons were used to decode azimuthal position, the decoder performed equivalently to the azimuthal localization abilities of mice. The top azimuthal units were not clustered in the DCIC, possessed a contralateral bias in response, and were correlated in their variability (e.g., positive noise correlations). Interestingly, when these noise correlations were perturbed by inter-trial shuffling, decoding performance decreased. Although Boffi and colleagues show that azimuthal information can be extracted from DCIC responses, it remains unclear to what degree this information is used and what role noise correlations play in azimuthal encoding.

      Strengths: The authors should be commended for collection of this dataset. When done in isolation (which is typical), calcium imaging and linear array recordings have intrinsic weaknesses. However, those weaknesses are alleviated when done in conjunction with one another - especially when the data largely recapitulates the findings of the other recording methodology. In addition to the video of the head during the calcium imaging, this data set is extremely rich and will be of use to those interested in the information available in the DCIC, an understudied but likely important subnucleus in the auditory midbrain.

      The DCIC neural responses are complex; the units unreliably respond to sound onset, and at the very least respond to some unknown input or internal state (e.g., large inter-sound interval responses). The authors do a decent job in wrangling these complex responses: using interpretable decoders to extract information available from population responses.

      Weaknesses:<br /> The authors observe that neurons with the most azimuthal sensitivity within the DCIC are positively correlated, but they use a naïve Bayesian decoder which assumes independence between units. Although this is a bit strange given their observation that some of the recorded units are correlated, it is unlikely to be a critical flaw. At one point the authors reduce the dimensionality of their data through PCA and use the loadings onto these components in their decoder. PCA incorporates the correlational structure when finding the principal components and constrains these components to be orthogonal and uncorrelated. This should alleviate some of the concern regarding the use of the naïve Bayesian decoder because the projections onto the different components are independent. Nevertheless, the decoding results are a bit strange, likely because there is not much linearly decodable azimuth information in the DCIC responses. Raw population responses failed to provide sufficient information concerning azimuth for the decoder to perform better than chance. Additionally, it only performed better than chance when certain principal components or top ranked units contributed to the decoder but not as more components or units were added. So, although there does appear to be some azimuthal information in the recorded DCIC populations - it is somewhat difficult to extract and likely not an 'effective' encoding of sound localization as their title suggests.
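
      The point about PCA can be checked with a short sketch on synthetic data (the population below and the helper name `pca_scores` are assumptions for illustration, not the authors' analysis). It shows that strongly correlated units yield principal-component scores whose pooled covariance is diagonal. Two caveats apply: uncorrelated is weaker than statistically independent, and the decorrelation holds for the pooled data, not necessarily within each azimuth class, which is where a naïve Bayesian decoder assumes independence.

```python
import numpy as np

def pca_scores(responses):
    """Project mean-centred responses (n_trials, n_units) onto their principal axes."""
    centred = responses - responses.mean(axis=0)
    # Rows of vt are orthonormal principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt.T

# Hypothetical population: 6 units driven by one shared fluctuation plus private noise.
rng = np.random.default_rng(1)
shared = rng.normal(size=(500, 1))
responses = shared + 0.5 * rng.normal(size=(500, 6))

raw_corr = np.corrcoef(responses, rowvar=False)
mean_raw_corr = raw_corr[~np.eye(6, dtype=bool)].mean()

scores = pca_scores(responses)
score_cov = np.cov(scores, rowvar=False)
max_off_diag = np.abs(score_cov - np.diag(np.diag(score_cov))).max()

print(mean_raw_corr)   # strongly positive: units are correlated
print(max_off_diag)    # numerically zero: component scores are uncorrelated
```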

      Although this is quite a worthwhile dataset, the authors present relatively little about the characteristics of the units they've recorded. This may be due to the high variance in responses seen in their population. Nevertheless, the authors note that units do not respond on every trial but do not report what percentage of trials fail to evoke a response. Is it that neurons are noisy because they do not respond on every trial or is it also that when they do respond they have variable response distributions? It would be nice to gain some insight into the heterogeneity of the responses. Additionally, is there any clustering at all in response profiles or is each neuron they recorded in the DCIC unique? They also only report the noise correlations for their top ranked units, but it is possible that the noise correlations in the rest of the population are different. It would also be worth digging into the noise correlations more - are units positively correlated because they respond together (e.g., if unit x responds on trial 1 so does unit y) or are they also modulated around their mean rates on similar trials (e.g., unit x and y respond and both are responding more than their mean response rate). A large portion of trials with no response can occlude noise correlations. More transparency around the response properties of these populations would be welcome.

      It is largely unclear what the DCIC is encoding. Although the authors are interested in azimuth, sound location seems to be only a small part of DCIC responses. The authors report responses during inter-sound interval and unreliable sound-evoked responses. Although they have video of the head during recording, we only see a correlation to snout and ear movements (which are peculiar since in the example shown it seems the head movements predict the sound presentation). Additional correlates could be eye movements or pupil size. Eye movements are of particular interest due to their known interaction with IC responses - especially if the DCIC encodes sound location in relation to eye position instead of head position (though much of the eye-position-IC work was done in primates and not rodents). Alternatively, much of the population may only encode sound location if an animal is engaged in a localization task. Ideally, the authors could perform more substantive analyses to determine if this population is truly noisy or if the DCIC is integrating un-analyzed signals.

      Although this critique is ubiquitous among decoding papers in the absence of behavioral or causal perturbations, it is unclear what - if any - role the decoded information may play in neuronal computations. The interpretation of the decoder means that there is some extractable information concerning sound azimuth - but not if it is functional. This information may just be epiphenomenal, leaking in from inputs, and not used in computation or relayed to downstream structures. This should be kept in mind when the authors suggest their findings implicate the DCIC functionally in sound localization.

      It is unclear why positive noise correlations amongst similarly tuned neurons would improve decoding. A toy model exploring how positive noise correlations in conjunction with unreliable units that inconsistently respond may anchor these findings in an interpretable way. It seems plausible that inconsistent responses would benefit from strong noise correlations, simply by units responding together. This would predict that shuffling would impair performance because you would then be sampling from trials in which some units respond, and trials in which some units do not respond - and may predict a bimodal performance distribution in which some trials decode well (when the units respond) and others decode poorly (when the units do not respond).
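
      A starting point for such a toy model might look like the sketch below. Everything in it is an assumption made for illustration (two units, a 50% response probability, unit mean evoked response, Gaussian private variability); it does not reproduce the recorded data or include a decoder. It only demonstrates the mechanism suggested above: when unreliable units respond or fail together, a strong positive noise correlation arises purely from the shared response failures, and it disappears when failures are independent.

```python
import numpy as np

def simulate_pair(n_trials, p_respond, shared_gate, rng):
    """Simulate two unreliable units with the same mean evoked response.

    shared_gate=True: both units respond or fail together on each trial.
    shared_gate=False: each unit responds or fails independently.
    """
    if shared_gate:
        gate = rng.random((n_trials, 1)) < p_respond
        gate = np.repeat(gate, 2, axis=1)
    else:
        gate = rng.random((n_trials, 2)) < p_respond
    # Private variability in response size on trials where a unit does respond.
    evoked = 1.0 + 0.3 * rng.normal(size=(n_trials, 2))
    return gate * evoked

rng = np.random.default_rng(2)
together = simulate_pair(5000, 0.5, True, rng)
apart = simulate_pair(5000, 0.5, False, rng)

r_together = np.corrcoef(together, rowvar=False)[0, 1]
r_apart = np.corrcoef(apart, rowvar=False)[0, 1]
print(r_together)  # strongly positive, driven by joint response failures
print(r_apart)     # close to zero
```

      Extending this sketch with azimuth-dependent means and a decoder would allow the predicted bimodal performance distribution to be tested directly.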

      Significance: Boffi and colleagues set out to parse the azimuthal information available in the DCIC on a single trial. They largely accomplish this goal and are able to extract this information when allowing the units that contain more information about sound location to contribute to their decoding (e.g., through PCA or decoding on top unit activity specifically). The dataset will be of value to those interested in the DCIC and also to anyone interested in the role of noise correlations in population coding. Although this work is a first step into parsing the information available in the DCIC, it remains difficult to interpret if/how this azimuthal information is used in localization behaviors of engaged mice.

    1. eLife assessment

      This valuable study provides convincing evidence that mutant hair cells with abnormal, reversed polarity of their hair bundles in mouse otolith organs retain wild-type localization, mechanoelectrical transduction and receptor field of their afferent innervation, leading to mild behavioral dysfunction. It thus demonstrates that the bimodal pattern of afferent nerve projections in this organ is not causally related to the bimodal distribution of hair-bundle orientations, as also confirmed in the zebrafish lateral line. The work will be of interest to scientists interested in the development and function of the vestibular system as well as in planar-cell polarity.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors aim at dissecting the relationship between hair-cell directional mechanosensation and orientation-linked synaptic selectivity, using mice and the zebrafish. They find that Gpr156 mutant animals homogenize the orientation of hair cells without affecting the selectivity of afferent neurons, suggesting that hair-cell orientation is not the feature that determines synaptic selectivity. Therefore, the process of Emx2-dependent synaptic selectivity bifurcates downstream of Gpr156.

      Strengths:

      This is an interesting and solid paper. It solves an interesting problem and establishes a framework for the following studies. That is, to ask what are the putative targets of Emx2 that affect synaptic selectivity.<br /> The quality of the data is generally excellent.

      Weaknesses:

      The feeling is that the advance derived from the results is very limited.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors inquire in particular whether the receptor Gpr156, which is necessary for hair cells to reverse their polarities in the zebrafish lateral line and mammalian otolith organs downstream of the differential expression of the transcription factor Emx2, also controls the mechanosensitive properties of hair cells and ultimately an animal's behavior. This study thoroughly addresses the issue by analyzing the morphology, electrophysiological responses, and afferent connections of hair cells found in different regions of the mammalian utricle and the Ca2+ responses of lateral line neuromasts in both wild-type animals and gpr156 mutants. Although many features of hair cell function are preserved in the mutants (such as development of the mechanosensory organs and the Emx2-dependent, polarity-specific afferent wiring and synaptic pairing), there are a few key changes. In the zebrafish neuromast, the magnitude of responses of all hair cells to water flow resembles that of the wild-type hair cells that respond to flow arriving from the tail. These responses are larger than those observed in hair cells that are sensitive to flow arriving from the head and resemble effects previously observed in Emx2 mutants. The authors note that this behavior suggests that the Emx2-GPR156 signaling axis also impinges on hair cell mechanotransduction. Although mutant mice exhibit normal posture and balance, they display defects in swimming behavior. Moreover, their vestibulo-ocular reflexes are perturbed. The authors note that the gpr156 mutant is a good model to study the role of opposing hair cell polarity in the vestibular system, for the wiring patterns follow the expression patterns of Emx2, even though hair cells are all of the same polarity. This paper excels at describing the effects of gpr156 perturbation in mouse and zebrafish models and will be of interest to those studying the vestibular system, hair cell polarity, and the role of inner-ear organs in animal behavior.

      Strengths:

      The study is exceptional in including not only morphological and immunohistochemical indices of cellular identity but also electrophysiological properties. The mutant hair cells of murine maculæ display essentially normal mechanoelectrical transduction and adaptation (with two or even three kinetic components) as well as normal voltage-activated ionic currents.

    1. eLife assessment:

      This important study investigates the contribution of cytosolic S100A8/A9 to neutrophil migration to inflamed tissues. The authors provide convincing evidence for how the loss of cytosolic S100A8/A9 specifically affects the ability of neutrophils to crawl and subsequently adhere under shear stress. This study will be of interest in fields where inflammation is implicated, such as autoimmunity or sepsis.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Napoli et al., the authors study the intracellular function of cytosolic S100A8/A9, a myeloid cell soluble protein that operates extracellularly as an alarmin, whose intracellular function is not well characterized. Here, the authors utilize state-of-the-art intravital microscopy to demonstrate that adhesion defects observed in cells lacking S100A8/A9 (Mrp14-/-) are not rescued by exogenous S100A8/A9, thus highlighting an intrinsic defect. Based on this result, subsequent efforts were employed to characterize the nature of those adhesion defects.

      Strengths:

      The authors convincingly show that Mrp14-/- neutrophils have normal rolling but defective adhesion caused by impaired CD11b activation (deficient ICAM1 binding). Analysis of cellular spreading (defective in Mrp14-/- cells) is also sound. The manuscript then focuses on selective signaling pathways and calcium measurements. Overall, this is a straightforward study of biologically important proteins and mechanisms.

      Weaknesses:

      Some suggestions are included below to improve this manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      Napoli et al. provide a compelling study showing the importance of cytosolic S100A8/9 in maintaining calcium levels at LFA-1 nanoclusters at the cell membrane, thus allowing the successful crawling and adherence of neutrophils under shear stress. The authors show that cytosolic S100A8/9 is responsible for retaining stable and high concentrations of calcium specifically at LFA-1 nanoclusters upon binding to ICAM-1, and imply that this process aids in facilitating actin polymerisation involved in cell shape and adherence. The authors show early on that S100A8/9 deficient neutrophils fail to extravasate successfully into the tissue, thus suggesting that targeting cytosolic S100A8/9 could be useful in settings of autoimmunity/acute inflammation where neutrophil-induced collateral damage is unwanted.

      Strengths:

      Using multiple complementary methods from imaging to western blotting and flow cytometry, including extracellular supplementation of S100A8/9 in vivo, the authors conclusively prove that a defect in intracellular S100A8/9, rather than extracellular S100A8/9, was responsible for the loss in neutrophil adherence, and pinpoint that S100A8/9 aided in calcium stabilisation and retention at the plasma membrane.

      Weaknesses:

      (1) Extravasation is shown to be a major defect of Mrp14-/- neutrophils, but the Giemsa staining in Figure 1H seems to be quite unspecific to me, as neutrophils were determined by nuclear shape and granularity. It would perhaps have been clearer to use immunofluorescence staining for neutrophils instead, as seen in Supplementary Figure 1A (staining for Ly6G or other markers instead of S100A9).

      (2) The representative image for Mrp14-/- neutrophils used in Figure 4K to demonstrate Ripley's K function seems to be very different from that shown above in Figures 4C and 4F.

      (3) Although the authors have done well to draw a path linking cytosolic S100A8/9 to actin polymerisation and subsequently the arrest and adherence of neutrophils in vitro, the authors can be more explicit with the analysis - for example, is the F-actin co-localized with the LFA-1 nanoclusters? Does S100A8/9 localise to the membrane with LFA-1 upon stimulation? Lastly, I think it would have been very useful to close the loop on the extravasation observation with some in vitro evidence to show that neutrophils fail to extravasate under shear stress.

    1. eLife assessment

      This study presents an important finding on the influence of visual uncertainty and Bayesian cue combination on implicit motor adaptation in young healthy participants, hereby linking perception and action during implicit adaptation. The evidence supporting the claims of the authors is convincing. The normative approach of the proposed PEA model, which combines ideas from separate lines of research, including vision research and motor learning, opens avenues for future developments. This work will be of interest to researchers in sensory cue integration and motor learning.

    2. Reviewer #1 (Public Review):

      I appreciate the normative approach of the PEA model and am eager to examine this model in the future. However, two minor issues remain:

      (1) Clarification on the PReMo Model:

      The authors state, "The PReMo model proposes that this drift comprises two phases: initial proprioceptive recalibration and subsequent visual recalibration." This description could misinterpret the intent of PReMo. According to PReMo, the time course of the reported hand position is merely a read-out of the *perceived hand position* (x_hat in your paper). Early in adaptation, the perceived hand position is biased by the visual cursor (x_hat in the direction of the cursor); towards the end, due to implicit adaptation, x_hat reduces to zero. This is the same as PEA. I recommend that the authors clarify PReMo's intent to avoid confusion.

      Note, however, the observed overshoot of 1 degree in the reported hand position. In the PReMo paper, we hypothesized that this effect is due to the recalibration of the perceived visual target location (inspired by studies showing that vision is also recalibrated by proprioception, but in the opposite direction). If the goal of implicit adaptation is to align the perceived hand position (x_hat) with the perceived target position (t_hat), then there would be an overshoot of x_hat over the actual target position.

      PEA posits a different account for the overshoot. It currently suggests that the reported hand position combines x_hat (which takes x_p as input) with x_p itself. What is the reasoning underlying the *double occurrence* of x_p?

      Three alternatives seem more plausible (and could lead to the same overshooting): 1) increasing x_p's contribution (assuming visual uncertainty increases when the visual cursor is absent during the hand report phase), 2) decreasing sigma_p (assuming that participants pay more attention to the hand during the report phase), 3) it could be that the perceived target position undergoes recalibration in the opposite direction to proprioceptive recalibration. All these options, at least to me, seem equally plausible and testable in the future.
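
      To make the first two alternatives concrete, the sketch below implements standard reliability-weighted combination of independent Gaussian cues, which is the textbook form of Bayesian cue integration. It is not taken from the PEA or PReMo papers: the cue values, the noise levels, and the symbol x_u for the predictive cue are hypothetical and chosen only for illustration. It shows how raising sigma_v, or equivalently lowering sigma_p, shifts the perceived hand position x_hat away from the cursor and towards the proprioceptive cue x_p.

```python
import numpy as np

def integrate_cues(estimates, sigmas):
    """Reliability-weighted combination of independent Gaussian cues.

    Each cue is weighted by its precision (1 / sigma**2).
    Returns the combined estimate and its standard deviation.
    """
    estimates = np.asarray(estimates, dtype=float)
    precisions = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    combined = np.sum(precisions * estimates) / np.sum(precisions)
    combined_sigma = np.sqrt(1.0 / np.sum(precisions))
    return combined, combined_sigma

# Hypothetical values in degrees: cursor at 30, felt hand at 0, predicted hand at 0.
x_v, x_p, x_u = 30.0, 0.0, 0.0
sigma_p, sigma_u = 10.0, 5.0

for sigma_v in (10.0, 20.0, 40.0):
    x_hat, sigma_hat = integrate_cues([x_v, x_p, x_u], [sigma_v, sigma_p, sigma_u])
    # x_hat moves towards x_p as the visual cue becomes less reliable.
    print(sigma_v, round(x_hat, 3), round(sigma_hat, 3))
```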

      (2) Effect of Visual Uncertainty on Error Size:

      I appreciate the authors' response about methodological differences between the cursor cloud used in previous studies and the Gaussian blob used in the current study. However, it is still not clear to me how the authors reconcile previous studies showing that visual uncertainty reduced implicit adaptation for small but not large errors (Tsay et al, 2021; Makino, et al 2023) with the current findings, where visual uncertainty reduced implicit adaptation for large but not small errors.

      Could the authors connect the dots here: I could see that the cursor cloud increases potential overlap with the visual target when the visual error is small, resulting in intrinsic reward-like mechanisms (Kim et al, 2019), which could potentially explain attenuated implicit adaptation for small visual errors. However, why would implicit adaptation in response to large visual errors remain unaffected by the cursor cloud? Note that we did verify that sigma_v is increased in (Tsay et al. 2021), so it is unlikely due to the cloud simply failing as a manipulation of visual uncertainty.

      In addition, we also reasoned that testing individuals with low vision could offer a different test of visual uncertainty (Tsay et al, 2023). The advantage here is that both controls and patients with low vision are provided with the same visual input: a single cursor. Our findings suggest that uncertainty due to low vision also shows reduced implicit adaptation in response to small but not large errors, contrary to the findings in the current paper. Missing in the manuscript is a discussion of why the authors' current findings contradict those previous results.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present the Perceptual Error Adaptation (PEA) model, a computational approach offering a unified explanation for behavioral results that are inconsistent with standard state-space models. Beginning with the conventional state-space framework, the paper introduces two innovative concepts. Firstly, errors are calculated based on the perceived hand position, determined through Bayesian integration of visual, proprioceptive, and predictive cues. Secondly, the model accounts for the eccentricity of vision, proposing that the uncertainty of cursor position increases with distance from the fixation point. This elegantly simple model, with minimal free parameters, effectively explains the observed plateau in motor adaptation under the implicit motor adaptation paradigm using the error-clamp method. Furthermore, the authors experimentally manipulate visual cursor uncertainty, a method established in visuomotor studies, to provide causal evidence. Their results show that the adaptation rate correlates with perturbation sizes and visual noise, uniquely explained by the PEA model and not by previous models. Therefore, the study convincingly demonstrates that implicit motor adaptation is a process of Bayesian cue integration.

      Strengths:

      In the past decade, numerous perplexing results in visuomotor rotation tasks have questioned their underlying mechanisms. Prior models have individually addressed aspects like aiming strategies, motor adaptation plateaus, and sensory recalibration effects. However, a unified model encapsulating these phenomena with a simple computational principle was lacking. This paper addresses this gap with a robust Bayesian integration-based model. Its strength lies in two fundamental assumptions: motor adaptation's influence by visual eccentricity, a well-established vision science concept, and sensory estimation through Bayesian integration. By merging these well-founded principles, the authors elucidate previously incongruent and diverse results with an error-based update model. The incorporation of cursor feedback noise manipulation provides causal evidence for their model. The use of eye-tracking in their experimental design, and the analysis of adaptation studies based on estimated eccentricity, are particularly elegant. This paper makes a significant contribution to visuomotor learning research.

      The authors discussed in the revised version that the proposed model can capture the general implicit motor learning process in addition to the visuomotor rotation task. In the discussion, they emphasize two main principles: the automatic tracking of effector position and the combination of movement cues using Bayesian integration. These principles are suggested as key to understanding and modeling various motor adaptations and skill learning. The proposed model could potentially become a basis for creating new computational models for skill acquisition, especially where current models fall short.

      Weaknesses:

      The proposed model is described as elegant. In this paper, the authors test the model within a limited example condition, demonstrating its relevance to the sensorimotor adaptation mechanisms of the human brain. However, the scope of the model's applicability remains unclear. It has shown the capacity to explain prior data, thereby surpassing previous models that rely on elementary mathematics. To solidify its credibility in the field, the authors must gather more supporting evidence.

    4. Reviewer #3 (Public Review):

      (2.1) Summary

      In this paper, the authors model motor adaptation as a Bayesian process that combines visual uncertainty about the error feedback, uncertainty about proprioceptive sense of hand position, and uncertainty of predicted (=planned) hand movement with a learning and retention rate as used in state space models. The model is built with results from several experiments presented in the paper and is compared with the PReMo model (Tsay, Kim et al., 2022) as well as a cue combination model (Wei & Körding, 2009). The model and experiments demonstrate the role of visual uncertainty about error feedback in implicit adaptation.

      In the introduction, the authors notice that implicit adaptation (as measured in error-clamp based paradigms) does not saturate at larger perturbations, but decreases again (e.g. Moorehead et al., 2017 shows no adaptation at 135° and 175° perturbations). They hypothesized that visual uncertainty about cursor position increases with larger perturbations since the cursor is further from the fixated target. This could decrease importance assigned to visual feedback which could explain lower asymptotes.

      The authors characterize visual uncertainty for 3 rotation sizes in a first experiment, and while this experiment could be improved, it is probably sufficient for the current purposes. Then the authors present a second experiment where adaptation to 7 clamped errors are tested in different groups of participants. The models' visual uncertainty is set using a linear fit to the results from experiment 1, and the remaining 4 parameters are then fit to this second data set. The 4 parameters are 1) proprioceptive uncertainty, 2) uncertainty about the predicted hand position, 3) a learning rate and 4) a retention rate. The authors' Perceptual Error Adaptation model ("PEA") predicts asymptotic levels of implicit adaptation much better than both the PReMo model (Tsay, Kim et al., 2022), which predicts saturated asymptotes, or a causal inference model (Wei & Körding, 2007) which predicts no adaptation for larger rotations. In a third experiment, the authors test their model's predictions about proprioceptive recalibration, but unfortunately compare their data with an unsuitable other data set (Tsay et al. 2020, instead of Tsay et al. 2021). Finally, the authors conduct a fourth experiment where they put their model to the test. They measure implicit adaptation with increased visual uncertainty, by adding blur to the cursor, and the results are again better in line with their model (predicting overall lower adaptation), than with the PReMo model (predicting equal saturation but at larger perturbations) or a causal inference model (predicting equal peak adaptation, but shifted to larger rotations). In particular the model fits for experiment 2 and the results from experiment 4 show that the core idea of the model has merit: increased visual uncertainty about errors dampens implicit adaptation.
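The model structure summarized above can be sketched in a few lines. This is only a minimal illustration, not the authors' implementation: the function names, the sign convention (cursor clamped at +theta, target and predictive cue at 0), the linear form `visual_sigma`, and every parameter value (slope, intercept, sigma_p, sigma_u, A, B) are assumptions chosen to show the qualitative behavior of a state-space update driven by a Bayesian estimate of hand position.

```python
def visual_sigma(theta, slope=0.3, intercept=1.0):
    """Assumed linear growth of visual uncertainty (deg) with clamp size (deg)."""
    return slope * abs(theta) + intercept


def cue_weights(sigma_v, sigma_p, sigma_u):
    """Inverse-variance weights for the visual, proprioceptive and predictive cues."""
    precisions = [1.0 / sigma_v**2, 1.0 / sigma_p**2, 1.0 / sigma_u**2]
    total = sum(precisions)
    return tuple(p / total for p in precisions)


def simulate_clamp(theta, n_trials=500, sigma_p=10.0, sigma_u=5.0, A=0.97, B=0.2):
    """Trial-by-trial hand angle under a clamp of size theta (hypothetical parameters)."""
    w_v, w_p, _w_u = cue_weights(visual_sigma(theta), sigma_p, sigma_u)
    x_p = 0.0  # actual hand angle relative to the target
    hand = []
    for _ in range(n_trials):
        hand.append(x_p)
        # Perceived hand position: the predictive cue sits at the target (0),
        # so it contributes nothing to the weighted sum.
        x_hat = w_v * theta + w_p * x_p
        # State-space update with retention A and learning rate B.
        x_p = A * x_p - B * x_hat
    return hand


def asymptote(theta, sigma_p=10.0, sigma_u=5.0, A=0.97, B=0.2):
    """Closed-form steady state of the update used in simulate_clamp."""
    w_v, w_p, _w_u = cue_weights(visual_sigma(theta), sigma_p, sigma_u)
    return -B * w_v * theta / (1.0 - A + B * w_p)


if __name__ == "__main__":
    for theta in (2.0, 30.0, 90.0):
        print(theta, round(asymptote(theta), 2))
```

With these illustrative values, the asymptote is larger in magnitude for a 30° clamp than for either a 2° or a 90° clamp, because the visual weight falls faster than the clamp size grows once the visual uncertainty scales with eccentricity.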

      (2.2) Strengths

      In this study the authors propose a Perceptual Error Adaptation model ("PEA") and the work combines various ideas from the field of cue combination, Bayesian methods and new data sets, collected in four experiments using various techniques that test very different components of the model. The central component of visual uncertainty is assessed in a first experiment. The model uses 4 other parameters to explain implicit adaptation. These parameters are: 1) a learning and 2) a retention rate, as used in popular state space models and the uncertainty (variance) of 3) predicted and 4) proprioceptive hand position. In particular, the authors observe that asymptotes for implicit learning do not saturate, as claimed before, but decrease again when rotations are very large and that this may have to do with visual uncertainty (e.g. Tsay et al., 2021, J Neurophysiol 125, 12-22). The final experiment confirms predictions of the fitted model about what happens when visual uncertainty is increased (overall decrease of adaptation). By incorporating visual uncertainty depending on retinal eccentricity, the predictions of the PEA model for very large perturbations are notably different from, and better than, the predictions of the two other models it is compared to. That is, the paper provides strong support for the idea that visual uncertainty of errors matters for implicit adaptation.

      (2.3) Weaknesses

      Although the authors don't say this, the "concave" function that shows that adaptation does not saturate for larger rotations has been shown before, including in papers cited in this manuscript.

      The first experiment, measuring visual uncertainty for several rotation sizes in error-clamped paradigms has several shortcomings, but these might not be so large as to invalidate the model or the findings in the rest of the manuscript. There are two main issues we highlight here. First, the data is not presented in units that allow comparison with vision science literature. Second, the 1 second delay between movement endpoint and disappearance of the cursor, and the presentation of the reference marker, may have led to substantial degradation of the visual memory of the cursor endpoint. That is, the experiment could be overestimating the visual uncertainty during implicit adaptation.

      The paper's third experiment relies to a large degree on reproducing patterns found in one particular paper, where the reported hand positions - as a measure of proprioceptive sense of hand position - are given and plotted relative to an ever present visual target, rather than relative to the actual hand position. That is, 1) since participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to, and 2) if the reports are converted to a difference between the real and reported hand position (rather than the difference between the target and the report), those would be on the order of ~20° which is roughly two times larger than any previously reported proprioceptive recalibration, and an order of magnitude larger than what the authors themselves find (1-2°) and what their model predicts. Experiment 3 is perhaps not crucial to the paper, but it nicely provides support for the idea that proprioceptive recalibration can occur with error-clamped feedback.

      Perhaps the largest caveat to the study is that it assumes that people do not look at the only error feedback available to them (and can explicitly suppress learning from it). This was probably true in the experiments used in the manuscript, but unlikely to be the case in most of the cited literature. Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world, such that our brains may not be very good at this. So the question remains to what degree - if any - the ideas behind the model generalize to experiments without fixation control, and more importantly, to real life situations.

    5. Author response:

      The following is the authors’ response to the current reviews.

      eLife assessment

      This study presents an important finding on the influence of visual uncertainty and Bayesian cue combination on implicit motor adaptation in young healthy participants, hereby linking perception and action during implicit adaptation. The evidence supporting the claims of the authors is convincing. The normative approach of the proposed PEA model, which combines ideas from separate lines of research, including vision research and motor learning, opens avenues for future developments. This work will be of interest to researchers in sensory cue integration and motor learning.

      Thank you for the updated assessment. We are also grateful for the insightful and constructive comments from the reviewers, which have helped us improve the manuscript again. We made necessary changes following their comments (trimmed tests, new analysis results, etc) and responded to the comments in a point-by-point fashion below. We hope to publish these responses alongside the public review. Thank you again for fostering the fruitful discussion here.

      Public Reviews:

      Reviewer #1 (Public Review):

      I appreciate the normative approach of the PEA model and am eager to examine this model in the future. However, two minor issues remain:

      (1) Clarification on the PReMo Model:

      The authors state, "The PReMo model proposes that this drift comprises two phases: initial proprioceptive recalibration and subsequent visual recalibration." This description could misinterpret the intent of PReMo. According to PReMo, the time course of the reported hand position is merely a read-out of the *perceived hand position* (x_hat in your paper). Early in adaptation, the perceived hand position is biased by the visual cursor (x_hat in the direction of the cursor); towards the end, due to implicit adaptation, x_hat reduces to zero. This is the same as PEA. I recommend that the authors clarify PReMo's intent to avoid confusion.

      Note, however, the observed overshoot of 1 degree in the reported hand position. In the PReMo paper, we hypothesized that this effect is due to the recalibration of the perceived visual target location (inspired by studies showing that vision is also recalibrated by proprioception, but in the opposite direction). If the goal of implicit adaptation is to align the perceived hand position (x_hat) with the perceived target position (t_hat), then there would be an overshoot of x_hat over the actual target position.

      PEA posits a different account for the overshoot. It currently suggests that the reported hand position combines x_hat (which takes x_p as input) with x_p itself. What is the reasoning underlying the *double occurrence* of x_p?

      Three alternatives seem more plausible (and could lead to the same overshoot): 1) increasing x_p's contribution (assuming visual uncertainty increases when the visual cursor is absent during the hand report phase), 2) decreasing sigma_p (assuming that participants pay more attention to the hand during the report phase), 3) it could be that the perceived target position undergoes recalibration in the opposite direction to proprioceptive recalibration. All these options, at least to me, seem equally plausible and testable in the future.

      For clarification of the PReMo model’s take on Fig4A, we now write:

      “The PReMo model proposes that the initial negative drift reflects a misperceived hand location, which gradually reduces to zero, and the late positive drift reflects the influence of visual calibration of the target (Tsay, Kim, Saxena, et al., 2022). ”

      However, we would like to point out that the PEA model does not predict a zero (perceived hand location) even at the late phase of adaptation: it remains negative, though not as large as during initial adaptation (see Figure 4A, red line). Furthermore, we have not seen any plausible way to use a visually biased target to explain the overshoot of the judged hand location (see below when we address the three alternative hypotheses the reviewer raised).

      We don’t think the “double” use of xp is a problem, simply because there are TWO tasks under investigation when the proprioceptive changes are measured along with adaptation. The first is the reaching adaptation task itself: moving under the influence of the clamped cursor. This task is accompanied by a covert estimation of hand location after the movement (x_hat). Given the robustness of implicit adaptation, this estimation appears mandatory and automatic. The second task is the hand localization task, during which the subject is explicitly asked to judge where the hand is. Here, the perceived hand is based on the two available cues: one is the actual hand location xp, and the other is the influence from the just-finished reaching movement (i.e., x_hat). For Bayesian modeling from a normative perspective, sensory integration is based on the available cues to fulfill the task. For the second task of reporting the hand location, the two cues are xp and x_hat (with a possible effect of the visual target, which is unbiased since it is defined as 0 in model simulation; thus, its presence does not induce any shift effect). xp is used sequentially in this sense. Thus, its dual use is well justified.
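The sequential use of the two cues in the hand-report task can be illustrated with a short sketch. It assumes a standard inverse-variance combination of the perceived hand position from the preceding reach and the actual hand position; the function names, the uncertainty values, and the "early" and "late" input values are hypothetical numbers picked for illustration, not fitted parameters or data from the study.

```python
def combine(cue_a, sigma_a, cue_b, sigma_b):
    """Reliability-weighted (inverse-variance) average of two cues."""
    precision_a = 1.0 / sigma_a**2
    precision_b = 1.0 / sigma_b**2
    w_a = precision_a / (precision_a + precision_b)
    return w_a * cue_a + (1.0 - w_a) * cue_b


def reported_hand(x_hat, x_p, sigma_hat=3.0, sigma_p=10.0):
    """Hand report as a combination of x_hat (from the reach) and x_p (current hand).

    sigma_hat and sigma_p are hypothetical uncertainties, in degrees.
    """
    return combine(x_hat, sigma_hat, x_p, sigma_p)


# Hypothetical early-adaptation state: the hand has barely moved and the
# perceived hand position is pulled toward the (negative) clamped cursor.
early = reported_hand(x_hat=-4.0, x_p=0.0)

# Hypothetical late-adaptation state: the hand has adapted away from the
# cursor while the perceived hand position is still slightly negative.
late = reported_hand(x_hat=-0.5, x_p=16.0)

if __name__ == "__main__":
    print(early, late)
```

Under these assumed inputs the report is negative early and positive late, even though x_hat itself never crosses zero; the sign change comes entirely from the second use of xp.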

      Our hypothesis is that the reported hand position results from a combination of x_hat from the previous movement and the current hand position xp. However, specifically for the overshoot of the judged hand location in the late part of the adaptation (Fig4A), the reviewer raised three alternative explanations by assuming that the PReMo model is correct. Under the PReMo model, the estimated hand location is only determined by x_hat, and xp is not used in the hand location report phase. In addition, under that view, x_hat (with xp used once) together with a visual recalibration of the target would explain the gradual shift from negative to positive (overshoot).

      We don’t think any of them can parsimoniously explain our findings here, and we go through these three hypotheses one by one:

      (1) increasing xp's contribution (assuming visual uncertainty increases when the visual cursor is absent during the hand report phase)

      (2) decreasing σp (assuming that participants pay more attention to the hand during the report phase)

      The first two alternative explanations basically assume that xp has a larger contribution (weighting in Bayesian terms) in the hand location report phase than in the adaptation movement phase, whether due to an increase in visual uncertainty (alternative explanation 1) or a reduction in proprioceptive uncertainty (alternative explanation 2). Thus, we assume that the reviewer suggests that a larger weight for xp can explain why the perceived hand location changes gradually from negative to positive. However, per the PReMo model, a larger weight for xp will only affect x_hat, which is already assumed to change from negative to zero. More weight in x_hat in the hand report phase (compared to the adaptation movement phase) would not explain the change in reported hand location from negative to positive. This is because no matter how much weight xp has, the PReMo model assumes a saturation for the influence of xp on x_hat. Thus x_hat would not exceed zero in late adaptation. Then, the PReMo model would rely on the so-called visual shift of the target to explain the overshoot. This leads us to the third alternative the reviewer raised:

      (3) it could be that the perceived target position undergoes recalibration in the opposite direction to proprioceptive recalibration.

      The PReMo model originally assumed that the perceived target location was biased in order to explain the positive overshoot of the reported hand location. We assume that the reviewer suggests that the perceived target position, which is shifted in the positive direction, also “biases” the perceived hand position. We also assume that the reviewer suggests that the perceived hand location after a clamp trial (x_hat) is zero, and somehow the shifted perceived target position “biases” the reported hand location after a clamp trial. Unfortunately, we did not see any mathematical formulation of this biasing effect in the original paper (Tsay, Kim, Haith, et al., 2022). We are not able to come up with any formulation of this hypothesized biasing effect based on Bayesian cue integration principles. Target and hand are two separate perceived items; how one relates to the other needs justification from a normative perspective when discussing Bayesian models. Note that this is not a problem for our PEA model, in which both cues used are about hand localization: one is x_hat and the other is xp.

      We believe that mathematically formulating the biasing effect (Figure 4A) is non-trivial since the reported hand location changes continuously from negative to positive. Thus, quantitative model predictions, like the ones our PEA model presents here, are needed.

      To rigorously test the possible effect of visual recalibration of the target, there are two things to do: 1) use the psychometric method to measure the biased perception of the target, and 2) re-do Tsay et al. 2020 experiment without the target. For 2), compared to the case with the target, the PEA model would predict a larger overshoot, while the PReMo would predict a smaller overshoot or even zero overshoot. This can be left for future studies.

      (2) Effect of Visual Uncertainty on Error Size:

      I appreciate the authors' response about methodological differences between the cursor cloud used in previous studies and the Gaussian blob used in the current study. However, it is still not clear to me how the authors reconcile previous studies showing that visual uncertainty reduced implicit adaptation for small but not large errors (Tsay et al, 2021; Makino, et al 2023) with the current findings, where visual uncertainty reduced implicit adaptation for large but not small errors.

      Could the authors connect the dots here: I could see that the cursor cloud increases potential overlap with the visual target when the visual error is small, resulting in intrinsic reward-like mechanisms (Kim et al, 2019), which could potentially explain attenuated implicit adaptation for small visual errors. However, why would implicit adaptation in response to large visual errors remain unaffected by the cursor cloud? Note that we did verify that sigma_v is increased in (Tsay et al. 2021), so it is unlikely due to the cloud simply failing as a manipulation of visual uncertainty.

      In addition, we also reasoned that testing individuals with low vision could offer a different test of visual uncertainty (Tsay et al, 2023). The advantage here is that both control and patients with low vision are provided with the same visual input-a single cursor. Our findings suggest that uncertainty due to low vision also shows reduced implicit adaptation in response to small but not large errors, contrary to the findings in the current paper. Missing in the manuscript is a discussion related to why the authors' current findings contradict those of previous results.

      To connect the dots between the two previous studies (Tsay et al., 2021, 2023; note that Makino et al., 2023 is not part of this discussion, since it investigated the weights of multiple cursors rather than the visual uncertainty associated with a cursor cloud):

      First, we want to re-emphasize that using the cursor cloud to manipulate visual uncertainty brings some confounds, making it not ideal for studying visuomotor adaptation. For example, in the error clamp paradigm, the error is defined as angular deviation. The cursor cloud consists of multiple cursors spanning a range of angles, which affects both the sensory uncertainty (the intended outcome) and the sensory estimate of angles (the error estimate, the undesired outcome). In Bayesian terms, the cursor cloud aims to modulate the sigma of a distribution (σv in our model), but it additionally affects the mean of the distribution (µ). This unnecessary confound is neatly avoided by using cursor blurring, which is still a cursor with its center (µ) unchanged from a single cursor. Furthermore, as correctly pointed out in the original paper by Tsay et al., 2020, the cursor cloud often overlaps with the visual target; this "target hit" would affect adaptation, possibly via a reward learning mechanism (Kim et al., 2019). This is a second confound that accompanies the cursor cloud. Yes, the cursor cloud was verified as associated with high visual uncertainty (Tsay et al., 2021); however, this verification was done with a psychophysics method with a clean background, not in the context of a hand reaching toward a target, which is what is needed. Thus, despite the cursor cloud having a sizeable visual uncertainty, our criticisms of it still hold when it is used in error-clamp adaptation.
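The two confounds named above can be made concrete with a small simulation. It is a sketch under stated assumptions rather than a reconstruction of any published procedure: the number of dots, the half-width of the "target hit" zone, and the trial count are hypothetical, and only the 10° sampling standard deviation is taken from the response. The cloud's center is taken to be the mean dot angle, and a "hit" is any dot falling within the assumed zone around the target at 0°.

```python
import random
import statistics


def simulate_cloud(clamp_deg, n_dots=25, spread_sd=10.0,
                   target_halfwidth=2.0, n_trials=2000, seed=0):
    """Return (SD of the cloud-center error in deg, fraction of trials with a target hit).

    n_dots, target_halfwidth and n_trials are hypothetical settings.
    """
    rng = random.Random(seed)
    center_errors = []
    hits = 0
    for _ in range(n_trials):
        # Each dot's angle is sampled around the clamp angle.
        dots = [rng.gauss(clamp_deg, spread_sd) for _ in range(n_dots)]
        # Confound 1: the cloud's center jitters around the nominal clamp angle.
        center_errors.append(statistics.fmean(dots) - clamp_deg)
        # Confound 2: at least one dot lands on the target (at 0 deg).
        if any(abs(d) <= target_halfwidth for d in dots):
            hits += 1
    return statistics.pstdev(center_errors), hits / n_trials


if __name__ == "__main__":
    for clamp in (3.0, 30.0):
        print(clamp, simulate_cloud(clamp))
```

A blurred single cursor, by contrast, would have a center error of exactly zero on every trial in this framing, since only its spread changes. Under the assumed settings, target overlap is far more frequent for the small clamp than for the large one.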

      Second, bearing these confounds of the cursor cloud in mind, we postulate one important factor that has not been considered in any models thus far that might underlie the lack of difference between the single-cursor clamp and the cloud-cursor clamp when the clamp size is large: the cursor cloud might be harder to ignore than a single cursor. For Bayesian sensory integration, the naive model is to consider the relative reliability of cues only. Yes, the cloud is more uncertain in terms of indicating the movement direction than a single cursor. However, given its large spread, it is probably harder to ignore during error-clamp movements. Note that ignoring the clamped cursor is the task instruction, but the large scatter of the cursor cloud is more salient and thus plausibly harder to ignore. This might increase the weighting of the visual cue despite its higher visual uncertainty. This extra confound is arguably minimized by using the blurred cursor as in our Exp4, since the blurred cursor did not increase the visual angle much (Figure 5D; blurred vs single cursor: 3.4mm vs 2.5mm in radius, 3.90° vs 2.87° in spread). In contrast, the visual angle of the dot cloud is at least an order of magnitude larger (cursor cloud vs. single cursor: at least 25° vs. 2.15° in spread, given a 10° standard deviation of random sampling).

      Third, for the low-vision study (Tsay et al., 2023), the patients indeed show reduced implicit adaptation for a 3° clamp (consistent with our PEA model) but intact adaptation for a 30° clamp (not consistent). Though this pattern appears similar to what happens for normal people whose visual uncertainty is upregulated by the cursor cloud (Tsay et al., 2021), we are not completely convinced that the same underlying mechanism governs these two datasets. Low-vision patients indeed have higher visual uncertainty about color, brightness, and object location, but their visual uncertainty about visual motion is still unknown. Due to the difference in impairment among low-vision people (e.g., peripheral or central vision affected) and the different roles of peripheral and central vision in movement planning and control (Sivak & Mackenzie, 1992), the overall effect of visual uncertainty in low-vision people is unclear. The direction of cursor movement that matters for visuomotor rotation here is likely related to visual motion perception. Unfortunately, the original study did not measure this uncertainty in low-vision patients. We believe our Exp1 offers a valid method for this purpose for future studies. More importantly, we should not expect low-vision patients to integrate visual cues in the same way as normal people, given their long-term adaptation to their vision difficulties. Thus, we are conservative about interpreting the seemingly similar findings across the two studies (Tsay et al., 2021, 2023) as revealing the same mechanism.

      A side note: these two previous studies proposed a so-called mis-localization hypothesis, i.e., the cursor cloud was mislocated for small clamp sizes (given its overlap with the target) but not for large clamp sizes. They suggested that the lack of an uncertainty effect at small clamp sizes is due to mislocalization, while the lack of an uncertainty effect at large clamp sizes is because implicit adaptation is not sensitive to uncertainty at large angles. Thus, these two studies admit that the cursor cloud not only upregulates uncertainty but also generates an unwanted effect of so-called “mis-localization” (overlapping with the target). Interestingly, their hypothesis about less sensitivity to visual uncertainty for large clamps is not supported by a model or theory but is merely a re-wording of the experimental results.

      In sum, our current study cannot offer an easy answer to "connect the dots" in the aforementioned two studies due to methodological issues and the special nature of the population. However, for resolving conflicting findings, our study suggests solutions that include using a psychometric test to quantify visual uncertainty for cursor motion (Exp1), a better uncertainty-manipulation method that avoids a couple of confounds (Exp4, blurred cursor), and a falsifiable model. Future work can resolve the differences between studies based on the new insights from the current study.

      Reviewer #2 (Public Review):

      Summary:

      The authors present the Perceptual Error Adaptation (PEA) model, a computational approach offering a unified explanation for behavioral results that are inconsistent with standard state-space models. Beginning with the conventional state-space framework, the paper introduces two innovative concepts. Firstly, errors are calculated based on the perceived hand position, determined through Bayesian integration of visual, proprioceptive, and predictive cues. Secondly, the model accounts for the eccentricity of vision, proposing that the uncertainty of cursor position increases with distance from the fixation point. This elegantly simple model, with minimal free parameters, effectively explains the observed plateau in motor adaptation under the implicit motor adaptation paradigm using the error-clamp method. Furthermore, the authors experimentally manipulate visual cursor uncertainty, a method established in visuomotor studies, to provide causal evidence. Their results show that the adaptation rate correlates with perturbation sizes and visual noise, uniquely explained by the PEA model and not by previous models. Therefore, the study convincingly demonstrates that implicit motor adaptation is a process of Bayesian cue integration.

      Strengths:

      In the past decade, numerous perplexing results in visuomotor rotation tasks have questioned their underlying mechanisms. Prior models have individually addressed aspects like aiming strategies, motor adaptation plateaus, and sensory recalibration effects. However, a unified model encapsulating these phenomena with a simple computational principle was lacking. This paper addresses this gap with a robust Bayesian integration-based model. Its strength lies in two fundamental assumptions: motor adaptation's influence by visual eccentricity, a well-established vision science concept, and sensory estimation through Bayesian integration. By merging these well-founded principles, the authors elucidate previously incongruent and diverse results with an error-based update model. The incorporation of cursor feedback noise manipulation provides causal evidence for their model. The use of eye-tracking in their experimental design, and the analysis of adaptation studies based on estimated eccentricity, are particularly elegant. This paper makes a significant contribution to visuomotor learning research.

      The authors discussed in the revised version that the proposed model can capture the general implicit motor learning process in addition to the visuomotor rotation task. In the discussion, they emphasize two main principles: the automatic tracking of effector position and the combination of movement cues using Bayesian integration. These principles are suggested as key to understanding and modeling various motor adaptations and skill learning. The proposed model could potentially become a basis for creating new computational models for skill acquisition, especially where current models fall short.

      Weaknesses:

      The proposed model is described as elegant. In this paper, the authors test the model within a limited example condition, demonstrating its relevance to the sensorimotor adaptation mechanisms of the human brain. However, the scope of the model's applicability remains unclear. It has shown the capacity to explain prior data, thereby surpassing previous models that rely on elementary mathematics. To solidify its credibility in the field, the authors must gather more supporting evidence.

      Indeed, our model here is based on one particular experimental paradigm, i.e., the error-clamp adaptation. We used it simply because 1) this paradigm is one rare example that implicit motor learning can be isolated in a clean way, and 2) there are a few conflicting findings in the literature for us to explain away by using a unified model.

      For our model’s broad impact, we believe that as long as people need to locate their effectors during motor learning, the general principle laid out here will be applicable. In other words, repetitive movements with a Bayesian cue combination of movement-related cues can underlie the implicit process of various motor learning. To showcase its broad impact, in upcoming studies, we will extend this model to other motor learning paradigms, starting from motor adaptation paradigms that involve both explicit and implicit processes.

      Reviewer #3 (Public Review):

      (2.1) Summary

      In this paper, the authors model motor adaptation as a Bayesian process that combines visual uncertainty about the error feedback, uncertainty about proprioceptive sense of hand position, and uncertainty of predicted (=planned) hand movement with a learning and retention rate as used in state space models. The model is built with results from several experiments presented in the paper and is compared with the PReMo model (Tsay, Kim et al., 2022) as well as a cue combination model (Wei & Körding, 2009). The model and experiments demonstrate the role of visual uncertainty about error feedback in implicit adaptation.

      In the introduction, the authors notice that implicit adaptation (as measured in error-clamp based paradigms) does not saturate at larger perturbations, but decreases again (e.g. Morehead et al., 2017 shows no adaptation at 135° and 175° perturbations). They hypothesized that visual uncertainty about cursor position increases with larger perturbations since the cursor is further from the fixated target. This could decrease importance assigned to visual feedback which could explain lower asymptotes.

      The authors characterize visual uncertainty for 3 rotation sizes in a first experiment, and while this experiment could be improved, it is probably sufficient for the current purposes. Then the authors present a second experiment where adaptation to 7 clamped errors are tested in different groups of participants. The models' visual uncertainty is set using a linear fit to the results from experiment 1, and the remaining 4 parameters are then fit to this second data set. The 4 parameters are 1) proprioceptive uncertainty, 2) uncertainty about the predicted hand position, 3) a learning rate and 4) a retention rate. The authors' Perceptual Error Adaptation model ("PEA") predicts asymptotic levels of implicit adaptation much better than both the PReMo model (Tsay, Kim et al., 2022), which predicts saturated asymptotes, or a causal inference model (Wei & Körding, 2007) which predicts no adaptation for larger rotations. In a third experiment, the authors test their model's predictions about proprioceptive recalibration, but unfortunately compare their data with an unsuitable other data set (Tsay et al. 2020, instead of Tsay et al. 2021). Finally, the authors conduct a fourth experiment where they put their model to the test. They measure implicit adaptation with increased visual uncertainty, by adding blur to the cursor, and the results are again better in line with their model (predicting overall lower adaptation), than with the PReMo model (predicting equal saturation but at larger perturbations) or a causal inference model (predicting equal peak adaptation, but shifted to larger rotations). In particular the model fits for experiment 2 and the results from experiment 4 show that the core idea of the model has merit: increased visual uncertainty about errors dampens implicit adaptation.
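      The model structure summarized above can be sketched in a few lines. This is a minimal illustration of one plausible reading of the description, not the authors' implementation: the perceived hand position is taken as a reliability-weighted average of the visual cue (the clamped cursor), the proprioceptive cue, and the predicted hand position, and a state-space update with a retention and a learning rate acts on that perceived error. The function names, the linear slope and intercept of visual uncertainty, and the retention and learning-rate values are all hypothetical; only the two default sigma values are the model estimates quoted in the authors' response in this document.

```python
def cue_weights(sigma_v, sigma_p, sigma_u):
    """Inverse-variance weights for the visual, proprioceptive and predictive cues."""
    precisions = [1.0 / (s * s) for s in (sigma_v, sigma_p, sigma_u)]
    total = sum(precisions)
    return tuple(p / total for p in precisions)


def visual_sigma(clamp_deg, slope=0.3, intercept=2.0):
    """Assumed linear growth of visual uncertainty with clamp size (deg).

    slope and intercept are hypothetical placeholders, not fitted values.
    """
    return slope * abs(clamp_deg) + intercept


def simulate_pea(clamp_deg, n_trials=500, sigma_p=11.119, sigma_u=5.048,
                 retention=0.97, learning_rate=0.2):
    """Simulate hand angle (deg, target at 0) under a constant error clamp.

    retention and learning_rate are illustrative values only.
    """
    w_v, w_p, w_u = cue_weights(visual_sigma(clamp_deg), sigma_p, sigma_u)
    hand = 0.0
    trajectory = []
    for _ in range(n_trials):
        # The cursor is clamped at clamp_deg; the proprioceptive and
        # predictive cues are both assumed to be centered on the actual hand.
        perceived_hand = w_v * clamp_deg + (w_p + w_u) * hand
        # State-space update driven by the perceived (not the visual) error.
        hand = retention * hand - learning_rate * perceived_hand
        trajectory.append(hand)
    return trajectory


if __name__ == "__main__":
    for clamp in (2.0, 8.0, 30.0, 90.0):
        print(clamp, round(simulate_pea(clamp)[-1], 3))
```

      Under these assumed values the asymptote first grows and then shrinks as the clamp gets larger, because the visual weight falls as visual uncertainty rises. This is the qualitative pattern the review attributes to the model; the numbers themselves carry no empirical meaning.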

      (2.2) Strengths

      In this study the authors propose a Perceptual Error Adaptation model ("PEA") and the work combines various ideas from the field of cue combination, Bayesian methods and new data sets, collected in four experiments using various techniques that test very different components of the model. The central component of visual uncertainty is assessed in a first experiment. The model uses 4 other parameters to explain implicit adaptation. These parameters are: 1) a learning and 2) a retention rate, as used in popular state space models and the uncertainty (variance) of 3) predicted and 4) proprioceptive hand position. In particular, the authors observe that asymptotes for implicit learning do not saturate, as claimed before, but decrease again when rotations are very large and that this may have to do with visual uncertainty (e.g. Tsay et al., 2021, J Neurophysiol 125, 12-22). The final experiment confirms predictions of the fitted model about what happens when visual uncertainty is increased (overall decrease of adaptation). By incorporating visual uncertainty depending on retinal eccentricity, the predictions of the PEA model for very large perturbations are notably different from, and better than, the predictions of the two other models it is compared to. That is, the paper provides strong support for the idea that visual uncertainty of errors matters for implicit adaptation.

      (2.3) Weaknesses

      Although the authors don't say this, the "concave" function that shows that adaptation does not saturate for larger rotations has been shown before, including in papers cited in this manuscript.

      For a proper citation of the “concave” adaptation function: we assume the reviewer is referring to the study by Morehead et al. (2017), which tested large clamp sizes up to 135° and 175°. Unsurprisingly, the 135° and 175° conditions lead to nearly zero adaptation, possibly due to the trivial fact that people cannot even see the moving cursor. We have quoted this seminal study from the very beginning. All other error-clamp studies with a block design emphasized an invariant or saturated implicit adaptation with large rotations (e.g., Kim et al., 2019).

      The first experiment, measuring visual uncertainty for several rotation sizes in error-clamped paradigms has several shortcomings, but these might not be so large as to invalidate the model or the findings in the rest of the manuscript. There are two main issues we highlight here. First, the data is not presented in units that allow comparison with vision science literature. Second, the 1 second delay between movement endpoint and disappearance of the cursor, and the presentation of the reference marker, may have led to substantial degradation of the visual memory of the cursor endpoint. That is, the experiment could be overestimating the visual uncertainty during implicit adaptation.

      For the issues related to visual uncertainty measurement in Exp1:

      First, our visual uncertainty is about cursor motion direction in the display plane, and the measurement in Exp1 has never been done before. Thus, we do not think our data is comparable to any findings in visual science about fovea/peripheral comparison. We quoted Klein and others’ work (Klein & Levi, 1987; Levi et al., 1987) in vision science since their studies showed that the deviation from the fixation is associated with an increase in visual uncertainty. Their study thus inspired us to conduct Exp1 to probe how our concerned visual uncertainty (specifically for visual motion direction) changes with an increasing deviation from the fixation. Any model and its model parameters should be specifically tailored to the task or context it tries to emulate. In our case, motion direction in a center-out-reaching setting is the modeled context, and all the relevant model parameters should be specified in movement angles. This is particularly important since we need to estimate parameters from one experiment to predict behaviors in another experiment.

      Second, based on previous vision studies, the 1 s delay of the reference cursor has minimal impact on the estimate of visual uncertainty. Our Exp1 used a visual paradigm similar to that of White et al. (1992), who showed that delay does not lead to an increase in visual uncertainty over a broad range of values (from 0.2 s to >1 s, see their Figures 5-6).

      These two problems have been addressed in the revised manuscript, with proper citations listed.

      The paper's third experiment relies to a large degree on reproducing patterns found in one particular paper, where the reported hand positions - as a measure of proprioceptive sense of hand position - are given and plotted relative to an ever present visual target, rather than relative to the actual hand position. That is, 1) since participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to, and 2) if the reports are converted to a difference between the real and reported hand position (rather than the difference between the target and the report), those would be on the order of ~20° which is roughly two times larger than any previously reported proprioceptive recalibration, and an order of magnitude larger than what the authors themselves find (1-2°) and what their model predicts. Experiment 3 is perhaps not crucial to the paper, but it nicely provides support for the idea that proprioceptive recalibration can occur with error-clamped feedback.

      Reviewer 3 thinks the Tsay 2020 dataset is not appropriate for our theorization, but we respectfully disagree. For the three points raised here, we would like to elaborate:

      (1) As we addressed in the previous response, the reported hand location in Figure 4A (Tsay et al., 2020) is not from a test of proprioceptive recalibration as conventionally defined. In the revision, we explicitly state that this dataset is not about proprioceptive recalibration and also delete texts that might mislead people to think so (see Results section). Instead, proprioceptive recalibration is measured by passive movement, as in our Exp3 (Figure 4E). For error-clamp adaptation here, "the remembered position of the target" is the target. Clearly, the participants did not report the target position, which is ever-present. Instead, their reported hand location shows an interestingly continuous change with ongoing adaptation.

      (2) Since the Tsay 2020 dataset is not a so-called proprioceptive recalibration, we need not take the difference between the reported location and the actual hand location. Indeed, the difference would be ~20 degrees, but comparing it to the previously reported proprioceptive recalibration is like comparing apples to oranges. In fact, throughout the paper, we refer to the results in Fig 4A as “reported hand location”, not proprioceptive recalibration. The target direction is defined as zero degree thus its presence will not bias the reported hand in the Bayesian cue combination (as this visual cue has a mean value of 0). Using the target as the reference also simplifies our modeling.

      (3) Exp3 is crucial for our study since it shows our model and its simple Bayesian cue combination principle are applicable not only to implicit adaptation but also to proprioceptive measures during adaptation. Furthermore, it reproduced the so-called proprioceptive recalibration and explained it away with the same Bayesian cue combination as the adaptation. We noticed that this field has accumulated an array of findings on proprioceptive changes induced by visuomotor adaptation. However, currently, there is a lack of a computational model to quantitatively explain them. Our study at least made an initial endeavor to model these changes.

      Perhaps the largest caveat to the study is that it assumes that people do not look at the only error feedback available to them (and can explicitly suppress learning from it). This was probably true in the experiments used in the manuscript, but unlikely to be the case in most of the cited literature. Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world, such that our brains may not be very good at this. So the question remains to what degree - if any - the ideas behind the model generalize to experiments without fixation control, and more importantly, to real life situations.

      The largest caveat raised by the reviewer appears to be directed to the error-clamp paradigm in general, not only to our particular study. In essence, this paradigm indeed requires participants to ignore the clamped error; thus, its induced adaptive response can be attributed to implicit adaptation. The original paper that proposed this paradigm (Morehead et al., 2017) has been cited 220 times (According to Google Scholar, at the time of this writing, 06/2024), indicating that the field has viewed this paradigm in a favorable way.

      Furthermore, we agree that this kind of instruction and feedback (invariant clamp) differs from daily life experience, but it does not prevent us from gaining theoretical insights by studying human behaviors under this kind of "artificial" task setting. Think of saccadic adaptation (Deubel, 1987; Kojima et al., 2004): the target jumps while the eye moves towards it, and this somewhat artificial manipulation again makes people adapt implicitly, even though the adaptation itself is a "disastrous" strategy for real-life situations. However, scientists have gained an enormous understanding of motor adaptation from this adaptation that seems counterproductive in real life. Also, think of perceptual learning of task-irrelevant stimuli (Seitz & Watanabe, 2005, 2009): when participants are required to learn to discriminate one type of visual stimuli, the background shows another type of stimuli, which people gradually learn even though they do not even notice its presence. This "implicit" learning can be detrimental to our real life, too, but the paradigm itself has advanced our understanding of the inner workings of the cognitive system.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      L101: There is a typo: (Tsay et al., 2020), 2020) should be corrected to (Tsay et al., 2020).

      Thanks for pointing it out, we corrected this typo.

      L224-228: It would be beneficial to evaluate the validity of the estimated sigma_u and sigma_p based on previous reports.

      We can roughly estimate σu by evaluating the variability of reaching angles during the baseline phase when no perturbation is applied. The standard deviation of the reaching angle in Exp 2 is 5.128° ± 0.190°, which is close to the σu estimated by the model (5.048°). We also used a separate perceptual experiment to test the proprioceptive uncertainty (n = 13, see Figure S6); σp from this experiment is 9.737° ± 5.598°, also close to the σp extracted by the model (11.119°). We added these new analysis results to the final version of the paper.
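      The baseline-variability check described above amounts to a simple computation, sketched here under stated assumptions. The response does not say whether the reported "±" is a standard error or a standard deviation across participants; the sketch assumes a standard error. The function name and the toy angles are hypothetical and are not data from the study.

```python
import statistics


def baseline_motor_sd(per_participant_angles):
    """Return (mean, SEM) of per-participant SDs of baseline reach angles (deg).

    per_participant_angles: one list of baseline reach angles per participant.
    The SEM across participants is an assumption about the reported "±".
    """
    sds = [statistics.stdev(angles) for angles in per_participant_angles]
    mean_sd = statistics.mean(sds)
    sem = statistics.stdev(sds) / len(sds) ** 0.5
    return mean_sd, sem


if __name__ == "__main__":
    # Toy angles for two hypothetical participants, for illustration only.
    toy = [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0]]
    print(baseline_motor_sd(toy))
```

      The resulting group mean is the quantity that would be compared against the model's estimate of σu.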

      L289-298: I found it difficult to understand the update equations of the proprioceptive calibration based on the PEA model. Providing references to the equations or better explanations would be helpful.

      We expanded the process of proprioceptive calibration in Supplementary Text 1 with step-by-step equations and more explanations. 

      Reviewer #3 (Recommendations For The Authors):

      Suggestions (or clarification of previous suggestions) for revisions

      The authors persist in using the Tsay et al 2020 paper despite its many drawbacks, which the authors attempt to address in their reply. But the main drawback is that the results in the 2020 paper are NOT relative to the unseen hand but to the visual target the participants were supposed to move their hand to. If the results were converted so as to be relative to the unseen hand, the localization biases would be over 20 deg in magnitude.

      The PEA simulations are plotted relative to the unseen hand which makes sense. If the authors want to persist using the Tsay 2020 dataset despite any issues, they at least need to make sure that the simulations are mimicking the same change. That is, the data from Tsay 2020 needs to be converted to the same variable used in the current paper.

      If the main objection to using the Tsay 2021 dataset is that the design would lead to forgetting: we found that active localization (or any intervening active movements like no-cursor reaches) does lead to some interference or forgetting (a small reduction in overall magnitude of adaptation), but this is not the case for passive localization; see Ruttle et al., 2021 (data on OSF). This was also just a suggestion; there may of course also be other, more suitable data sets.

      As stated above, changing the reference system is not necessary, nor does it affect our results. Tsay et al 2020 dataset is unique since it shows the gradual change of reported hand location along with error-clamp adaptation. The forgetting (or reduction in proprioceptive bias), even if it exists, would not affect the fitting quality of our model for the Tsay 2020 dataset: if we assume that forgetting is invariant over the adaptation process, the forgetting would only reduce the proprioceptive bias uniformly across trials. This can be accounted for by a smaller weight on . The critical fact is that the model can explain the gradual drift of the proprioceptive judgment of the hand location.

      By the way, Ruttle et al.'s 2021 dataset is not for error-clamp adaptation, and thus we will leave it to test our model extension in the future (after incorporating an explicit process in the model).

      References

      Deubel, H. (1987). Adaptivity of gain and direction in oblique saccades. Eye Movements from Physiology to Cognition. https://www.sciencedirect.com/science/article/pii/B9780444701138500308

      Kim, H. E., Parvin, D. E., & Ivry, R. B. (2019). The influence of task outcome on implicit motor learning. ELife, 8. https://doi.org/10.7554/eLife.39882

      Klein, S. A., & Levi, D. M. (1987). Position sense of the peripheral retina. JOSA A, 4(8), 1543–1553.

      Kojima, Y., Iwamoto, Y., & Yoshida, K. (2004). Memory of learning facilitates saccadic adaptation in the monkey. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 24(34), 7531–7539.

      Levi, D. M., Klein, S. A., & Yap, Y. L. (1987). Positional uncertainty in peripheral and amblyopic vision. Vision Research, 27(4), 581–597.

      Morehead, J. R., Taylor, J. A., Parvin, D. E., & Ivry, R. B. (2017). Characteristics of implicit sensorimotor adaptation revealed by task-irrelevant clamped feedback. Journal of Cognitive Neuroscience, 29(6), 1061–1074.

      Seitz, & Watanabe. (2005). A unified model for perceptual learning. Trends in Cognitive Sciences, 9(7), 329–334.

      Seitz, & Watanabe. (2009). The phenomenon of task-irrelevant perceptual learning. Vision Research, 49(21), 2604–2610.

      Sivak, B., & Mackenzie, C. L. (1992). Chapter 10 The Contributions of Peripheral Vision and Central Vision to Prehension. In L. Proteau & D. Elliott (Eds.), Advances in Psychology (Vol. 85, pp. 233–259). North-Holland.

      Tsay, J. S., Avraham, G., Kim, H. E., Parvin, D. E., Wang, Z., & Ivry, R. B. (2021). The effect of visual uncertainty on implicit motor adaptation. Journal of Neurophysiology, 125(1), 12–22.

      Tsay, J. S., Kim, H. E., Saxena, A., Parvin, D. E., Verstynen, T., & Ivry, R. B. (2022). Dissociable use-dependent processes for volitional goal-directed reaching. Proceedings. Biological Sciences / The Royal Society, 289(1973), 20220415.

      Tsay, J. S., Kim, H., Haith, A. M., & Ivry, R. B. (2022). Understanding implicit sensorimotor adaptation as a process of proprioceptive re-alignment. ELife, 11, e76639.

      Tsay, J. S., Parvin, D. E., & Ivry, R. B. (2020). Continuous reports of sensed hand position during sensorimotor adaptation. Journal of Neurophysiology, 124(4), 1122–1130.

      Tsay, J. S., Tan, S., Chu, M. A., Ivry, R. B., & Cooper, E. A. (2023). Low Vision Impairs Implicit Sensorimotor Adaptation in Response to Small Errors, But Not Large Errors. Journal of Cognitive Neuroscience, 35(4), 736–748.

      White, J. M., Levi, D. M., & Aitsebaomo, A. P. (1992). Spatial localization without visual references. Vision Research, 32(3), 513–526.

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a valuable finding on the influence of visual uncertainty and Bayesian cue combination on implicit motor adaptation in young healthy participants. The evidence supporting the claims of the authors is solid, although a better discussion of the link between the model variables and the outcomes of related behavioral experiments would strengthen the conclusions. The work will be of interest to researchers in sensory cue integration and motor learning.

      Public Reviews:

      Reviewer #1 (Public Review):

      This valuable study demonstrates a novel mechanism by which implicit motor adaptation saturates for large visual errors in a principled normative Bayesian manner. Additionally, the study revealed two notable empirical findings: visual uncertainty increases for larger visual errors in the periphery, and proprioceptive shifts/implicit motor adaptation are non-monotonic, rather than ramp-like. This study is highly relevant for researchers in sensory cue integration and motor learning. However, I find some areas where statistical quantification is incomplete, and the contextualization of previous studies to be puzzling.

      Thank you for your feedback and the positive highlights of our study. We appreciate your insights and will address the concerns in our revisions.

      Issue #1: Contextualization of past studies.

      While I agree that previous studies have focused on how sensory errors drive motor adaptation (e.g., Burge et al., 2008; Wei and Kording, 2009), I don't think the PReMo model was contextualized properly. Indeed, while PReMo should have adopted clearer language - given that proprioception (sensory) and kinaesthesia (perception) have been used interchangeably, something we now make clear in our new study (Tsay, Chandy, et al. 2023) - PReMo's central contribution is that a perceptual error drives implicit adaptation (see Abstract): the mismatch between the felt (perceived) and desired hand position. The current paper overlooks this contribution. I encourage the authors to contextualize PReMo's contribution more clearly throughout. Not mentioned in the current study, for example, PReMo accounts for the continuous changes in perceived hand position in Figure 4 (Figure 7 in the PReMo study).

      There is no doubt that the current study provides important additional constraints on what determines perceived hand position: Firstly, it offers a normative Bayesian perspective in determining perceived hand position. PReMo suggests that perceived hand position is determined by integrating motor predictions with proprioception, then adding a proprioceptive shift; PEA formulates this as the optimal integration of these three inputs. Secondly, PReMo assumed visual uncertainty to remain constant for different visual errors; PEA suggests that visual uncertainty ought to increase (but see Issue #2).

      Thank you for the comments and suggestions. We have now incorporated the citation for (Tsay et al., 2024), to acknowledge their clarification on the terms of perceptual error. We also agree that our model differs in two fundamental ways. One is to ditch the concept of proprioceptive shift and its contribution to the perceived hand location; instead, we resort to a “one-shot” integration of three types of cues with Bayesian rules. This is a more elegant and probably more ecological way of processing hand location per Occam's Razor. The second essential change is to incorporate the dependency of visual uncertainty on perturbation size into the model, as opposed to resorting to a ramp function of proprioceptive changes relative to perturbation size. The ramp function is not well grounded in perception studies. Yes, we acknowledged that PReMo is the first to recognize the importance of perceptual error, but highlighted the model differences in our Discussion.

      We also think the PReMo model has the potential to explain Fig 4A. But the Tsay et al., 2022 paper assumes that “a generic shift in visual space” explains the gradual proprioceptive changes from negative to positive (see page 17 in Tsay et al., 2022). We do not think that evoking this visual mechanism is necessary to explain Fig 4A; instead, the proprioceptive change is a natural result of hand deviations during implicit adaptation. As the hand moves away from the target (in the positive direction) during adaptation, the estimated hand location goes along with it. We believe this is the correct way of explaining the Fig 4A results. As we played around with the PReMo model, we found it hard to use a visual shift to explain this part of the data without additional assumptions (at least not with the ones published in Tsay et al., 2022). Furthermore, our PEA model also parsimoniously explains away the proprioceptive shift observed in a completely different setting, i.e., the proprioceptive changes measured by the passive method as a function of perturbation size in Exp 3.

      We expanded the discussion about the comparison between the two models, especially about their different views for explaining Fig4A.

      Issue #2: Failed replication of previous results on the effect of visual uncertainty.

      (2a) A key finding of this paper is that visual uncertainty linearly increases in the periphery; a constraint crucial for explaining the non-monotonicity in implicit adaptation. One notable methodological deviation from previous studies is the requirement to fixate on the target: Notably, in the current experiments, participants were asked to fixate on the target, a constraint not imposed in previous studies. In a free-viewing environment, visual uncertainty may not attenuate as fast, and hence, implicit adaptation does not attenuate as quickly as that revealed in the current design with larger visual errors. Seems like this current fixation design, while important, needs to be properly contextualized considering how it may not represent most implicit adaptation experiments.

      First, we don’t think there is any previous study that examined visual uncertainty as a function of perturbation size. Thus, we do not have a replication problem here. Secondly, our data indicate that even without asking people to fixate on the target, people still predominantly fixate on the target during error-clamp adaptation (when they are “free” viewing). For our Exp 1, the fixation on the straight line between the starting position and the target is 86%-95% (as shown in Figure S1 now, also see below). We also collected eye-tracking data in Exp 4, which is a typical error-clamp experiment. More than 95% fall within ±50 pixels of the center of the screen, even slightly higher than in Exp 1. This is understandable: the typical error-clamp adaptation requires people to ignore the cursor and move the hand towards the target. To minimize the interference of the concurrently moving cursor, people depend on the fixation on the target, the sole task-relevant visual marker in the workspace, to achieve the task goal.

      In sum, we did not force the participants to fixate on the target in order to make up the linear dependency of visual uncertainty; we required them to do so to mimic the eye-tracking pattern in typical error-clamp learning, which had been revealed in our pilot experiment. The visual uncertainty effect is sound, and our study is the first to clearly demonstrate it.

      Author response image 1.

      On a side note (but an important one), the high percentage of fixation on the aiming target is also true for conventional visuomotor rotation, which involves strategic re-aiming (shown in Bromberg et al., 2019; de Brouwer et al., 2018, we have an upcoming paper to show this). This is one reason that our new theory would also be applicable to other types of motor adaptation.

      (2b) Moreover, the current results - visual uncertainty attenuates implicit adaptation in response to large, but not small, visual errors - deviates from several past studies that have shown that visual uncertainty attenuates implicit adaptation to small, but not large, visual errors (Tsay, Avraham, et al. 2021; Makino, Hayashi, and Nozaki, n.d.; Shyr and Joshi 2023). What do the authors attribute this empirical difference to? Would this free-viewing environment also result in the opposite pattern in the effect of visual uncertainty on implicit adaptation for small and large visual errors?

      We don’t think all the mentioned previous studies manipulated the visual uncertainty in a parametric way, and none of them provided quantitative measures of visual uncertainty. As we detailed in our Exp4 and in our Discussion, we don’t think Tsay et al., 2021 paper’s manipulation of visual uncertainty is appropriate (see below for 2d). Makino et al., 2023 study used multiple clamped cursors to perturb people, and its effect is not easily accountable since additional processes might be invoked given this kind of complex visual feedback. More importantly, we do not think this is a direct way of modulating visual uncertainty, nor did they provide any evidence.

      (2c) In the current study, the measure of visual uncertainty might be inflated by brief presentation times of comparison and referent visual stimuli (only 150 ms; our previous study allowed for a 500 ms viewing time to make sure participants see the comparison stimuli). Relatedly, there are some individuals whose visual uncertainty is greater than 20 degrees standard deviation. This seems very large, and less likely in a free-viewing environment.

      For our 2AFC, the reference stimulus is the actual clamped cursor, which lasts for 800 ms. The comparison stimulus is a 150-ms dot representation appearing near the reference. For measuring perception of visual motion, this duration is sufficient as previous studies used similar durations (Egly & Homa, 1984; Owsley et al., 1995). We think the 20-degree standard deviation is reasonable given that people fixate on the target, with only peripheral vision to process the fast-moving cursor. The steep linear increase in visual uncertainty about visual motion is well documented. The last author of this paper has shown that the uncertainty of visual motion speed (though not about angles) follows the same steep trend (Wei et al., 2010). It is noteworthy that, without using our measured visual uncertainty in Exp1, if we fit the adaptation data in Exp2 to “estimate” the visual uncertainty, the two are in fact well aligned with each other (see Figure S7 and Supplementary Text 2). This strongly supports the validity and accuracy of our estimation. We think this high visual uncertainty is an important message to the field. Thus, we have now highlighted its magnitude in our Discussion.

      (2d) One important confound between clear and uncertain (blurred) visual conditions is the number of cursors on the screen. The number of cursors may have an attenuating effect on implicit adaptation simply due to task-irrelevant attentional demands (Parvin et al. 2022), rather than that of visual uncertainty. Could the authors provide a figure showing these blurred stimuli (gaussian clouds) in the context of the experimental paradigm? Note that we addressed this confound in the past by comparing participants with and without low vision, where only one visual cursor is provided for both groups (Tsay, Tan, et al. 2023).

      Thank you for raising this important point about types of visual stimuli for manipulating uncertainty. We used Gaussian blur of a single cursor (similar to Burge et al., 2008) instead of a cloud of dots. We now added a figure inset to show how this blur looks.

      Using a cursor cloud (Makino et al., 2023; Tsay et al., 2021) to modulate visual uncertainty has inherent drawbacks that make it unsuitable for visuomotor adaptation. In the error-clamp paradigm, the error is defined as an angular deviation. The cursor cloud consists of multiple cursors spanning a range of angles, which affects both the sensory uncertainty (the intended outcome) and the sensory estimate of the angle (the error estimate, an undesired outcome). In Bayesian terms, the cursor cloud aims to modulate the sigma of a distribution (sigma_v in our model), but it additionally affects the mean of the distribution (mu). This unnecessary confound is avoided by blurring the cursor, which remains a single cursor with its center (mu) unchanged. Furthermore, as correctly pointed out in the original paper by Tsay et al., 2021, the cursor cloud often overlaps with the visual target; this “target hit” would affect adaptation, possibly via a reward learning mechanism (see Kim et al., 2019). This is a second confound that accompanies the cursor cloud.
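      One way to read this argument is that the centroid of a finite dot cloud varies from trial to trial, whereas a blurred single cursor keeps its nominal center. The following Python sketch simulates this. The dot count, spread, trial count, and function names are illustrative assumptions, not the stimuli used in any of the cited studies.

```python
import random
import statistics


def cloud_centroids(nominal_angle, spread_sd, n_dots, n_trials, seed=0):
    """Return the centroid (mean angle) of a random dot cloud on each trial."""
    rng = random.Random(seed)
    centroids = []
    for _ in range(n_trials):
        dots = [rng.gauss(nominal_angle, spread_sd) for _ in range(n_dots)]
        centroids.append(statistics.fmean(dots))
    return centroids


def blurred_centers(nominal_angle, n_trials):
    """A blurred single cursor keeps the same center on every trial."""
    return [nominal_angle] * n_trials


# Assumed, illustrative stimulus values (degrees).
cloud = cloud_centroids(nominal_angle=16.0, spread_sd=10.0, n_dots=25, n_trials=200)
blur = blurred_centers(nominal_angle=16.0, n_trials=200)

# Trial-to-trial jitter of the stimulus center (the "mu" in the text).
cloud_jitter = statistics.stdev(cloud)  # on the order of spread_sd / sqrt(n_dots)
blur_jitter = statistics.pstdev(blur)   # zero: the center never moves
```

      In this sketch the cloud changes both the spread and, on any given trial, the apparent center of the feedback, while the blur changes only the spread.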

      Issue #3: More methodological details are needed.

      (3a) It's unclear why, in Figure 4, PEA predicts an overshoot in terms of perceived hand position from the target. In PReMo, we specified a visual shift in the perceived target position, shifted towards the adapted hand position, which may result in overshooting of the perceived hand position with this target position. This visual shift phenomenon has been discovered in previous studies (e.g., (Simani, McGuire, and Sabes 2007)).

      Visual shift, as it is called in Simani et al., 2007, is irrelevant for our task here. The data we are modeling are motor adaptation (hand position changes) and so-called proprioceptive changes (hand localization changes), both of which are measured and referenced in extrinsic coordinates, not relative to a visual target. For instance, the proprioceptive changes are either relative to the actual hand location (Exp 3) or relative to the goal (Fig 4A). We also do not think visual shift is necessary to explain the perceptual judgment of an unseen hand (the target shown during the judgment does reduce the biasing effect of PE; see our responses to Reviewer 3 below).

      In the PEA model, the reported hand angle is the result of integrating cues from the actual hand position and the estimated hand position (x_hand_hat) from previous movements. Depending on the degree of adaptation, this integration can make the combined reported hand position overshoot or undershoot. It is the changed proprioceptive cue (because the actively moved hand slowly adapted to the error clamp) that leads to the overshoot of the perceived hand position.
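      The cue integration described here can be sketched as a standard precision-weighted combination of independent Gaussian cues. This is a minimal illustration, not the authors' released code. The function name and all numerical values are hypothetical, and the example is not meant to reproduce Figure 4.

```python
def combine_cues(means, sds):
    """Inverse-variance (precision) weighted combination of independent Gaussian cues."""
    precisions = [1.0 / sd ** 2 for sd in sds]
    total = sum(precisions)
    weights = [p / total for p in precisions]
    combined_mean = sum(w * m for w, m in zip(weights, means))
    combined_sd = (1.0 / total) ** 0.5
    return combined_mean, combined_sd, weights


# Hypothetical values in degrees, relative to the target direction:
# the hand has adapted to -15 (proprioceptive cue, sd 10), and the hand
# estimate carried over from the preceding trials is x_hand_hat = +2 (sd 5).
reported, reported_sd, weights = combine_cues(means=[-15.0, 2.0], sds=[10.0, 5.0])
```

      With these assumed numbers, the reported position is about -1.4 degrees: it lies between the two cues and is pulled toward the more reliable one, so its sign and size depend on how far the hand has adapted.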

      In Results, we now explain these value changes with parentheses. Model details about the mechanisms of cue combination and model predictions can be found in Supplementary Text 1. We believe these detailed explanations can make this apparent.

      (3b) The extent of implicit adaptation in Experiment 2, especially with smaller errors, is unclear. The implicit adaptation function seems to be still increasing, at least by visual inspection. Can the authors comment on this trend, and relatedly, show individual data points that help the reader appreciate the variability inherent to these data?

      Indeed, adaptation to small errors does not appear to be completely saturated within our designated number of trials. However, this does not affect our model analysis. Our model fitting for PEA and the competing models is done on the time series of adaptation, not on the saturated adaptation extent (see Fig 3A). Thus, even though some conditions might not produce the full range of adaptation, the data are sufficient to constrain the models. We now mention this concern in Results; we also emphasize that the model explains not only the adaptation magnitude (operationally defined as the adaptation extent measured at the same time point, i.e., the end of the adaptation phase) but also the full learning process.

      In response, we have included individual data points in the revised Figure 3B-D to provide a clear illustration of the extent of implicit adaptation, particularly for small perturbations.

      (3c) The same participants were asked to return for multiple days/experiments. Given that the authors acknowledge potential session effects, with attenuation upon re-exposure to the same rotation (Avraham et al. 2021), how does re-exposure affect the current results? Could the authors provide clarity, perhaps a table, to show shared participants between experiments and provide evidence showing how session order may not be impacting results?

      Thank you for raising the issue of session and re-exposure effects. First, we do not think Exp1 has an effect on Exp4. Exp1 is a perceptual task and Exp4 is a motor adaptation task. Furthermore, Exp1 used random visual stimuli on both sides, so it did not lead to any adaptation effect on its own. Second, Exp4 indeed had three sessions performed on three days, but the session effect does not change our main conclusion about visual uncertainty. A 3-way repeated-measures ANOVA (3 day x 3 perturbation x 2 visual uncertainty) revealed a significant main effect of day (F(2,36) = 17.693, p<0.001), indicating changes in performance across sessions (see Figure below). Importantly, the effects of perturbation and visual uncertainty (including their interaction) remain the same, and the day factor did not interact with them. The main effect of day shows that the overall adaptation effect is reduced across days. Post-hoc pairwise comparisons showed that single-trial learning (STL) on Day 1 was significantly higher than on Day 2 (p = 0.004) and Day 3 (p < 0.001), with no significant difference between Day 2 and Day 3 (p = 0.106). Other ANOVA details: there were significant main effects of perturbation (F(1,36) = 8.872, p<0.001) and visual uncertainty (F(1,18) = 49.164, p<0.001), as well as a significant interaction between perturbation size and visual uncertainty (F(2,36) = 5.160, p = 0.013). There were no significant interactions between day and any other factor (all p > 0.182). Thus, overall adaptation decreases over days, but day does not affect the interaction of interest between visual uncertainty and perturbation. The fact that this interaction was preserved across sessions strengthens our conclusion about how visual uncertainty systematically affects implicit adaptation.

      Author response image 2.

      (3d) The number of trials per experiment should be detailed more clearly in the Methods section (e.g., Exp 4). Moreover, could the authors please provide relevant code on how they implemented their computational models? This would aid in future implementation of these models in future work. I, for one, am enthusiastic to build on PEA.

      We have clarified the number of trials conducted in each experiment, with detailed information now readily available in the Methods section of the main text. In addition, we have made the code for data analysis and modeling publicly accessible. These resources can be found in the updated "Data Availability" section of our paper.

      (3f) In addition to predicting a correlation between proprioceptive shift and implicit adaptation on a group level, both PReMo and PEA (but not causal inference) predict a correlation between individual differences in proprioceptive shift and proprioceptive uncertainty with the extent of implicit adaptation (Tsay, Kim, et al. 2021). Interestingly, shift and uncertainty are independent (see Figures 4F and 6C in Tsay et al, 2021). Does PEA also predict independence between shift and uncertainty? It seems like PEA does predict a correlation.

      Thank you for this insightful question. Our PEA model indeed predicts a positive (although not linear) correlation between proprioceptive uncertainty and the amplitude of the estimated hand position (x_hand_hat). This prediction is shown by simulations that use the same parameters that generated the results depicted in Figure 4B of our manuscript (there is a sign flip because x_hand_hat is negative).

      Author response image 3.

      Regarding the absence of a correlation in Tsay et al., 2021, we offer several potential explanations for this discrepancy. First, the variability observed in passive hand localization during motor adaptation (as in Tsay et al., 2021) is not the same as proprioceptive uncertainty, which typically requires psychophysical testing to assess accurately. Second, our study showed that the proprioceptive bias attenuates over repeated measurements; in our Exp3, it decreased within a block of three trials. We noticed that Tsay et al., 2021 used 36 measurements in a row without interleaving adaptation trials. Thus, the “averaged” proprioceptive bias in that study might not reflect the actual bias during adaptation. We also noticed that the study showed large individual differences in both proprioceptive bias and proprioceptive variability (not uncertainty); detecting a positive correlation, if it really exists, would therefore require a large number of participants, probably more than their sample of about 30. We did not include these putative explanations in the revision, which already has a long Discussion with no room for discussing a null result.

      Reviewer #2 (Public Review):

      Summary:

      The authors present the Perceptual Error Adaptation (PEA) model, a computational approach offering a unified explanation for behavioral results that are inconsistent with standard state-space models. Beginning with the conventional state-space framework, the paper introduces two innovative concepts. Firstly, errors are calculated based on the perceived hand position, determined through Bayesian integration of visual, proprioceptive, and predictive cues. Secondly, the model accounts for the eccentricity of vision, proposing that the uncertainty of cursor position increases with distance from the fixation point. This elegantly simple model, with minimal free parameters, effectively explains the observed plateau in motor adaptation under the implicit motor adaptation paradigm using the error-clamp method. Furthermore, the authors experimentally manipulate visual cursor uncertainty, a method established in visuomotor studies, to provide causal evidence. Their results show that the adaptation rate correlates with perturbation sizes and visual noise, uniquely explained by the PEA model and not by previous models. Therefore, the study convincingly demonstrates that implicit motor adaptation is a process of Bayesian cue integration.

      Strengths:

      In the past decade, numerous perplexing results in visuomotor rotation tasks have questioned their underlying mechanisms. Prior models have individually addressed aspects like aiming strategies, motor adaptation plateaus, and sensory recalibration effects. However, a unified model encapsulating these phenomena with a simple computational principle was lacking. This paper addresses this gap with a robust Bayesian integration-based model. Its strength lies in two fundamental assumptions: motor adaptation is influenced by visual eccentricity, a well-established vision science concept, and sensory estimation occurs through Bayesian integration. By merging these well-founded principles, the authors elucidate previously incongruent and diverse results with an error-based update model. The incorporation of cursor feedback noise manipulation provides causal evidence for their model. The use of eye-tracking in their experimental design, and the analysis of adaptation studies based on estimated eccentricity, are particularly elegant. This paper makes a significant contribution to visuomotor learning research.

      Weaknesses:

      The paper provides a comprehensive account of visuomotor rotation paradigms, addressing incongruent behavioral results with a solid Bayesian integration model. However, its focus is narrowly confined to visuomotor rotation, leaving its applicability to broader motor learning paradigms, such as force field adaptation, saccadic adaptation, and de novo learning paradigms, uncertain. The paper's impact on the broader fields of neuroscience and cognitive science may be limited due to this specificity. While the paper excellently demonstrates that specific behavioral results in visuomotor rotation can be explained by Bayesian integration, a general computational principle, its contributions to other motor learning paradigms remain to be explored. The paper would benefit from a discussion on the model's generality and its limitations, particularly in relation to the undercompensating effects in other motor learning paradigms.

      Thank you for your thoughtful review and recognition of the contributions our work makes towards understanding implicit motor adaptation through the Perceptual Error Adaptation (PEA) model. We appreciate your suggestion to broaden the discussion about the model's applicability beyond the visuomotor rotation paradigm, a point we acknowledge was not sufficiently explored in our initial discussion.

      Our model is not limited to error-clamp adaptation, in which participants are explicitly told to ignore the rotated cursor. The error-clamp paradigm is a rare example in which implicit motor learning can be isolated in a nearly ideal way. Our findings thus imply two key aspects of implicit adaptation: 1) localizing one’s effector is implicitly processed and continuously used to update the motor plan; 2) Bayesian cue combination is at the core of integrating movement feedback and motor-related cues (the motor prediction cue in our model) when forming procedural knowledge for action control.

      We propose that the same two principles should apply to various kinds of motor adaptation and motor skill learning, which together constitute motor learning in general. Most of our knowledge about motor adaptation comes from visuomotor rotation, prism adaptation, force field adaptation, and saccadic adaptation. The first three all involve localizing one’s effector under the influence of perturbed sensory feedback, and they also involve implicit learning. We believe they can be modeled by variants of our model, or that their computational nature should at least be considered in light of the two principles laid out above. For skill learning, especially de novo learning, the field still lacks a fundamental computational model that accounts for the skill acquisition process at the level of relevant movement cues. Our model suggests a promising route: repetitive movements with Bayesian combination of movement-related cues might underlie the implicit process of motor skill learning.

      We added more discussion on the possible broad implications of our model in the revision.

      Reviewer #3 (Public Review):

      Summary

      In this paper, the authors model motor adaptation as a Bayesian process that combines visual uncertainty about the error feedback, uncertainty about proprioceptive sense of hand position, and uncertainty of predicted (=planned) hand movement with a learning and retention rate as used in state space models. The model is built with results from several experiments presented in the paper and is compared with the PReMo model (Tsay, Kim, et al., 2022) as well as a cue combination model (Wei & Körding, 2009). The model and experiments demonstrate the role of visual uncertainty about error feedback in implicit adaptation.

      In the introduction, the authors notice that implicit adaptation (as measured in error-clamp-based paradigms) does not saturate at larger perturbations, but decreases again (e.g. Moorehead et al., 2017 shows no adaptation at 135° and 175° perturbations). They hypothesized that visual uncertainty about cursor position increases with larger perturbations since the cursor is further from the fixated target. This could decrease the importance assigned to visual feedback which could explain lower asymptotes.

      The authors characterize visual uncertainty for 3 rotation sizes in the first experiment, and while this experiment could be improved, it is probably sufficient for the current purposes. Then the authors present a second experiment where adaptation to 7 clamped errors is tested in different groups of participants. The models' visual uncertainty is set using a linear fit to the results from experiment 1, and the remaining 4 parameters are then fit to this second data set. The 4 parameters are 1) proprioceptive uncertainty, 2) uncertainty about the predicted hand position, 3) a learning rate, and 4) a retention rate. The authors' Perceptual Error Adaptation model ("PEA") predicts asymptotic levels of implicit adaptation much better than both the PReMo model (Tsay, Kim et al., 2022), which predicts saturated asymptotes, or a causal inference model (Wei & Körding, 2007) which predicts no adaptation for larger rotations. In a third experiment, the authors test their model's predictions about proprioceptive recalibration, but unfortunately, compare their data with an unsuitable other data set. Finally, the authors conduct a fourth experiment where they put their model to the test. They measure implicit adaptation with increased visual uncertainty, by adding blur to the cursor, and the results are again better in line with their model (predicting overall lower adaptation) than with the PReMo model (predicting equal saturation but at larger perturbations) or a causal inference model (predicting equal peak adaptation, but shifted to larger rotations). In particular, the model fits experiment 2 and the results from experiment 4 show that the core idea of the model has merit: increased visual uncertainty about errors dampens implicit adaptation.

      Strengths

      In this study, the authors propose a Perceptual Error Adaptation model ("PEA") and the work combines various ideas from the field of cue combination, Bayesian methods, and new data sets, collected in four experiments using various techniques that test very different components of the model. The central component of visual uncertainty is assessed in the first experiment. The model uses 4 other parameters to explain implicit adaptation. These parameters are 1) learning and 2) retention rate, as used in popular state space models, and the uncertainty (variance) of 3) predicted and 4) proprioceptive hand position. In particular, the authors observe that asymptotes for implicit learning do not saturate, as claimed before, but decrease again when rotations are very large and that this may have to do with visual uncertainty (e.g. Tsay et al., 2021, J Neurophysiol 125, 12-22). The final experiment confirms predictions of the fitted model about what happens when visual uncertainty is increased (overall decrease of adaptation). By incorporating visual uncertainty depending on retinal eccentricity, the predictions of the PEA model for very large perturbations are notably different from and better than, the predictions of the two other models it is compared to. That is, the paper provides strong support for the idea that visual uncertainty of errors matters for implicit adaptation.

      Weaknesses

      Although the authors don't say this, the "concave" function that shows that adaptation does not saturate for larger rotations has been shown before, including in papers cited in this manuscript.

      The first experiment, measuring visual uncertainty for several rotation sizes in error-clamped paradigms has several shortcomings, but these might not be so large as to invalidate the model or the findings in the rest of the manuscript. There are two main issues we highlight here. First, the data is not presented in units that allow comparison with vision science literature. Second, the 1 second delay between the movement endpoint and the disappearance of the cursor, and the presentation of the reference marker, may have led to substantial degradation of the visual memory of the cursor endpoint. That is, the experiment could be overestimating the visual uncertainty during implicit adaptation.

      The paper's third experiment relies to a large degree on reproducing patterns found in one particular paper, where the reported hand positions - as a measure of proprioceptive sense of hand position - are given and plotted relative to an ever-present visual target, rather than relative to the actual hand position. That is, 1) since participants actively move to a visual target, the reported hand positions do not reflect proprioception, but mostly the remembered position of the target participants were trying to move to, and 2) if the reports are converted to a difference between the real and reported hand position (rather than the difference between the target and the report), those would be on the order of ~20° which is roughly two times larger than any previously reported proprioceptive recalibration, and an order of magnitude larger than what the authors themselves find (1-2°) and what their model predicts. Experiment 3 is perhaps not crucial to the paper, but it nicely provides support for the idea that proprioceptive recalibration can occur with error-clamped feedback.

      Perhaps the largest caveat to the study is that it assumes that people do not look at the only error feedback available to them (and can explicitly suppress learning from it). This was probably true in the experiments used in the manuscript, but unlikely to be the case in most of the cited literature. Ignoring errors and suppressing adaptation would also be a disastrous strategy to use in the real world, such that our brains may not be very good at this. So the question remains to what degree - if any - the ideas behind the model generalize to experiments without fixation control, and more importantly, to real-life situations.

      Specific comments:

      A small part of the manuscript relies on replicating or modeling the proprioceptive recalibration in a study we think does NOT measure proprioceptive recalibration (Tsay, Parvin & Ivry, JNP, 2020). In this study, participants reached for a visual target with a clamped cursor, and at the end of the reach were asked to indicate where they thought their hand was. The responses fell very close to the visual target both before and after the perturbation was introduced. This means that the difference between the actual hand position, and the reported/felt hand position gets very large as soon as the perturbation is introduced. That is, proprioceptive recalibration would necessarily have roughly the same magnitude as the adaptation displayed by participants. That would be several times larger than those found in studies where proprioceptive recalibration is measured without a visual anchor. The data is plotted in a way that makes it seem like the proprioceptive recalibration is very small, as they plot the responses relative to the visual target, and not the discrepancy between the actual and reported hand position. It seems to us that this study mostly measures short-term visual memory (of the target location). What is astounding about this study is that the responses change over time to begin with, even if only by a tiny amount. Perhaps this indicates some malleability of the visual system, but it is hard to say for sure.

      Regardless, the results of that study do not form a solid basis for the current work and they should be removed. We would recommend making use of the dataset from the same authors, who improved their methods for measuring proprioception shifts just a year later (Tsay, Kim, Parvin, Stover, and Ivry, JNP, 2021). Although here the proprioceptive shifts during error-clamp adaptation (Exp 2) were tiny, and not quite significant (p<0.08), the reports are relative to the actual location of the passively placed unseen hand, measured in trials separate from those with reach adaptation and therefore there is no visual target to anchor their estimates to.

      Experiment 1 measures visual uncertainty with increased rotation size. The authors cite relevant work on this topic (Levi & Klein etc) which has found a linear increase in uncertainty of the position of more and more eccentrically displayed stimuli.

      First, this is a question where the reported stimuli and effects could greatly benefit from comparisons with the literature in vision science, and the results might even inform it. In order for that to happen, the units for the reported stimuli and effects should (also) be degrees of visual angle (dva).

      As far as we know, all previous work has investigated static stimuli, where with moving stimuli, position information from several parts of the visual field are likely integrated over time in a final estimate of position at the end of the trajectory (a Kalman filter type process perhaps). As far as we know, there are no studies in vision science on the uncertainty of the endpoint of moving stimuli. So we think that the experiment is necessary for this study, but there are some areas where it could be improved.

      Then, the linear fit is done in the space of the rotation size, but not in the space of eccentricity relative to fixation, and these do not necessarily map onto each other linearly. If we assume that the eye-tracker and the screen were at the closest distance the manufacturer reports it to work accurately at (45 cm), we would get the largest distances the endpoints are away from fixation in dva. Based on that assumed distance between the participant and monitor, we converted the rotation angles to distances between fixation and the cursor endpoint in degrees visual angle: 0.88, 3.5, and 13.25 dva (ignoring screen curvature, or the absence of it). The ratio between the perturbation angle and retinal distance to the endpoint is roughly 0.221, 0.221, and 0.207 if the minimum distance is indeed used - which is probably fine in this case. But still, it would be better to do the fit in the relevant perceptual coordinate system.

      The first distance (4 deg rotation; 0.88 dva offset between fixation and stimulus) is so close to fixation (even at the assumed shortest distance between eye and screen) that it can be considered foveal and falls within the range of noise of eye-trackers + that of the eye for fixating. There should be no uncertainty on or that close to the fovea. The variability in the data is likely just measurement noise. This also means that a linear fit will almost always go through this point, somewhat skewing the results toward linearity. The advantage is that the estimate of the intercept (measurement noise) is going to be very good. Unfortunately, there are only 2 other points measured, which (if used without the closest point) will always support a linear fit. Therefore, the experiment does not seem suitable to test linearity, only to characterize it, which might be sufficient for the current purposes. We'd understand if the effort to do a test of linearity using many more rotations requires too much effort. But then it should be made much clearer that the experiment assumes linearity and only serves to characterize the assumed linearity.

      Final comment after the consultation session:

      There were a lot of discussions about the actual interpretation of the behavioral data from this paper with regards to past papers (Tsay et al. 2020 or 2021), and how it matches the different variables of the model. The data from Tsay 2020 combined both proprioceptive information (Xp) and prediction about hand position (Xu) because it involves active movements. On the other hand, Tsay et al. 2021 is based on passive movements and could provide a better measure of Xp alone. We would encourage you to clarify how each of the variables used in the model is mapped onto the outcomes of the cited behavioral experiments.

      The reviewers discussed this point extensively during the consultation process. The results reported in the Tsay 2020 study reflect both proprioception and prediction. However, having a visual target contributes more than just prediction, it is likely an anchor in the workspace that draws the response to it. Such that the report is dominated by short-term visual memory of the target (which is not part of the model). However, in the current Exp 3, as in most other work investigating proprioception, this is calculated relative to the actual direction.

      The solution is fairly simple. In Experiment 3 in the current study, Xp is measured relative to the hand without any visual anchors drawing responses, and this is also consistent with the reference used in the Tsay et al 2021 study and from many studies in the lab of D. Henriques (none of which also have any visual reach target when measuring proprioceptive estimates). So we suggest using a different data set that also measures Xp without any other influences, such as the data from Tsay et al 2021 instead.

      These issues with the data are not superficial and cannot be solved within the model. Data with correctly measured biases (relative to the hand) that are not dominated by irrelevant visual attractors would actually be informative about the validity of the PEA model. Dr. Tsay has so much other that we recommend using a more to-the-point data set that could actually validate the PEA model.

      As the comments are repetitive in places, we summarize them as three questions and address them one by one below:

      (1) Methodological concerns about visual uncertainty estimation in Experiment 1: a) the visual uncertainty is measured in movement angles (degrees), while the unit in vision science is degrees of visual angle (dva). This mismatch of units hinders direct comparison between the visual uncertainty found here and that reported in the literature; b) the 1-second delay between the movement endpoint and the presentation of the reference marker may cause an overestimate of visual uncertainty due to degradation of visual memory; c) the linear function of visual uncertainty is a result of having only three perturbation sizes.

      a) As noted by the reviewer, our visual uncertainty concerns cursor motion direction in the display plane, which has never been measured before. We do not think our data are comparable to findings in vision science on foveal versus peripheral vision. We cited the work of Klein and colleagues (Klein & Levi, 1987; Levi et al., 1987) because their studies showed that deviation from fixation is associated with an increase in visual uncertainty. Their work thus inspired our Exp1 to probe how the visual uncertainty of interest here (specifically for visual motion direction) changes with increasing deviation from fixation. We believe that any model and its parameters should be tailored to the task or context it tries to emulate. In our case, motion direction in a center-out reaching setting is the modeled context, and all the relevant model parameters should be specified in movement angles.

      b) The 1s delay of the reference cursor appears to have minimal impact on the estimate of visual uncertainty, based on previous vision studies. Our Exp1 used a visual paradigm similar to that of White et al., 1992, which shows that delay does not lead to an increase in visual uncertainty over a broad range of values (from 0.2s to >1s, see their Figures 5-6). We will add more methodological justification in our revision.

      c) We agree that if more angles were tested we could be more confident about the linearity of visual uncertainty. However, the linear function is a good approximation of visual uncertainty (as shown in Figure 2C). More importantly, our model performance does not hinge on a strictly linear function. If, say, it were a power function with an increasing slope, our model would still predict the major findings presented in the paper, as correctly pointed out by the reviewer. It is the increasing trend of visual uncertainty, completely overlooked by previous studies, that leads to various seemingly puzzling findings in implicit adaptation. Lastly, without assuming a linear function, we fitted the large dataset of motor adaptation from Exp2 to numerically estimate the visual uncertainty. This estimated visual uncertainty has a strong linear relationship with perturbation size (R = 0.991, p<0.001). In fact, the model-fitted visual uncertainty is very close to the values we obtained in Exp1. We have now included this analysis in the revision. See details in Supplementary text 2 and Figure S7.
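The argument in c) can be illustrated with a minimal sketch. It is not the released model code: it assumes standard inverse-variance weighting of three independent Gaussian cues (visual, proprioceptive, motor prediction), and every noise value and both `sigma_v` functions are hypothetical choices made only for illustration. Under these assumptions, the bias of the hand estimate towards the clamped cursor first grows and then shrinks with clamp size, whether visual uncertainty rises linearly or as a power function.

```python
# Illustrative sketch only: all noise parameters below are hypothetical.

def visual_weight(sigma_v, sigma_p, sigma_u):
    """Inverse-variance weight of the visual cue among three independent Gaussian cues."""
    precisions = [1.0 / sigma_v**2, 1.0 / sigma_p**2, 1.0 / sigma_u**2]
    return precisions[0] / sum(precisions)


def perceived_error(theta, sigma_v_fn, sigma_p=10.0, sigma_u=5.0):
    """Bias (deg) of the hand estimate towards a cursor clamped at angle theta,
    with the proprioceptive and motor-prediction cues both placed at 0 deg."""
    return visual_weight(sigma_v_fn(theta), sigma_p, sigma_u) * theta


def linear_sigma_v(theta):
    """Hypothetical linear growth of visual uncertainty with clamp size."""
    return 2.0 + 0.3 * theta


def power_sigma_v(theta):
    """Hypothetical power-law growth with an increasing slope."""
    return 2.0 + 0.1 * theta**1.3


if __name__ == "__main__":
    for name, fn in [("linear", linear_sigma_v), ("power", power_sigma_v)]:
        errors = {theta: round(perceived_error(theta, fn), 2) for theta in (4, 16, 64)}
        print(name, errors)
```

With these made-up values the 16° clamp yields a larger perceived error than either the 4° or the 64° clamp for both functions, which is the qualitative pattern the response refers to; the exact numbers carry no empirical meaning.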

      (2) Experiment 3: the reviewer argues that the Tsay et al., 2020 data do not accurately measure proprioceptive recalibration and are thus not suitable for showing our model’s capacity to explain proprioceptive changes during adaptation.

      Response: We agree that the data from Tsay et al., 2020 are not from passive localization, which is the widely accepted method for measuring proprioceptive recalibration, a recalibration effect in the sensory domain. Active localization, as used in Tsay et al., 2020, is hypothesized to be closely related to people’s forward prediction (where people want to go, as the reviewer put it in the comments). However, we want to emphasize that we never equated Tsay’s findings with proprioceptive recalibration: throughout the paper we call them “reported hand location”. We reserved “proprioceptive recalibration” for our own Exp3, which used a passive localization method. Thus, we have not misused this term. Secondly, as far as we know, localization bias or changes, whether measured by passive or active methods, have not been formally modeled quantitatively. We believe our model can explain both, at least in the error-clamp adaptation setting here. Exp3 concerns passive localization, where the proprioceptive bias is caused by the biasing effect of the just-perceived hand location (X_hand_hat) from the adaptation trial. The Tsay et al., 2020 data concern active localization, whose bias shows a characteristic change from negative to positive. This can be explained by the just-perceived hand location (X_hand_hat again) and a gradually adapting hand (X_p). We think this is a significant advance in the realm of proprioceptive changes in adaptation. Of course, our idea can be further tested in other task conditions, e.g., conventional visuomotor rotation or even gain adaptation, which should be left for future studies.

      On technical grounds, the Tsay et al., 2020 data set is not ideal: when reporting hand location, the participants view the reporting wheel as well as the original target. As correctly pointed out by the reviewer, the presence of the target might provide an anchoring cue for perceptual judgment, which acts as an attractor for localization. If that were the case, our cue combination would predict that this extra attractor effect leads to a smaller proprioceptive effect than that currently reported in their paper. The initial negative bias would be closer to the target (zero), and the later positive bias would be closer to the target too. However, the main trend would remain, i.e., the reported hand location would still show the characteristic negative-to-positive change. The attractor effect of the target can be readily modeled by giving less weight to the just-perceived hand location (X_hand_hat). Thus, we would like to keep the Tsay et al., 2020 data in our paper but add some explanation of the limitations of this dataset as well as how the model would fare with these limitations.

      That being said, our model can account for both passive and active localization during implicit adaptation elicited by error clamp. The dataset from the Tsay et al., 2021 paper is not a good substitute for their 2020 paper in terms of modeling, since that study interleaved blocks of passive localization trials with adaptation trials. This kind of block design would lead to forgetting of both adaptation (Xp in our model) and the perceived hand (X_hand_hat in our model); the latter is not yet considered in our model. As our Exp3, which also used passive localization, shows, the influence of the perceived hand on proprioceptive bias is short-lived, lasting up to three trials without adaptation trials. Of course, it would be of great interest to design future studies to examine how the proprioceptive bias changes over time, and how its temporal changes relate to the perceptual error. Our model provides a testbed to move forward in this direction.

      (3) The reviewer raises concerns about the study's assumption that participants ignore error feedback, questioning the model's applicability to broader contexts and real-world scenarios where ignoring errors might not be viable or common.

      Reviewer 2 raised the same question above. We moved our responses here. “We appreciate your suggestion to broaden the discussion about the model's applicability beyond the visuomotor rotation paradigm, a point we acknowledge was not sufficiently explored in our initial discussion.

      Our model is not limited to error-clamp adaptation, where the participants were explicitly told to ignore the rotated cursor. The error-clamp paradigm is one rare example in which implicit motor learning can be isolated in a nearly ideal way. Our findings thus imply two key aspects of implicit adaptation: 1) localizing one’s effector is implicitly processed and continuously used to update the motor plan; 2) Bayesian cue combination is at the core of integrating movement feedback and motor-related cues (motor prediction cue in our model) when forming procedural knowledge for action control.

      We propose that the same two principles should apply to various kinds of motor adaptation and motor skill learning, which together constitute motor learning in general. Most of our knowledge about motor adaptation is from visuomotor rotation, prism adaptation, force field adaptation, and saccadic adaptation. The first three types all involve localizing one’s effector under the influence of perturbed sensory feedback, and they also involve implicit learning. We believe they can be modeled by variants of our model, or at least that the two principles laid out above should be considered when thinking about their computational nature. For skill learning, especially de novo learning, the area still lacks a fundamental computational model that accounts for the skill acquisition process at the level of relevant movement cues. Our model suggests a promising route, i.e., repetitive movements with a Bayesian cue combination of movement-related cues might underlie the implicit process of motor skills.”

      We also add one more important implication of our model: as stated above, our model also explains that the proprioceptive changes, revealed by active or passive localization methods, are brought about by (mis)perceived hand localization via Bayesian cue combination. This new insight, though only tested here using the error-clamp paradigm, can be further utilized in other domains, e.g., conventional visuomotor rotation or force field adaptation. We hope this serves as an initial endeavor in developing computational models for proprioception studies. Please see the extended discussion on this matter in the revision.

      Recommendations for the authors:

      Revisions:

      All three reviewers were positive about the work and have provided a set of concrete and well-aligned suggestions, which the authors should address in a revised version of the article. These are listed below.

      A few points of particular note:

      (1) There are a lot of discussions about the actual interpretation of behavioral data from this paper or past papers (Tsay et al. 2020 or 2021) and how it matches the different variables of the model.

      (2) There are some discussions on the results of the first experiment, both in terms of how it is reported (providing degrees of visual angle) and how it differs from previous results (importance of the point of fixation). We suggest also discussing a few papers on eye movements during motor adaptation from recent years (work of Anouk de Brouwer and Opher Donchin). Could the authors also discuss why they found results opposite to those of previous visual uncertainty studies (i.e., visual uncertainty attenuates learning with large, but not small, visual errors), rather than the other way around as in Burge et al., Tsay et al. 2021 and Makino & Nozaki 2023 (where visual uncertainty attenuates small, but not large, visual errors)?

      (3) It is recommended by several reviewers to discuss the applicability of the model to other areas/perturbations.

      (4) Several reviewers and I believe that the impact of the paper would be much higher if the code to reproduce all the simulations of the model is made available to the readers. In addition, while I am very positive about the fact that the authors shared the data of their experiments, metadata seems to be missing while they are highly important because these data are otherwise useless.

      Thank you for the concise summary of the reviewers’ comments. We have addressed their concerns point by point.

      Reviewer #2 (Recommendations For The Authors):

      L142: The linear increase in visual uncertainty should be substantiated by previous research in vision science. Please cite relevant papers and discuss why the linear model is considered reasonable.

      We cited relevant studies in vision science. Their focus is more on how eccentricity inflates visual uncertainty, similar to our finding that deviations from the fixation direction inflate visual uncertainty about motion direction.

      We also want to add that our model performance does not hinge on a strictly linear function of visual uncertainty. If, say, it were a power function with an increasing slope, our model would still predict the major findings presented in the paper. It is the increasing trend of visual uncertainty, completely overlooked by previous studies, that leads to various seemingly puzzling findings in implicit adaptation. Furthermore, without assuming a linear function, we fitted the large dataset of motor adaptation from Exp2 to numerically estimate the visual uncertainty. This estimated visual uncertainty has a strong linear relationship with perturbation size (R = 0.991, p<0.001). In fact, the model-fitted visual uncertainty is very close to the values we obtained in Exp1. We have now included this new analysis in the revision. See details in Supplementary text 2 and Figure S7.

      L300: I found it challenging to understand the basis for this conclusion. Additional explanatory support is required.

      We unpacked this concluding sentence as follows:

      “The observed proprioceptive bias is formally modeled as a result of the biasing effect of the perceived hand estimate x_hand_hat. In our mini-block of passive localization, the participants neither actively moved nor received any cursor perturbations for three trials in a row. Thus, the fact that the measured proprioceptive bias is reduced to nearly zero at the third trial suggests that the effect of perceived hand estimate x_hand_hat decays rather rapidly.”

      L331: For the general reader, a visual representation of what the blurring mask looks like would be beneficial.

      Thanks for the nice suggestion. We added pictures of a clear and a blurred cursor in Figure 5D.

      L390: This speculation is intriguing. It would be helpful if the authors explained why they consider causal inference to operate at an explicit process level, as the reasoning is not clear here, although the idea seems plausible.

      Indeed, our tentative conclusion here is based only on the model comparison results. It is still possible that causal inference also works for implicit adaptation besides explicit adaptation. We make a more modest conclusion in the revision:

      “The causal inference model is also based on the Bayesian principle; why, then, does it fail to account for implicit adaptation? We postulate that the failure of the causal inference model is due to its neglect of visual uncertainty as a function of perturbation size, as we revealed in Experiment 1. In fact, previous studies advocating the Bayesian principle in motor adaptation have largely focused on experimentally manipulating sensory cue uncertainty to observe its effects on adaptation (Burge et al., 2008; He et al., 2016; Körding & Wolpert, 2004; Wei & Körding, 2010), similar to our Experiment 4. Our findings suggest that causal inference of perturbation alone, without incorporating visual uncertainty, cannot fully account for the diverse findings in implicit adaptation. The increase in visual uncertainty with perturbation size is substantial: our Experiment 1 yielded an approximate seven-fold increase from a 4° perturbation to a 64° perturbation. We have attributed this to the fact that people fixate in the desired movement direction during movements. Interestingly, even for the conventional visuomotor rotation paradigm where people are required to “control” the perturbed cursor, their fixation is also on the desired direction, not on the cursor itself (de Brouwer, Albaghdadi, et al., 2018; de Brouwer, Gallivan, et al., 2018). Thus, we postulate a similar hike in visual uncertainty in other “free-viewing” perturbation paradigms. Future studies are warranted to extend our PEA model to account for implicit adaptation in other perturbation paradigms.”

      L789: The method of estimating Sigma_hand in the brain was unclear. Since Bayesian computation relies on the magnitude of noise, the cognitive system must have estimates of this noise. While vision and proprioception noise might be directly inferred from signals, the noise of the hand could be deduced from the integration of these observations or an internal model estimate. This process of estimating noise magnitude is theorized in recursive Bayesian integration models (or Kalman filtering), where the size estimate of the state noise (sigma_hand) is updated concurrently with the state estimate (x_hand hat). The equation in L789 and the subsequent explanation appear to assume a static model of noise estimation. However, in practice, the noise parameters, including Sigma_hand, are likely dynamic and updated with each new observation. A more detailed explanation of how Sigma_hand is estimated and its role in the cognitive process would be helpful.

      This is a great comment. In fact, if a Kalman filter is used, the learning rate and the state noise should all be dynamically updated on each trial, under the influence of the observed x_v. Most adaptation models, including ours, assume a constant learning rate, but a dynamic learning rate (B in our model) is something worth trying. However, in our error-clamp setting, x_v is a constant, so this observation variable cannot dynamically update the Kalman filter; that’s why we opted to use a “static” Bayesian model to explain our datasets. Thus, Sigma_hand can be estimated by using Bayesian principles as a function of the three cues available, i.e., the proprioceptive cue, the visual cue, and the motor prediction cue. We added a detailed derivation of sigma_hand in the revision in Supplementary text 1.
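For orientation only, a static three-cue combination of this kind is commonly written as below. This is a sketch under the assumption of three independent Gaussian cues with standard inverse-variance weighting; the symbols x_u, sigma_p, sigma_u and W_i are notational assumptions made here, and the derivation in Supplementary text 1 remains the authoritative version.

```latex
\hat{x}_{\mathrm{hand}} = W_v\, x_v + W_p\, x_p + W_u\, x_u,
\qquad
W_i = \frac{1/\sigma_i^{2}}{1/\sigma_v^{2} + 1/\sigma_p^{2} + 1/\sigma_u^{2}}
\quad (i \in \{v, p, u\}),
\qquad
\frac{1}{\sigma_{\mathrm{hand}}^{2}} = \frac{1}{\sigma_v^{2}} + \frac{1}{\sigma_p^{2}} + \frac{1}{\sigma_u^{2}}
```

In this form sigma_hand depends only on the three cue variances, so it changes across perturbation sizes through sigma_v but is not updated trial by trial.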

      Reviewer #3 (Recommendations For The Authors):

      We observed values in Fig 2C for the 64-degree perturbation that seem to be outliers, i.e., greater than 50 degrees. It is unclear how a psychometric curve could have a "slope" or JND of over 60, especially considering that the tested range was only 60. Since the data plotted in panel C is a collapse of the signed data in panel B, it is perplexing how such large data points were derived, particularly when the signed uncertainty values do not appear to exceed 30.

      Related to the previous point, we would also recommend connecting individual data points: if the uncertainty increases (linearly or otherwise), then people with low uncertainty at the middle distance should also have low uncertainty at the high distance, and people with high uncertainty at one point, should also have that at other distances. Or perhaps the best way to go about this is to use the uncertainty at the two smaller perturbations to predict uncertainty at the largest perturbation for each participant individually?

      Thank you for your suggestion to examine the consistency of individual levels of visual uncertainty across perturbation sizes. First, a sigma_v of 60 degrees is entirely possible and falls naturally out of the experimental data. It shows that some individuals indeed have large visual uncertainty. Given these potential outliers (which should not be readily removed, as we have no reason to do so), we estimated the linear function of sigma_v with a robust method, i.e., a GLM with a gamma distribution, which accommodates right-skewed distributions and can thus capture positive outliers. Furthermore, we added in our revision a verification test of our estimates of sigma_v: we used Exp2’s adaptation data to estimate sigma_v without assuming its linear dependency. As shown, the model-fitted sigma_v closely matched the estimates from Exp1 (see Supplementary text 2 and Figure S7).

      We re-plotted sigma_v with individual data points connected, and the data clearly indicate that individuals exhibit consistent levels of visual uncertainty across different perturbation sizes, i.e., those with relatively lower uncertainty at middle distances (in fact, angles) tend to exhibit relatively lower uncertainty at higher distances too, and similarly, those with higher uncertainty at one distance maintain that level of uncertainty at other distances. This is confirmed by a Spearman correlation analysis assessing the consistency of uncertainties across different degrees of perturbation among individuals. Again, we observed significant correlations between perturbation angles, indicating good individual consistency (4 and 16 degrees, rho = 0.759, p<0.001; 16 and 64 degrees, rho = 0.527, p = 0.026).

      Author response image 4.
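The consistency check described above rests on rank correlation. A minimal sketch of how Spearman's rho could be computed is given below; it is not the analysis code used for the response, and the per-participant sigma_v values are invented purely for illustration. Rho is obtained as the Pearson correlation of average ranks, which also handles ties.

```python
# Illustrative sketch: Spearman's rho as the Pearson correlation of ranks.
# The sigma_v values below are hypothetical, not data from the study.

def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation between paired observations x and y."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical per-participant visual uncertainty (deg) at two clamp sizes.
    sigma_v_4 = [2.1, 3.4, 1.8, 4.0, 2.9, 3.1]
    sigma_v_16 = [5.0, 7.2, 4.1, 9.5, 6.8, 6.0]
    print("rho(4 deg, 16 deg) =", spearman_rho(sigma_v_4, sigma_v_16))
```

This sketch returns only the coefficient; a p-value such as those quoted above would additionally require a significance test, for example the one provided by `scipy.stats.spearmanr`.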

      The illustration in Fig 2A does not seem to show a stimulus that is actually used in the experiment (looks like about -30° perturbation). It would be good to show all possible endpoints with all other visual elements to scale - including the start-points of the PEST procedure.

      Thanks for the suggestion. We updated Fig 2A to show a stimulus of +16 degree, as well as added an additional panel to show all the possible endpoints.

      Finally (related to the previous point), in lines 589-591 it says the target is a blue cross. Then in lines 614-616, it says participants are to fixate the blue cross or the start position. The start position was supposed to have disappeared, so perhaps the blue plus moved to the start position (which could be the case, when looking at the bottom panel in Fig 2A, although in the illustration the plus did not move fully to the start position, just toward it to some degree). Perhaps the descriptions need to be clarified, or it should be explained why people had to make an eye movement before giving their judgments. And if people could have made either 1) no eye movement, but stayed at fixation, 2) moved to the blue plus as shown in the last panel in Fig 2A, or 3) fixated on the home position, we'd be curious to know if this affected participants' judgments.

      Thanks for pointing that out. The blue cross serves as the target in the movement task and then disappears with the cursor after 800ms of frozen time. The blue cross then reappears in the discrimination task at the center of the screen, i.e., the start location. Subjects were asked to fixate on the blue cross during the visual discrimination task. Note that this return of fixation to the home position is exactly what we see in typical error-clamp adaptation: once the movement is over, people guide their hand back to the home position. We performed a pilot study to record the typical fixation pattern during error-clamp adaptation, and Exp1 was intentionally designed to mimic its fixation sequence. We have now updated the description of Figure 2A, emphasizing the stimulus sequence.

      In Figure 4A, the label "bias" is confusing as that is used for recalibrated proprioceptive sense of hand position as well as other kinds of biases elsewhere in the paper. What seems to be meant is the integrated hand position (x-hat_hand?) where all three signals are apparently combined. The label should be changed and/or it should be clarified in the caption.

      Thanks for pointing that out, it should be x_hand_hat, and we have corrected this in the revised version of Figure 4.

      In the introduction, it is claimed that larger perturbations have not been tested with "implicit adaptation" paradigms, but in the same sentence, a paper is cited (Morehead et al., 2017) that tests, with error clamps, a rotation on the same order of magnitude as the largest one tested here (95°), as well as much larger rotations (135° and 175°). Interestingly, there is no adaptation in those conditions, which seems more in line with the sensory cue integration model. Can the PEA model explain these results as well? If so, this should be included in the paper, and if not, it should be discussed as a limitation.

      First, we double checked our manuscript and found that we never claimed that larger perturbations had not been tested.

      We agree that it is always good to have as many conditions as possible. However, the 135 and 175 degree conditions would lead to minimal adaptation, which would not help much in terms of model testing. We postulate that this lack of adaptation is simply due to the fact that people cannot see the moving cursor, or to some other unknown reasons. Our simple model is not designed to cover those kinds of extreme cases.

      Specify the size of the arc used for the proprioceptive tests in Exp 3 and describe the starting location of the indicator (controlled by the left hand). Ideally, the starting location should have varied across trials to avoid systematic bias.

      Thank you for the comments. The size of the arc used during these tests, as detailed in the methods section of our paper, features a ring with a 10 cm radius centered at the start position. This setup is visually represented as a red arc in Figure 7B.

      After completing each proprioceptive test trial, participants were instructed to position the indicator at approximately -180° on the arc and then relax their left arm. Although the starting location for the subsequent trial remained at -180°, it was not identical for every trial, thereby introducing slight variability.

      Please confirm that the proprioceptive biases plotted in Fig 4E are relative to the baseline.

      Thank you for bringing this to our attention. Yes, the proprioceptive biases illustrated in Figure 4E are indeed calculated relative to the baseline measurements. We have added this to the Methods section.

      Data availability: the data are available online, but there are some ways this can be improved. First, it would be better to use an open data format, instead of the closed, proprietary format currently used. Second, there is no explanation for what's in the data, other than the labels. (What are the units? What preprocessing was done?) Third, no code is made available, which would be useful for a computational model. Although rewriting the analyses in a non-proprietary language (to increase accessibility) is not a reasonable request at this point in the project, I'd encourage it for future projects. But perhaps Python, R, or Julia code that implements the model could be made available as a notebook of sorts so that other labs could look at (build on) the model starting with correct code - increasing the potential impact of this work.

      Great suggestions. We are also fully supportive of open data and open science. We now:

      (1) Updated our data and code repository to include the experimental data in an open data format (.csv) for broader accessibility.

      (2) The data are now accompanied by detailed descriptions to clarify their contents.

      (3) We have made the original MATLAB (.m) codes for data analysis, model fitting and simulation available online.

      (4) We also provide the code in Jupyter Notebook (.ipynb) format.

      These updates can be found in the revised “Data Availability” section of our manuscript.

      References

      Bromberg, Z., Donchin, O., & Haar, S. (2019). Eye Movements during Visuomotor Adaptation Represent Only Part of the Explicit Learning. eNeuro, 6(6). https://doi.org/10.1523/ENEURO.0308-19.2019

      Burge, J., Ernst, M. O., & Banks, M. S. (2008). The statistical determinants of adaptation rate in human reaching. Journal of Vision, 8(4), 1–19.

      de Brouwer, A. J., Gallivan, J. P., & Flanagan, J. R. (2018). Visuomotor feedback gains are modulated by gaze position. Journal of Neurophysiology, 120(5), 2522–2531.

      Egly, R., & Homa, D. (1984). Sensitization of the visual field. Journal of Experimental Psychology. Human Perception and Performance, 10(6), 778–793.

      Kim, H. E., Parvin, D. E., & Ivry, R. B. (2019). The influence of task outcome on implicit motor learning. eLife, 8. https://doi.org/10.7554/eLife.39882

      Klein, S. A., & Levi, D. M. (1987). Position sense of the peripheral retina. JOSA A, 4(8), 1543–1553.

      Levi, D. M., Klein, S. A., & Yap, Y. L. (1987). Positional uncertainty in peripheral and amblyopic vision. Vision Research, 27(4), 581–597.

      Makino, Y., Hayashi, T., & Nozaki, D. (2023). Divisively normalized neuronal processing of uncertain visual feedback for visuomotor learning. Communications Biology, 6(1), 1286.

      Owsley, C., Ball, K., & Keeton, D. M. (1995). Relationship between visual sensitivity and target localization in older adults. Vision Research, 35(4), 579–587.

      Simani, M. C., McGuire, L. M. M., & Sabes, P. N. (2007). Visual-shift adaptation is composed of separable sensory and task-dependent effects. Journal of Neurophysiology, 98(5), 2827–2841.

      Tsay, J. S., Avraham, G., Kim, H. E., Parvin, D. E., Wang, Z., & Ivry, R. B. (2021). The effect of visual uncertainty on implicit motor adaptation. Journal of Neurophysiology, 125(1), 12–22.

      Tsay, J. S., Chandy, A. M., Chua, R., Miall, R. C., Cole, J., Farnè, A., Ivry, R. B., & Sarlegna, F. R. (2024). Minimal impact of proprioceptive loss on implicit sensorimotor adaptation and perceived movement outcome. bioRxiv: The Preprint Server for Biology. https://doi.org/10.1101/2023.01.19.524726

      Tsay, J. S., Kim, H., Haith, A. M., & Ivry, R. B. (2022). Understanding implicit sensorimotor adaptation as a process of proprioceptive re-alignment. eLife, 11, e76639.

      Wei, K., Stevenson, I. H., & Körding, K. P. (2010). The uncertainty associated with visual flow fields and their influence on postural sway: Weber’s law suffices to explain the nonlinearity of vection. Journal of Vision, 10(14), 4.

      White, J. M., Levi, D. M., & Aitsebaomo, A. P. (1992). Spatial localization without visual references. Vision Research, 32(3), 513–526.

    1. eLife assessment

      This important study provides insights into how the brain constructs categorical neural representations during a difficult auditory target detection task. Through recordings of simultaneous single-unit activity in primary and secondary auditory areas, compelling evidence is provided that categorical neural representations emerge in a secondary auditory area, i.e., PEG. The study is of interest to neuroscientists and can also potentially shed light on human psychological studies.

    2. Reviewer #1 (Public Review):

      This is a very interesting paper which addresses how auditory cortex represents sound while an animal is performing an auditory task. The study involves psychometric and neurophysiological measurements from ferrets engaged in a challenging tone in noise discrimination task, and relates these measurements using neurometric analysis. A novel neural decoding technique (decoding-based dimensionality reduction or dDR, introduced in a previous paper by two of the authors) is used to reduce bias so that stimulus parameters can be read out from neuronal responses.

      The central finding of the study is that, when an animal is engaged in a task, non-primary auditory cortex represents task-relevant sound features in a categorical way. In primary cortex, task engagement also affects representations, but in a different way - the decoding is improved (suggesting that representations have been enhanced), but is not categorical in nature. The authors argue that these results are compatible with a model where early sensory representations form an overcomplete representation of the world, and downstream neurons flexibly read out behaviourally relevant information from these representations.

      I find the concept and execution of the study very interesting and elegant. The paper is also commendably clear and readable. The differences between primary and higher cortex are compelling and I am largely convinced by the authors' claim that they have found evidence that broadly supports a mixed selectivity model of neural disentanglement along the lines of Rigotti et al (2013). I think that the increasing body of evidence for these kinds of representations is a significant development in our understanding of higher sensory representations. I also think that the dDR method is likely to be useful to researchers in a variety of fields who are looking to perform similar types of neural decoding analysis.

    3. Reviewer #2 (Public Review):

      This study compares the activity of neural populations in the primary and non-primary auditory cortex of ferrets while the animals actively behaved or passively listened to a sound discrimination task. Using a variety of methods, the authors convincingly show differential effects of task engagement on population neural activity in primary vs non-primary auditory cortex; notably that in the primary auditory cortex, task-engagement (1) improves discriminability for both task-relevant and non-task relevant dimensions, and (2) improves the alignment between covariability and sound discrimination axes; whereas in the non-primary auditory cortex, task-engagement (1) improves discriminability for only task-relevant dimensions, and (2) does not affect the alignment between covariability and sound discrimination axes. They additionally show that task-engagement changes in gain can account for the selectivity noted in the discriminability of non-primary auditory neurons. They also admirably attempt to isolate task-engagement from arousal fluctuations, by using fluctuations in pupil size as a proxy for physiological arousal. This is a well-carried out study with thoughtful analyses which in large part achieves its aims to evaluate how task-engagement changes neural activity across multiple auditory regions. As with all work, there are several caveats or areas for future study/analysis. First, the sounds used here (tones, and narrow-band noise) are relatively simple sounds; previous work suggests that exactly what activity is observed within each region (e.g., sensory only, decision-related, etc) may depend in part upon what stimuli are used. Therefore, while the current study adds importantly to the literature, future work may consider the use of more varied stimuli. Second, the animals here were engaged in a behavioral task; but apart from an initial calculation of behavioral d', the task performance (and its effect on neural activity) is largely unaddressed.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      …I find the concept and execution of the study very interesting and elegant. The paper is also commendably clear and readable. The differences between primary and higher cortex are compelling and I am largely convinced by the authors' claim that they have found evidence that broadly supports a mixed selectivity model of neural disentanglement along the lines of Rigotti et al (2013). I think that the increasing body of evidence for these kinds of representations is a significant development in our understanding of higher sensory representations. I also think that the dDR method is likely to be useful to researchers in a variety of fields who are looking to perform similar types of neural decoding analysis.

      Thanks! We agree that questions around population coding and high-level representations are critical in the field of sensory systems.

      Reviewer #2 (Public Review):

      ... This is a well-carried out study with thoughtful analyses which in large part achieves its aims to evaluate how task-engagement changes neural activity across multiple auditory regions. As with all work, there are several caveats or areas for future study/analysis. First, the sounds used here (tones, and narrow-band noise) are relatively simple sounds; previous work suggests that exactly what activity is observed within each region (e.g., sensory only, decision-related, etc) may depend in part upon what stimuli are used. Therefore, while the current study adds importantly to the literature, future work may consider the use of more varied stimuli. Second, the animals here were engaged in a behavioral task; but apart from an initial calculation of behavioral d', the task performance (and its effect on neural activity) is largely unaddressed.

      The reviewer makes several important points that we hope we addressed in the specific changes detailed below. Indeed, it is important to recognize the possibility that the specific stimuli involved in a task may interact with the effects of behavioral state and that variability in task performance should be considered as an important aspect of behavioral state.

      Reviewer #1 (Recommendations For The Authors):

      I have a few minor comments and criticisms:

      (1) Figure 1c. The choice of low-contrast grey text (e.g. "Target vs. target") is unfortunate, especially when printed, and should be replaced (e.g. with dark grey).

      We have edited the figure to use a higher contrast (dark grey). Thanks for catching this.

      (2) Figure 2 and Supplementary Figure 3. I think some indication of error or significance is required in all panels. Without this, it's hard to interpret any of these panels.

      Thank you for this feedback. Including significance here was clarifying and helps to strengthen our claim that state-dependent changes in neural activity were smaller and more diverse for single neurons than at the population level. We modified Figure 2b-c to indicate whether each neuron’s response to the target stimulus was significantly different than its response to the catch stimulus. The same test was performed in Supplementary Figure 3. Additionally, we added a statistical test in Figure 2d-e to indicate, for each pair of target/catch stimuli, whether discrimination (d-prime) changed significantly between active and passive conditions. Furthermore, we modified the text of the second paragraph under the results heading: “Diverse effects of task engagement on single neurons in primary and non-primary auditory cortex” to reference and interpret the results of these significance tests. The new text reads as follows (L. 121):

      “Sound-evoked spiking activity was compared between active and passive states to study the impact of task engagement on sound representation. In both A1 and dPEG, responses to target and catch stimuli were significantly discriminable for a subset of single neurons (about 25% in both areas, Figure 2A-C, Supplemental Figures 3-5, bootstrap test). This supports the idea that stimulus identity can be decoded in both brain regions, regardless of task performance. However, the fact that the responses of most neurons in both brain areas could not significantly discriminate target vs. catch stimuli also highlights the diversity of sound encoding observed at the level of single neurons. The accuracy of catch vs. target discrimination for each neuron was quantified using neural d-prime, the z-scored difference in target minus catch spiking response for each neuron (Methods: Single neuron PSTHs and d-prime (Niwa et al., 2012a)). Task engagement was associated with significant changes in catch vs. target d-prime for roughly 10% of neurons in both A1 (40 / 481 neurons, bootstrap test) and dPEG (33 / 377 neurons, bootstrap test). This included neurons that both increased their discriminability and decreased their discriminability (Figure 2D-E). Thus, the effects of task engagement at the level of single neurons were relatively mild and inconsistent across the population; many neurons showed no significant change and of those that did, effects were bidirectional (Figure 2D-E).”

      We also included an additional methods paragraph in the “Statistical tests” section to describe the bootstrapping procedure used for these significance tests (L. 644):

      “The one exception to this general approach is in Figure 2, where we analyzed the sound discrimination abilities of single neurons. In this case, we computed p-values for each neuron and stimulus independently. First, for each neuron and catch vs. target stimulus pair, we measured d-prime (see Methods: Single neuron evoked activity and d-prime). We generated a null distribution of d-prime values for each neuron-stimulus pair, under each experimental condition by shuffling stimulus identity across trials before computing d-prime (100 resamples). A neuron was determined to have a significant d-prime for a given target vs. catch pair if its actual measured d-prime was greater than the 95th percentile of the null d-prime distribution. Second, for each neuron and catch vs. target stimulus pair, we tested if d-prime was significantly different between active and passive conditions. To test this, we followed a similar procedure as above, however, rather than shuffle stimulus identity, we shuffled active vs. passive trial labels. This allowed us to generate a null distribution of active vs. passive d-prime difference for each neuron and stimulus pair. A neuron was determined to have a significant change in d-prime between conditions if the actual Δ d-prime lay outside the 95% confidence interval of the null Δ d-prime distribution.”
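
      As an illustration only (not the authors' code), the first test described above, a stimulus-shuffle null for single-neuron d-prime, could be sketched as follows. The function names `d_prime` and `shuffle_test` are hypothetical, and the d-prime definition used here (difference in means divided by the pooled standard deviation) is an assumption, since the text describes it only as a z-scored difference between target and catch responses.

```python
import random
import statistics


def d_prime(target, catch):
    """Mean difference (target minus catch) over the pooled standard deviation.

    Assumed definition for illustration; the manuscript's exact formula may differ.
    """
    pooled_sd = ((statistics.variance(target) + statistics.variance(catch)) / 2) ** 0.5
    if pooled_sd == 0:
        return 0.0
    return (statistics.mean(target) - statistics.mean(catch)) / pooled_sd


def shuffle_test(target, catch, n_resamples=100, seed=0):
    """Return (observed d-prime, significant?) using a stimulus-label shuffle null."""
    rng = random.Random(seed)
    observed = d_prime(target, catch)
    pooled = list(target) + list(catch)
    null = []
    for _ in range(n_resamples):
        # Shuffling the pooled trials breaks the link between response and stimulus.
        rng.shuffle(pooled)
        null.append(d_prime(pooled[:len(target)], pooled[len(target):]))
    null.sort()
    # Nearest-rank 95th percentile of the null distribution.
    rank = (95 * n_resamples + 99) // 100
    threshold = null[rank - 1]
    return observed, observed > threshold
```

      The active vs. passive comparison would follow the same pattern, shuffling condition labels instead of stimulus identity and testing whether the observed change in d-prime falls outside the central 95% of the null distribution.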

      For Figure 2a, we chose not to indicate significance on the figure to avoid clutter, since the significance for all neurons in the population are shown in panels b-c anyway. Additionally, the difference plot shown in panel a is in units of z-scores, which we believe already gives a raw sense of the significance of the target vs. catch response change per neuron in this example dataset.

      (3) Figure 2 and Supplementary Figure 3. I would consider including some more examples as a Supplementary Figure (and perhaps combining Supp Fig 3 with Fig 2 as a main figure).

      We found no significant or apparent difference in single-neuron properties between A1 and dPEG. Therefore, we decided it is not helpful to plot both A1 and dPEG examples in the main text. However, we agree that the ability to see more examples of the raw data could be useful. Therefore, we compiled two supplementary figures (Supplementary Figures 4 and 5) that replicate Figure 2a for all datasets, encompassing A1 and dPEG.

      (4) Figure 2a and Supp Fig 3a. I was initially confused that the "delta-spk/sec (z-score)" values had themselves been z-scored, but now I think that they are simply the differences of the two left hand sub-panels. This could be made clear in the figure legend.

      The figure legends have been modified to state the procedure for computing “delta-spk/sec” more clearly. Specifically, we added the following information to the legend (L. 141):

      “Difference is computed as the z-scored response to the target minus the z-scored catch response (resulting in a difference shown in units of z-score).”

      (5) Figure 2b-e and Supp Fig 3b-e. Indicate the time window over which the responses were measured, and the number of neurons.

      Figure legends have been modified to include a sentence clearly stating the time window over which responses were measured. The number of neurons is also now included in the legend and on the figure itself. Furthermore, a brief description of the new statistical testing procedure has been added here (L. 144).

      “Responses were defined as the total number of spikes recorded during the 300 ms of sound presentation (area between dashed lines in panel A). Neurons with a significantly different response to the catch vs. target stimulus are indicated in black and quantified on the respective figure panel.”

      (6) Figure 2. "singe" should read "single"

      Typo in figure label has been fixed.

      (7) Line 144. Figure number is missing (Figure 3B-C).

      The missing figure number has been added to the text.

      (8) Figure 3. Again, the low-contrast grey should be replaced.

      The low-contrast grey has been replaced with dark grey.

      Reviewer #2 (Recommendations For The Authors):

      This study really nicely compares the activity and effects on activity in two areas of the auditory cortex in respect to task-engagement; I think it is, for the most part, very well done.

      A couple of specific recommendations:

      (1) Although I understand 'inf dB' as the SNR, including the actual dB level used in the experiments, would be useful, especially in the case of the inf dB.

      Thank you for this feedback. We agree that clarification about the overall sound level used here would be helpful. We have modified the methods section “Behavioral paradigm” to include the following sentence (L. 450):

      “That is, the masking noise (and distractor stimuli) were always presented with an overall sound level of 60 dB SPL. Infinite (inf) dB trials corresponded to trials where the target tone was presented at 60 dB SPL without any masking noise present, 0 dB to trials where the target was 60 dB SPL, -5 dB to trials where the target was presented at 55 dB SPL etc.”

      In addition, we have modified the main text (L. 82):

      “Animals reported the occurrence of a target tone in a sequence of narrowband noise distractors by licking a piezo spout (Figure 1A, Methods: Behavioral paradigm, distractor stimulus sound level: 60 dB SPL). … We describe SNR as the overall SPL of the target relative to distractor noise level. Thus, an SNR of –5 dB corresponds to a target level of 55 dB SPL while an Inf dB SNR corresponds to a target tone presented without any masking noise.”

      And Figure legend 1 now explicitly states the sound level used in the experiments (L. 104):

      “Variable SNR was achieved by varying overall SPL of the target relative to the fixed (60 dB SPL) distractor noise, e.g., -5 dB SNR corresponds to a 55 dB SPL target with 60 dB SPL masking noise. Infinite (inf) dB SNR corresponds to a target tone presented in isolation (60 dB SPL).”
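
      The SNR convention stated above reduces to simple arithmetic relative to the fixed 60 dB SPL masker. A minimal sketch, with the hypothetical helper name `target_level_db_spl`, makes the mapping explicit, including the special case of infinite SNR (target presented alone at 60 dB SPL):

```python
import math

MASKER_LEVEL_DB_SPL = 60.0  # fixed distractor / masking-noise level stated in the text


def target_level_db_spl(snr_db):
    """Target tone level (dB SPL) for a given SNR relative to the 60 dB SPL masker.

    An infinite SNR denotes a target presented without masking noise; the text
    states that this target was presented at 60 dB SPL.
    """
    if math.isinf(snr_db):
        return MASKER_LEVEL_DB_SPL
    return MASKER_LEVEL_DB_SPL + snr_db
```

      For example, an SNR of -5 dB maps to a 55 dB SPL target and an SNR of 0 dB maps to a 60 dB SPL target, matching the values given in the legend.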

      (2) I very much appreciate the attempt to disentangle task engagement from generalized arousal state, and specifically, addressing this through the use of pupillometry. However, by focusing the discussion of pupil dynamics solely on the arousal-state aspects of pupil size, the paper doesn't address the increasing evidence suggests that pupil size may fluctuate based upon a lot of other things, including perceptual events (see Kronemer et al, 2022 for a recent human paper; for auditory: Zekveld et al 2018 (review) and Montes-Lourido et al, 2021; but many many others, too). It would be nice to see either a bit more nuanced discussion of what pupil size may be indicating (easier), or analyzing the behavior in the context of pupil dynamics (a heavier lift).

      This is a good point. We agree that it is worth mentioning these more nuanced aspects of cognition that may be reflected by pupil size. Therefore, we also analyzed pupil size in the context of behavioral performance (see Supplemental Figure 6) and added the following text to the results (L. 193).

      “In addition to reflecting overall arousal level, pupil size has also been reported to reflect more nuanced cognitive variables such as, for example, listening effort (Zekveld et al., 2014). Furthermore, rodent data suggests that optimal sensory detection is associated with intermediate pupil size (McGinley et al., 2015), consistent with the hypothesis of an inverted-U relationship between arousal and behavioral performance (Zekveld et al., 2014). To determine if this pattern was true for the animals in our task, we measured the dynamics of pupil size in the context of behavioral performance. Across animals, task stimuli evoked robust pupil dilation that varied with trial outcome (Supplemental Figure 6b-c). Notably, pre-trial pupil size was significantly different between correct (hit and correct reject), hit, and miss trials (Supplemental Figure 6b-c), recapitulating the finding of an inverted-U relationship to performance in rodents (McGinley et al., 2015).  Since we focused only on correct trials in our decoding analysis, these outcome-dependent differences in pupil size are unlikely to contribute to the emergent decoding selectivity in dPEG.”

      (3) I think it would make this paper shine that much more if behavioral performance were not subsumed into the overall label of task engagement. You've already established you have performance that varies as a function of SNR; I would love to see the neural d' and covariability related to the behavioral d' (in the comparisons where this is possible). I would also love to see a more direct measure of choice for those stimuli that show variable behavior (e.g., a choice probability analysis or something of the like would seem to be easily applied to the target SNRs of -5 and 0 dB); and compare task engaged activity of hits vs misses vs passive listening to those same stimuli. You discuss previous studies looking at choice-related/decision-related activity and draw parallels to this work. Given that there is the opportunity with this data set to *directly* assess choice-related activity, the absence of such an analysis seems like a missed opportunity.

      Thank you for this feedback. We agree that “task engagement” is not a unimodal state and that a more fine-grained analysis of task-engaged neural activity, according to behavioral choice, could be informative.

      First, we would like to point out that in Figure 4 we did already compare behavioral d’ to delta neural d’. We found that the two were significantly correlated in dPEG, but not in A1. This suggests that task-dependent changes in stimulus decoding in dPEG, but not A1, are predictive of behavioral performance. This is consistent with the finding that task-relevant stimulus representations were selectively enhanced in dPEG, but not in A1.

      Second, we added a choice decoding analysis to address whether auditory cortex represents the animal’s choice in our task. The results of this analysis are summarized in Supplemental Figure 8 and are discussed under the results section: “Behavioral performance is correlated with neural coding changes in non-primary auditory cortex only.” (L. 226):

      “The previous analysis suggests that the task-dependent increase in stimulus information present in dPEG population activity is predictive of overall task performance. Next, we asked whether the population activity in either brain region was directly predictive of behavioral choice on single hit vs. miss trials. To do this, we conducted a choice probability analysis (Methods). We found that in both brain regions choice could be decoded well above chance level (Supplemental Figure 8). Choice information was present throughout the entire trial and did not increase during the target stimulus presentation. This suggests that the difference in population activity primarily reflects a cognitive state associated with the probability of licking on a given trial, or “impulsivity” rather than “choice.” This interpretation is consistent with our finding that baseline pupil size on each trial is predictive of trial outcome (Supplemental Figure 6b).”

      To keep our decoding approach consistent throughout the manuscript, we followed the same approach for choice decoding as we did for stimulus decoding (perform dDR then calculate neural d-prime in the dimensionality reduced space). To make the results more interpretable, we converted choice d-prime to a choice probability (percent correctly decoded choices) using leave-one-out cross validation. (We note that d-prime and percent correct are very highly correlated statistics.) This is described in the methods as follows (L. 550):

      “We performed a choice decoding analysis on hit vs. miss trials. We followed the same procedure as described above for stimulus decoding, where instead of a pair of stimuli our two classes to be decoded were “hit trial” vs. “miss trial”. That is, for each target stimulus we computed the optimal linear discrimination axis separating hit vs. miss trials (Abbott and Dayan, 1999) in the reduced dimensionality space identified with dDR (Heller and David, 2022). For the sake of interpretability with respect to previous work we reported choice probability as the percentage of correctly decoded trial outcomes rather than d-prime. Percent correct was calculated by projecting the population activity onto the optimal discrimination axis and using leave-one-out cross validation to measure the number of correct classifications.”
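
      As a rough sketch of the procedure described above (not the authors' implementation), hit vs. miss decoding with leave-one-out cross-validation could look like the following. It assumes each trial has already been projected into a two-dimensional space (the dDR step itself is omitted), takes the discrimination axis to be the inverse pooled covariance applied to the difference in class means, and assumes the pooled covariance is non-singular. All function names are hypothetical.

```python
def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def pooled_cov2(hits, misses):
    """Pooled within-class 2x2 covariance, returned as (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    for pts in (hits, misses):
        mx, my = mean2(pts)
        for x, y in pts:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(hits) + len(misses) - 2
    return sxx / dof, sxy / dof, syy / dof


def discrimination_axis(hits, misses):
    """w = inv(Sigma) @ (mu_hit - mu_miss) for 2-D data (explicit 2x2 inverse)."""
    sxx, sxy, syy = pooled_cov2(hits, misses)
    det = sxx * syy - sxy ** 2
    (hx, hy), (mx, my) = mean2(hits), mean2(misses)
    dx, dy = hx - mx, hy - my
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def loo_percent_correct(hits, misses):
    """Percent of trials whose outcome is decoded correctly when held out."""
    trials = [(p, 1) for p in hits] + [(p, 0) for p in misses]
    correct = 0
    for i, (point, label) in enumerate(trials):
        # Fit the axis on all trials except the held-out one.
        train = trials[:i] + trials[i + 1:]
        h = [p for p, lab in train if lab == 1]
        m = [p for p, lab in train if lab == 0]
        w = discrimination_axis(h, m)
        (hx, hy), (mx, my) = mean2(h), mean2(m)
        # Decision criterion: midpoint between the projected class means.
        criterion = 0.5 * (w[0] * (hx + mx) + w[1] * (hy + my))
        projection = w[0] * point[0] + w[1] * point[1]
        correct += int((projection > criterion) == (label == 1))
    return 100.0 * correct / len(trials)
```

      Holding out each trial before fitting the axis keeps the reported percent correct from being inflated by fitting and testing on the same data.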

      (4) It would also be interesting to look at population coding across sessions (although the point is taken that within a session allows the opportunity to assess covariability). Minorly self-servingly but very much related to the above point, Christison-Lagay et al, 2017 employed a similar detect-in-noise task, analyzed single neurons and population level activity, and looked at putative choice-related activity. The current study has the opportunity to expand on that kind of analysis that much more by looking across multiple sites vs within a given recording site; and compare across regions.

      Thank you for highlighting this point; we agree that it is important. When studying population coding it is critical to consider the impact of covariability between neurons. Therefore, it is worthwhile to revisit our interpretations of prior results, e.g., Christison-Lagay et al, 2017, which studied population coding by combining neurons across different sessions, given that we now have access to simultaneously recorded population data.

      First, we would like to point out that this was the primary motivation for our simulation analyses presented in Figure 5. Using simulations, we found that task-dependent gain modulation (which can be observed across sessions) was sufficient to explain our primary finding – selective enhancement in decoding of behaviorally relevant sound stimuli in dPEG.

      Second, to address the question about how covariability affects choice-related information in auditory cortex and compare our findings with prior studies, we performed the same set of simulations for choice probability analysis. We found that, again, choice-dependent gain modulation was sufficient to explain our findings. That is, simulations with hit- vs. miss-dependent gain changes, but fixed covariability, closely mirrored the choice probability we observed in the raw data. An additional simulation where covariability between all neurons was set to zero also recapitulated our findings in the raw data. Collectively, this suggests that covariability does not play a significant role in shaping the choice information present in A1 and dPEG during this task. We have added the following text to the manuscript to summarize this finding (L. 293):

      “Finally, we used the same simulation approach to determine what aspects of population activity carry the “choice” related information we observed in A1 and dPEG (Figure 4 – figure supplement 1). Similar to our findings for stimulus decoding, we found that gain modulation alone was sufficient to recapitulate the choice information present in the raw data for this task. This helps frame prior work that pooled neurons across sessions to study population coding of choice in similar auditory discrimination tasks (Christison-Lagay et al, 2017).”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study introduces and validates the Cyclic Homogeneous Oscillation (CHO) detection method to precisely determine the duration, location, and fundamental frequency of non-sinusoidal neural oscillations. Traditional spectral analysis methods face challenges in distinguishing the fundamental frequency of non-sinusoidal oscillations from their harmonics, leading to potential inaccuracies. The authors implement an underexplored approach, using the auto-correlation structure to identify the characteristic frequency of an oscillation. By combining this strategy with existing time-frequency tools to identify when oscillations occur, the authors strive to solve outstanding challenges involving spurious harmonic peaks detected in time-frequency representations. Empirical tests using electrocorticographic (ECoG) and electroencephalographic (EEG) signals further support the efficacy of CHO in detecting neural oscillations.

      Response:  We thank the reviewer for recognizing the strengths of our method in this encouraging review and for the opportunity to further improve and finalize our manuscript.

      Strengths:

      (1) The paper puts an important emphasis on the 'identity' question of oscillatory identification. The field primarily identifies oscillations through frequency, space (brain region), and time (length, and relative to task or rest). However, more tools that claim to further characterize oscillations by their defining/identifying traits are needed, in addition to data-driven studies about what the identifiable traits of neural oscillations are beyond frequency, location, and time. Such tools are useful for potentially distinguishing between circuit mechanistic generators underlying signals that may not otherwise be distinguished. This paper states this problem well and puts forth a new type of objective for neural signal processing methods.

      Response:  We sincerely appreciate this encouraging summary of the objective of our manuscript.

      (2) The paper uses synthetic data and multimodal recordings at multiple scales to validate the tool, suggesting CHO's robustness and applicability in various real-data scenarios. The figures illustratively demonstrate how CHO works on such synthetic and real examples, depicting in both time and frequency domains. The synthetic data are well-designed, and capable of producing transient oscillatory bursts with non-sinusoidal characteristics within 1/f noise. Using both non-invasive and invasive signals exposes CHO to conditions which may differ in extent and quality of the harmonic signal structure. An interesting followup question is whether the utility demonstrated here holds for MEG signals, as well as source-reconstructed signals from non-invasive recordings.

      Response:  We thank the reviewer for this excellent suggestion.  Indeed, our next paper will focus on applying our CHO method to signals that were source-reconstructed from non-invasive recordings (e.g., MEG and EEG) to extract their periodic activity.

      (3) This study is accompanied by open-source code and data for use by the community.

      Response:  We thank the reviewer for recognizing our effort to widely disseminate our method to the broader community.

      Weaknesses:

      (1) Due to the proliferation of neural signal processing techniques that have been designed to tackle issues such as harmonic activity, transient and event-like oscillations, and non-sinusoidal waveforms, it is naturally difficult for every introduction of a new tool to include exhaustive comparisons of all others. Here, some additional comparisons may be considered for the sake of context, a selection of which follows, biased by the previous exposure of this reviewer. One emerging approach that may be considered is known as state-space models with oscillatory and autoregressive components (Matsuda 2017, Beck 2022). State-space models such as autoregressive models have long been used to estimate the auto-correlation structure of a signal. State-space oscillators have recently been applied to transient oscillations such as sleep spindles (He 2023). Therefore, state-space oscillators extended with auto-regressive components may be able to perform the functions of the present tool through different means by circumventing the need to identify them in time-frequency. Another tool that should be mentioned is called PAPTO (Brady 2022). Although PAPTO does not address harmonics, it detects oscillatory events in the presence of 1/f background activity. Lastly, empirical mode decomposition (EMD) approaches have been studied in the context of neural harmonics and nonsinusoidal activity (Quinn 2021, Fabus 2022). EMD has an intrinsic relationship with extrema finding, in contrast with the present technique. In summary, the existence of methods such as PAPTO shows that researchers are converging on similar approaches to tackle similar problems. The existence of time-domain approaches such as state-space oscillators and EMD indicates that the field of timeseries analysis may yield even more approaches that are conceptually distinct and may theoretically circumvent the methodology of this tool.

      Response:  We thank the reviewer for this valuable insight.  In our manuscript, we acknowledge emerging approaches that employ state-space models or EMD for time-frequency analysis.  However, it's crucial to clarify that the primary focus in our study is on the detection and identification of the fundamental frequency, as well as the onset/offset of non-sinusoidal neural oscillations.  Thus, our emphasis lies specifically on these aspects.  We hope that future studies will use our methods as the basis to develop better methods for time-frequency analysis that will lead to a deeper understanding of harmonic structures.  

      Our Limitation section addresses this issue. Specifically, we recognize that a more sophisticated time-frequency analysis could contribute to improved sensitivity and that the core claim of our study is centered around the concept of increasing specificity in the detection of non-sinusoidal oscillations. We hope that future studies will use this as a basis for improving time-frequency analysis in general. Notably, our open-source code will greatly enable these future studies in this endeavor. Specifically, in the first step of our algorithm, the time-frequency estimation can be replaced with any other preferred time-frequency analysis, such as state-space models, EMD, Wavelet transform, Gabor transform, and Matching Pursuit.

      For our own follow-up study, we plan to conduct a thorough review and comparison of emerging approaches employing state-space models or EMD for time-frequency analysis.  In this study, we aim to identify which approach, including the six methods mentioned by the reviewer (Matsuda 2017, Beck 2022, He 2023, Brady 2022, Quinn 2021, and Fabus 2022), can maximize the estimation of the fundamental frequency of non-sinusoidal neural oscillations using CHO.  The insights provided by the reviewer are appreciated, and we will carefully consider these aspects in our follow-up study.  

      In the revision of this manuscript, we are setting the stage for these future studies.  Specifically, we added a discussion paragraph within the Limitation section about the state-space model, and EMD approaches:

      “However, because our CHO method is modular, the FFT-based time-frequency analysis can be replaced with more sophisticated time-frequency estimation methods to improve the sensitivity of neural oscillation detection. Specifically, a state-space model (Matsuda 2017, Beck 2022, He 2023, Brady 2022) or empirical mode decomposition (EMD, Quinn 2021, Fabus 2022) may improve the estimation of the auto-correlation of the harmonic structure underlying non-sinusoidal oscillations. Furthermore, a Gabor transform or matching pursuit-based approach may improve the onset/offset detection of short burst-like neural oscillations (Kus 2013 and Morales 2022).”

      (2) The criteria that the authors use for neural oscillations embody some operating assumptions underlying their characteristics, perhaps informed by immediate use cases intended by the authors (e.g., hippocampal bursts). The extent to which these assumptions hold in all circumstances should be investigated. For instance, the notion of consistent auto-correlation breaks down in scenarios where instantaneous frequency fluctuates significantly at the scale of a few cycles. Imagine an alpha-beta complex without harmonics (Jones 2009). If oscillations change phase position within a timeframe of a few cycles, it would be difficult for a single peak in the auto-correlation structure to elucidate the complex time-varying peak frequency in a dynamic fashion. Likewise, it is unclear whether bounding boxes with a pre-specified overlap can capture complexes that maneuver across peak frequencies.

      Response: We thank the reviewer for this valuable insight into the methodological limitations in the detection of neural oscillations that exhibit significant fluctuations in their instantaneous frequency. Indeed, our CHO method is also limited in the ability to detect oscillations with fluctuating instantaneous frequencies. This is because CHO uses an auto-correlation-based approach to detect neural oscillations that exhibit two or more cycles. If oscillations change phase position within a timeframe of a few cycles, CHO cannot detect the oscillation because the periodicity is not expressed within the auto-correlation. This limitation can be partially overcome by relaxing the detection threshold (see Line 30 of Algorithm 1 in the revised manuscript) for the auto-correlation analysis. However, relaxing the detection threshold, in consequence, increases the probability of detecting other aperiodic activity as well. To clarify how CHO determines the periodicity of oscillations, and to educate the reader about the tradeoff between detecting oscillations with fluctuating instantaneous frequencies and avoiding detecting other aperiodic activity, we have added pseudo code and a new subsection in the Methods.

      Author response table 1.

      Algorithm 1

      A new subsection titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO”.

      “The ability of CHO to detect neural oscillations and determine their fundamental frequency is governed by four principal hyper-parameters.  Adjusting these parameters requires understanding their effect on the sensitivity and specificity in the detection of neural oscillations. 

      The first hyper-parameter is the number of time windows (N in Line 5 in Algorithm 1), that is used to estimate the 1/f noise.  In our performance assessment of CHO, we used four windows, resulting in estimation periods of 250 ms in duration for each 1/f spectrum.  A higher number of time windows results in smaller estimation periods and thus minimizes the likelihood of observing multiple neural oscillations within this time window, which otherwise could confound the 1/f estimation.  However, a higher number of time windows and, thus, smaller time estimation periods may lead to unstable 1/f estimates. 

      The second hyper-parameter defines the minimum number of cycles of a neural oscillation to be detected by CHO (see Line 23 in Algorithm 1).  In our study, we specified this parameter to be two cycles.  Increasing the number of cycles increases specificity, as it will reject spurious oscillations.  However, increasing the number also reduces sensitivity as it will reject short oscillations.

      The third hyper-parameter is the significance threshold that selects positive peaks within the auto-correlation of the signal.  The magnitude of the peaks in the auto-correlation indicates the periodicity of the oscillations (see Line 26 in Algorithm 1).  Referred to as "NumSTD," this parameter denotes the number of standard errors that a positive peak has to exceed to be selected to be a true oscillation.  For this study, we set the "NumSTD" value to 1.  Increasing the "NumSTD" value increases specificity in the detection as it reduces the detection of spurious peaks in the auto-correlation.  However, increasing the "NumSTD" value also decreases the sensitivity in the detection of neural oscillations with varying instantaneous oscillatory frequencies. 

      The fourth hyper-parameter is the percentage of overlap between two bounding boxes that trigger their merger (see Line 31 in Algorithm 1).  In our study, we set this parameter to 75% overlap.  Increasing this threshold yields more fragmentation in the detection of oscillations, while decreasing this threshold may reduce the accuracy in determining the onset and offset of neural oscillations.”
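
      The bounding-box merger governed by the fourth hyper-parameter can be sketched as follows.  This is a minimal example under stated assumptions, not the code from Algorithm 1: the `Box` class, the function names, and the definition of overlap (intersection area divided by the area of the smaller box) are hypothetical choices, and the 1000 ms segment length is only inferred from "four windows" of 250 ms each.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical time-frequency bounding box (seconds and Hz)."""
    t0: float
    t1: float
    f0: float
    f1: float

    @property
    def area(self):
        return (self.t1 - self.t0) * (self.f1 - self.f0)


def overlap_fraction(a, b):
    """Intersection area relative to the smaller box (assumed definition)."""
    dt = min(a.t1, b.t1) - max(a.t0, b.t0)
    df = min(a.f1, b.f1) - max(a.f0, b.f0)
    if dt <= 0 or df <= 0:
        return 0.0
    return (dt * df) / min(a.area, b.area)


def merge_boxes(boxes, threshold=0.75):
    """Greedily merge boxes whose overlap fraction reaches the threshold."""
    merged = []
    for box in sorted(boxes, key=lambda b: b.t0):
        for i, kept in enumerate(merged):
            if overlap_fraction(kept, box) >= threshold:
                merged[i] = Box(min(kept.t0, box.t0), max(kept.t1, box.t1),
                                min(kept.f0, box.f0), max(kept.f1, box.f1))
                break
        else:
            merged.append(box)
    return merged


# Arithmetic behind the first two hyper-parameters.
window_ms = 1000 / 4          # four windows over an assumed 1000 ms segment
min_duration_s = 2 / 10.0     # two cycles of a 10 Hz oscillation

boxes = [Box(0.0, 1.0, 8, 12), Box(0.1, 1.0, 8, 12), Box(0.8, 2.0, 8, 12)]
strict = merge_boxes(boxes, threshold=0.75)   # third box stays separate
lenient = merge_boxes(boxes, threshold=0.10)  # all three boxes are merged
print(window_ms, min_duration_s, len(strict), len(lenient))
```

      With the 75% threshold the third box remains a separate detection, whereas the lower threshold fuses all three, which mirrors the tradeoff between fragmentation and onset/offset accuracy described in the subsection.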

      (3) Related to the last item, this method appears to lack implementation of statistical inferential techniques for estimating and interpreting auto-correlation and spectral structure. In standard practice, auto-correlation functions and spectral measures can be subjected to statistical inference to establish confidence intervals, often helping to determine the significance of the estimates. Doing so would be useful for expressing the likelihood that an oscillation and its harmonic has the same autocorrelation structure and fundamental frequency, or more robustly identifying harmonic peaks in the presence of spectral noise. Here, the authors appear to use auto-correlation and time-frequency decomposition more as a deterministic tool rather than an inferential one. Overall, an inferential approach would help differentiate between true effects and those that might spuriously occur due to the nature of the data. Ultimately, a more statistically principled approach might estimate harmonic structure in the presence of noise in a unified manner transmitted throughout the methodological steps.

      Response:  We thank the reviewer for sharing this insight on further enhancing our method.  Indeed, CHO does not make use of inferential statistics to estimate and interpret the auto-correlation and underlying spectral structure of the neural oscillation.  Implementing this approach within CHO would require calculating phase-phase coupling across all cross-frequency bands and bounding boxes.  However, as mentioned in the introduction section and Figure 1G-L, phase-phase coupling analysis cannot fully ascertain whether the oscillations are phase-locked and thus are harmonics or, indeed, independent oscillations.  This ambiguity, combined with the exorbitant computational complexity of the entailed permutation test and the requirement to perform the analysis across all cross-frequency bands, channels, and trials, makes phase-phase coupling impracticable in determining the fundamental frequency of neural oscillations in real-time and, thus, the use in closed-loop neuromodulation applications.  Thus, within our study, we prioritized determining the fundamental frequency without considering the structure of harmonics.  

      An inferential approach can be implemented by adjusting the significance threshold that selects positive peaks within the auto-correlation of the signal.  Currently, this threshold is set to represent the approximate confidence bounds of the periodicity of the fundamental frequency.  To clarify this issue, we added additional pseudo code and a new subsection, titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO,” in the Methods section.
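
      As an illustration of how such a threshold can act as an approximate confidence bound, the sketch below selects positive auto-correlation peaks that exceed a given number of standard errors.  It is not the CHO code: the function names are hypothetical, and the standard error is assumed to be 1/sqrt(N), the usual large-sample value for a white-noise auto-correlation estimate, which may differ from the estimate used in Algorithm 1.

```python
import math


def autocorrelation(x, max_lag):
    """Biased sample auto-correlation r[0..max_lag] of a sequence."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(v * v for v in centered)
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def select_peaks(r, n_samples, num_std):
    """Lags of positive local maxima that exceed num_std standard errors.

    Assumption: the standard error is approximated by 1/sqrt(N).
    """
    threshold = num_std / math.sqrt(n_samples)
    return [
        k for k in range(1, len(r) - 1)
        if r[k] > threshold and r[k] > r[k - 1] and r[k] > r[k + 1]
    ]


fs = 200     # sampling rate in Hz (illustrative)
f0 = 10.0    # oscillation frequency in Hz (illustrative)
signal = [math.sin(2 * math.pi * f0 * i / fs) for i in range(fs)]  # 1 s

r = autocorrelation(signal, max_lag=100)
lenient = select_peaks(r, len(signal), num_std=1.0)   # threshold of about 0.07
strict = select_peaks(r, len(signal), num_std=12.0)   # threshold of about 0.85
fundamental_hz = fs / lenient[0]
print(lenient, strict, fundamental_hz)
```

      The first selected lag gives the period of the fundamental frequency.  Raising the threshold discards the weaker peaks at longer lags, which corresponds to the gain in specificity and loss in sensitivity described for the "NumSTD" parameter.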

      In future studies, we will investigate the harmonic structure of neural oscillations based on a large data set.  This exploration will help us understand how non-sinusoidal properties may influence the harmonic structure.  Your input is highly appreciated, and we will diligently incorporate these considerations into our research.

      See Author response table 1.

      (4) As with any signal processing method, hyperparameters and their ability to be tuned by the user need to be clearly acknowledged, as they impact the robustness and reproducibility of the method. Here, some of the hyperparameters appear to be: a) number of cycles around which to construct bounding boxes and b) overlap percentage of bounding boxes for grouping. Any others should be highlighted by the authors and clearly explained during the course of tool dissemination to the community, ideally in tutorial format through the Github repository.

      Response:  We thank the reviewer for this helpful suggestion.  In response, we added a new subsection, titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO,” that describes the four principal hyper-parameters of CHO: the number of time windows used to estimate the 1/f noise, the minimum number of cycles of a detected oscillation, the significance threshold ("NumSTD") that selects positive peaks within the auto-correlation, and the percentage of overlap between two bounding boxes that triggers their merger.  The full text of this subsection is quoted in our response above.

      (5) Most of the validation demonstrations in this paper depict the detection capabilities of CHO. For example, the authors demonstrate how to use this tool to reduce false detection of oscillations made up of harmonic activity and show in simulated examples how CHO performs compared to other methods in detection specificity, sensitivity, and accuracy. However, the detection problem is not the same as the 'identity' problem that the paper originally introduced CHO to solve. That is, detecting a non-sinusoidal oscillation well does not help define or characterize its non-sinusoidal 'fingerprint'. An example problem to set up this question is: if there are multiple oscillations at the same base frequency in a dataset, how can their differing harmonic structure be used to distinguish them from each other? To address this at a minimum, Figure 4 (or a followup to it) should simulate signals at similar levels of detectability with different 'identities' (i.e. different levels and/or manifestations of harmonic structure), and evaluate CHO's potential ability to distinguish or cluster them from each other. Then, does a real-world dataset or neuroscientific problem exist in which a similar sort of exercise can be conducted and validated in some way? If the "what" question is to be sufficiently addressed by this tool, then this type of task should be within the scope of its capabilities, and validation within this scenario should be demonstrated in the paper. This is the most fundamental limitation at the paper's current state.

      Response: Thank you for your insightful suggestion; we truly appreciate it. We recognize that the 'identity' problem requires further studies to develop appropriate methods. Our current approach does not fully address this issue, as it may detect asymmetric non-sinusoidal oscillations with multiple harmonic peaks, without accounting for different shapes of non-sinusoidal oscillations.

      The main reason we could not fully address the “identity” problem results from the general absence of a defined ground truth, i.e., data for which we know the harmonic structure. To overcome this barrier, we would need datasets from well-characterized cognitive tasks or neural disorders.  For example, Cole et al. 2017 showed that the harmonic structure of beta oscillations can explain the degree of Parkinson’s disease, and Hu et al. 2023 showed that the number of harmonic peaks can localize the seizure onset zone. Future studies could use the data from these two studies to study whether CHO can distinguish different harmonic structures of pathological neural oscillations.

      In this paper, we showed the basic identity of neural oscillations, encompassing elements such as the fundamental frequency and onset/offset. Your valuable insights contribute significantly to our ongoing efforts, and we appreciate your thoughtful consideration of these aspects. In response, we added a new paragraph in the Limitation of the discussion section as below:

      “Another limitation of this study is that it does not assess the harmonic structure of neural oscillations. Thus, CHO cannot distinguish between oscillations that have the same fundamental frequency but differ in their non-sinusoidal properties.  This limitation stems from the objective of this study, which is to identify the fundamental frequency of non-sinusoidal neural oscillations.  Overcoming this limitation requires further studies to improve CHO to distinguish between different non-sinusoidal properties of pathological neural oscillations.  The data that is necessary for these further studies could be obtained from the wide range of studies that have linked the harmonic structures in the neural oscillations to various cognitive functions (van Dijk et al., 2010; Schalk, 2015; Mazaheri and Jensen, 2008) and neural disorders (Cole et al., 2017; Jackson et al., 2019; Hu et al., 2023). For example, Cole et al. 2017 showed that a harmonic structure of beta oscillations can explain the degree of Parkinson’s disease, and Hu et al. 2023 showed the number of harmonic peaks can localize the seizure onset zone.”

      References:

      Beck AM, He M, Gutierrez R, Purdon PL. An iterative search algorithm to identify oscillatory dynamics in neurophysiological time series. bioRxiv. 2022. p. 2022.10.30.514422. doi:10.1101/2022.10.30.514422

      Brady B, Bardouille T. Periodic/Aperiodic parameterization of transient oscillations (PAPTO) – Implications for healthy ageing. Neuroimage. 2022;251: 118974.

      Fabus MS, Woolrich MW, Warnaby CW, Quinn AJ. Understanding Harmonic Structures Through Instantaneous Frequency. IEEE Open J Signal Process. 2022;3: 320-334.

      Jones SR, Pritchett DL, Sikora MA, Stufflebeam SM, Hämäläinen M, Moore CI. Quantitative analysis and biophysically realistic neural modeling of the MEG mu rhythm: rhythmogenesis and modulation of sensory-evoked responses. J Neurophysiol. 2009;102: 3554-3572.

      He M, Das P, Hotan G, Purdon PL. Switching state-space modeling of neural signal dynamics. PLoS Comput Biol. 2023;19: e1011395.

      Matsuda T, Komaki F. Time Series Decomposition into Oscillation Components and Phase Estimation. Neural Comput. 2017;29: 332-367.

      Quinn AJ, Lopes-Dos-Santos V, Huang N, Liang W-K, Juan C-H, Yeh J-R, et al. Within-cycle instantaneous frequency profiles report oscillatory waveform dynamics. J Neurophysiol. 2021;126: 1190-1208.

      Reviewer #2 (Public Review):

      Summary:

      A new toolbox is presented that builds on previous toolboxes to distinguish between real and spurious oscillatory activity, which can be induced by non-sinusoidal waveshapes. Whilst there are many toolboxes that help to distinguish between 1/f noise and oscillations, not many tools are available that help to distinguish true oscillatory activity from spurious oscillatory activity induced in harmonics of the fundamental frequency by non-sinusoidal waveshapes. The authors present a new algorithm which is based on autocorrelation to separate real from spurious oscillatory activity. The algorithm is extensively validated using synthetic (simulated) data, and various empirical datasets from EEG, intracranial EEG in various locations and domains (i.e. auditory cortex, hippocampus, etc.).

      Strengths:

      Distinguishing real from spurious oscillatory activity due to non-sinusoidal waveshapes is an issue that has plagued the field for quite a long time. The presented toolbox addresses this fundamental problem which will be of great use for the community. The paper is written in a very accessible and clear way so that readers less familiar with the intricacies of Fourier transform and signal processing will also be able to follow it. A particular strength is the broad validation of the toolbox, using synthetic, scalp EEG, ECoG, and stereotactic EEG in various locations and paradigms.

      Weaknesses:

      At many parts in the results section critical statistical comparisons are missing (e.g. FOOOF vs CHO). Another weakness concerns the methods part which only superficially describes the algorithm. Finally, a weakness is that the algorithm seems to be quite conservative in identifying oscillatory activity which may render it only useful for analysing very strong oscillatory signals (i.e. alpha), but less suitable for weaker oscillatory signals (i.e. gamma).

      Response: We thank Reviewer #2 for the assistance in improving this manuscript.  In the revised manuscript, we have added the missing statistical comparisons, detailed pseudo code, and a subsection that explains the hyper-parameters of CHO.  We also recognize the limitations of CHO in detecting gamma oscillations.  While our results demonstrate beta-band oscillations in ECoG and EEG signals (see Figures 5 and 6), we had no expectation to find gamma-band oscillations during a simple reaction time task.  This is because of the general absence of ECoG electrodes over the occipital cortex, where such gamma-band oscillations may be found. 

      Nevertheless, our CHO method should be able to detect gamma-band oscillations.  This is because if there are gamma-band oscillations, they will be reflected as a bump over the 1/f fit in the power spectrum, and CHO will detect them.  We apologize for not specifying the frequency range of the synthetic non-sinusoidal oscillations.  The gamma band was also included in our simulation. We added the frequency range (1-40 Hz) of the synthetic non-sinusoidal oscillations in the subsection, the caption of Figure 4, and the result section.
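
      The idea of a "bump over the 1/f fit" can be sketched with a toy spectrum.  This is a simplified illustration rather than the procedure used by CHO or FOOOF: the function names, the ordinary least-squares fit in log-log space, the synthetic spectrum, and the 0.3 log-unit threshold are all assumptions made for the example.

```python
import math


def fit_loglog_line(freqs, power):
    """Least-squares line through (log10 f, log10 P), i.e. a 1/f^k fit."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def bumps_over_fit(freqs, power, min_excess=0.3):
    """Frequencies whose log-power exceeds the fitted line by min_excess."""
    slope, intercept = fit_loglog_line(freqs, power)
    return [
        f for f, p in zip(freqs, power)
        if math.log10(p) - (intercept + slope * math.log10(f)) > min_excess
    ]


# Toy spectrum: 1/f background plus a Gaussian bump centered at 35 Hz.
freqs = list(range(1, 41))
power = [1.0 / f + 0.1 * math.exp(-0.5 * ((f - 35) / 2.0) ** 2) for f in freqs]

flagged = bumps_over_fit(freqs, power)
print(flagged)
```

      Because the bump itself tilts an ordinary least-squares fit, a robust or iterative fit would be preferable in practice; the sketch only shows that a gamma-band peak appears as a positive residual above the aperiodic component.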

      Reviewer #1 (Recommendations For The Authors):

      (1) The example of a sinusoidal neural oscillation in Fig 1 seems to still exhibit a great deal of non-sinusoidal behavior. Although it is largely symmetrical, it has significant peak-trough symmetry as well as sharper peak structure than typical sinusoidal activity. Nevertheless, it has less harmonic structure than the example on the left. A more precisely-stated claim might be that non-sinusoidal behavior is not the distinguishing characteristic between the two, but rather the degree of harmonic structure.

      Response: We are grateful for this thoughtful observation. In response, we now recognize that the depicted example showcases pronounced peak-trough symmetry and sharpness, characteristics that might not be typically associated with sinusoidal behavior. We now better understand that the key differentiator between the examples lies not only in their non-sinusoidal behavior but also in their harmonic structure. To reflect this better understanding, we have refined our manuscript to more accurately articulate the differences in harmonic structure, in accordance with your suggestion. Specifically, we revised the caption of Fig 1 in the manuscript as follows:

      The caption of the Fig 1G-L.

      “We applied the same statistical test to a more sinusoidal neural oscillation (G). Since this neural oscillation more closely resembles a sinusoidal shape, it does not exhibit any prominent harmonic peaks in the alpha and beta bands within the power spectrum (H) and time-frequency domain (I).  Consequently, our test found that the phase of the theta-band and beta-band oscillations were not phase-locked (J-L).  Thus, this statistical test suggests the absence of a harmonic structure.”

      (2) The statement "This suggests that most of the beta oscillations detected by conventional methods are simply harmonics of the predominant asymmetric alpha oscillation." is potentially overstated. It is important to constrain this statement to the auditory cortex in which the authors conduct the validation, because true beta still exists elsewhere. The same goes for the beta-gamma claim later on. In general, use of "may be" is also more advisable than the definitive "are".

      Response: We thank the reviewer for this thoughtful feedback. To avoid the potential overstatement of our findings we revised our statement on beta oscillations in the manuscript as follows:

      Discussion:

      “This suggests that most of the beta oscillations detected by conventional methods within auditory cortex may be simply harmonics of the predominant asymmetric alpha oscillation.”

      Reviewer #2 (Recommendations For The Authors):

      All my concerns are medium to minor and I list them as they appear in the manuscript. I do not suggest new experiments or a change in the results, instead I focus on writing issues only.

      a) Line 50: A reference to the seminal paper by Klimesch et al (2007) on alpha oscillations and inhibition would seem appropriate here.

      Response: We added the reference to Klimesch et al. (2007).

      b) Figure 4: It is unclear which length for the simulated oscillations was used to generate the data in panels B-G.

      Response: We generated oscillations that were 2.5 cycles in length and 1-3 seconds in duration. We added this information to the manuscript as follows.

      Figure 4:

      “We evaluated CHO by verifying its specificity, sensitivity, and accuracy in detecting the fundamental frequency of non-sinusoidal oscillatory bursts (2.5 cycles, 1–3 seconds long) convolved with 1/f noise.”

      Results (page 5, lines 163-165):

      “To determine the specificity and sensitivity of CHO in detecting neural oscillations, we applied CHO to synthetic non-sinusoidal oscillatory bursts (2.5 cycles, 1–3 seconds long) convolved with 1/f noise, also known as pink noise, which has a power spectral density that is inversely proportional to the frequency of the signal.”

      Methods (page 20, lines 623-626):

      “While empirical physiological signals are most appropriate for validating our method, they generally lack the necessary ground truth to characterize neural oscillation with sinusoidal or non-sinusoidal properties. To overcome this limitation, we first validated CHO on synthetic non-sinusoidal oscillatory bursts (2.5 cycles, 1–3 seconds long) convolved with 1/f noise to test the performance of the proposed method.”
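
      A synthetic test signal of this kind can be sketched as follows.  This is an illustrative construction, not the simulation code used in the study: the function names, the waveform (a fundamental plus two weaker harmonics), the 10 Hz frequency, the burst timing, the noise scaling, and the choice to add the burst to the noise are all assumptions.

```python
import math
import random


def pink_noise(n_samples, fs, rng, f_min=1.0):
    """Approximate 1/f noise as random-phase sinusoids with power ~ 1/f."""
    df = fs / n_samples
    freqs = []
    f = f_min
    while f < fs / 2:
        freqs.append(f)
        f += df
    phases = [rng.uniform(0, 2 * math.pi) for _ in freqs]
    return [
        sum(math.sin(2 * math.pi * f * i / fs + p) / math.sqrt(f)
            for f, p in zip(freqs, phases))
        for i in range(n_samples)
    ]


def nonsinusoidal_burst(n_samples, fs, f0, start_s, stop_s):
    """Burst made of a fundamental plus harmonics at 2*f0 and 3*f0."""
    out = []
    for i in range(n_samples):
        t = i / fs
        if start_s <= t < stop_s:
            theta = 2 * math.pi * f0 * (t - start_s)
            out.append(math.sin(theta) + 0.5 * math.sin(2 * theta)
                       + 0.25 * math.sin(3 * theta))
        else:
            out.append(0.0)
    return out


def power_at(x, freq, fs):
    """Squared magnitude of the discrete Fourier sum at a single frequency."""
    re = sum(v * math.cos(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
    return re * re + im * im


fs, n_samples, f0 = 200, 600, 10.0   # 3 s at 200 Hz (illustrative)
rng = random.Random(0)               # seeded for reproducibility
noise = pink_noise(n_samples, fs, rng)
burst = nonsinusoidal_burst(n_samples, fs, f0, start_s=1.0, stop_s=2.0)
signal = [b + 0.1 * nz for b, nz in zip(burst, noise)]

# Power of the first harmonic (20 Hz) relative to the fundamental (10 Hz).
harmonic_ratio = power_at(burst, 2 * f0, fs) / power_at(burst, f0, fs)
print(f"harmonic-to-fundamental power ratio: {harmonic_ratio:.2f}")
```

      The harmonic at 20 Hz carries a quarter of the power of the 10 Hz fundamental, so a purely spectral detector would report an apparent 20 Hz oscillation even though only one rhythm is present; this is the ground truth that such simulations provide.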

      c) Figure 5 - supplements: Would be good to re-organize the arrangement of the plots on these figures to facilitate the comparison between FOOOF and CHO (i.e. by presenting for each participant FOOOF and CHO together).

      Response: We combined Figure 5-supplementary figures 1 and 2 into Figure 5-supplementary figure 1, Figure 6-supplementary figures 1 and 2 into Figure 6-supplementary figure 1, and Figure 8-supplementary figures 1 and 2 into Figure 8-supplementary figure 1. 

      Author response image 1.

      Figure 5-supplementary figure 1:

      Author response image 2.

      Figure 6-supplementary figure 1:

      Author response image 3.

      Figure 8-supplementary figure 1:

      d) Statistics: Almost throughout the results section where the empirical results are described statistical comparisons are missing. For instance, in lines 212-213 the statement that CHO did not detect low gamma while FOOOF did is not backed up by the appropriate statistics. This issue is also evident in all of the following sections (i.e. EEG results, On-offsets of oscillations, SEEG results, Frequency and duration of oscillations). I feel this is probably the most important point that needs to be addressed.

      Response: We added statistical comparisons to Figure 5 (ECoG), 6 (EEG), and the results section as follows.

      Author response image 4.

      Validation of CHO in detecting oscillations in ECoG signals. A. We applied CHO and FOOOF to determine the fundamental frequency of oscillations from ECoG signals recorded during the pre-stimulus period of an auditory reaction time task. FOOOF detected oscillations primarily in the alpha- and beta-band over STG and pre-motor area.  In contrast, CHO detected alpha-band oscillations primarily within STG, and more focal beta-band oscillations over the pre-motor area, but not STG. B. We investigated the occurrence of each oscillation within defined cerebral regions across eight ECoG subjects. The horizontal bars and horizontal lines represent the median and median absolute deviation (MAD) of oscillations occurring across the eight subjects. An asterisk (*) indicates statistically significant differences in oscillation detection between CHO and FOOOF (Wilcoxon rank-sum test, p<0.05 after Bonferroni correction).

      Author response image 5.

      Validation of CHO in detecting oscillations in EEG signals. A. We applied CHO and FOOOF to determine the fundamental frequency of oscillations from EEG signals recorded during the pre-stimulus period of an auditory reaction time task.  FOOOF primarily detected alpha-band oscillations over frontal/visual areas and beta-band oscillations across all areas (with a focus on central areas). In contrast, CHO detected alpha-band oscillations primarily within visual areas and detected more focal beta-band oscillations over the pre-motor area, similar to the ECoG results shown in Figure 5. B. We investigated the occurrence of each oscillation within the EEG signals across seven subjects. An asterisk (*) indicates statistically significant differences in oscillation detection between CHO and FOOOF (Wilcoxon rank-sum test, p<0.05 after Bonferroni correction). CHO exhibited lower entropy values of alpha and beta occurrence than FOOOF across 64 channels. C. We compared the performance of FOOOF and CHO in detecting oscillations across visual and pre-motor-related EEG channels. CHO detected more alpha and beta oscillations in visual cortex than in pre-motor cortex. FOOOF detected more alpha and beta oscillations in visual cortex than in pre-motor cortex.

      We added additional explanations of our statistical results to the “Electrocorticographic (ECoG) results” and “Electroencephalographic (EEG) results” sections.

      “We compared neural oscillation detection rates between CHO and FOOOF across eight ECoG subjects.  We used FreeSurfer to determine the associated cerebral region for each ECoG location. Each subject performed approximately 400 trials of a simple auditory reaction-time task.  We analyzed the neural oscillations during the 1.5-second-long pre-stimulus period within each trial. CHO and FOOOF demonstrated statistically comparable results in the theta and alpha bands despite CHO exhibiting smaller median occurrence rates than FOOOF across eight subjects. Notably, within the beta band, excluding specific regions such as precentral, pars opercularis, and caudal middle frontal areas, CHO's beta oscillation detection rate was significantly lower than that of FOOOF (Wilcoxon rank-sum test, p < 0.05 after Bonferroni correction). This suggests comparable detection rates between CHO and FOOOF in premotor and Broca's areas, while the detection of beta oscillations by FOOOF in other regions, such as the temporal area, may represent harmonics of theta or alpha, as illustrated in Figure 5A and B. Furthermore, FOOOF exhibited a higher sensitivity in detecting delta, theta, and low gamma oscillations overall, although both CHO and FOOOF detected only a limited number of oscillations in these frequency bands.”
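
      The statistical comparison named above (Wilcoxon rank-sum test followed by Bonferroni correction) can be sketched in plain Python.  The implementation below uses the normal approximation without a tie correction of the variance, the function names are hypothetical, and the per-subject detection rates are invented placeholder numbers, not data from the study.  In practice, an established routine such as `scipy.stats.ranksums` would typically be used.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Ties receive average ranks; the variance is not corrected for ties.
    Returns the z statistic and the two-sided p-value.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank of the tied block
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return z, math.erfc(abs(z) / math.sqrt(2))


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Placeholder detection rates for eight subjects per method (not real data).
cho_alpha = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65]
fooof_alpha = [0.32, 0.37, 0.42, 0.47, 0.52, 0.57, 0.62, 0.67]
cho_beta = [0.05, 0.06, 0.08, 0.10, 0.11, 0.12, 0.14, 0.15]
fooof_beta = [0.30, 0.32, 0.35, 0.38, 0.40, 0.42, 0.45, 0.50]

raw = [rank_sum_test(cho_alpha, fooof_alpha)[1],
       rank_sum_test(cho_beta, fooof_beta)[1]]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

      In this toy example the alpha-band rates of the two methods overlap and are not significantly different, whereas the fully separated beta-band rates remain significant after correction.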

      “We assessed the difference in neural oscillation detection performance between CHO and FOOOF across seven EEG subjects.  We used EEG electrode locations according to the 10-10 electrode system and assigned each electrode to the appropriate underlying cortex (e.g., O1 and O2 for the visual cortex). Each subject performed 200 trials of a simple auditory reaction-time task.  We analyzed the neural oscillations during the 1.5-second-long pre-stimulus period. In the alpha band, CHO and FOOOF presented statistically comparable outcomes. However, CHO exhibited a greater alpha detection rate for the visual cortex than for the pre-motor cortex, as shown in Figures 6B and C. The entropy of CHO's alpha oscillation occurrences (3.82) was lower than that of FOOOF (4.15), with a maximal entropy across 64 electrodes of 4.16. Furthermore, in the beta band, CHO's entropy (4.05) was smaller than that of FOOOF (4.15). These findings suggest that CHO may offer a more region-specific oscillation detection than FOOOF.

      As illustrated in Figure 6C, CHO found fewer alpha oscillations in pre-motor cortex (FC2 and FC4) than in occipital cortex (O1 and O2), while FOOOF found more beta oscillation occurrences in pre-motor cortex (FC2 and FC4) than in occipital cortex. However, FOOOF found more alpha and beta oscillations in visual cortex than in pre-motor cortex.

      Consistent with ECoG results, FOOOF demonstrated heightened sensitivity in detecting delta, theta, and low gamma oscillations. 

      Nonetheless, both CHO and FOOOF identified only a limited number of oscillations in delta and theta frequency bands.

      Contrary to the ECoG results, FOOOF found more low gamma oscillations in EEG subjects than in ECoG subjects.”
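
      The entropy values quoted above can be related to a simple calculation.  The reported maximum of 4.16 for 64 electrodes matches ln(64), which suggests (as an inference from the numbers, not a statement from the manuscript) that Shannon entropy with the natural logarithm was computed over the distribution of oscillation occurrences across channels.  The function name and the example counts below are hypothetical.

```python
import math


def occurrence_entropy(counts):
    """Shannon entropy (in nats) of occurrence counts across channels."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Uniform occurrence across 64 channels gives the maximal entropy ln(64).
uniform = [10] * 64
h_uniform = occurrence_entropy(uniform)

# Hypothetical focal pattern: eight channels account for most detections.
focal = [10] * 8 + [1] * 56
h_focal = occurrence_entropy(focal)

print(f"uniform: {h_uniform:.2f}, focal: {h_focal:.2f}")
```

      Lower entropy indicates that detections are concentrated in fewer channels, which is the sense in which the response describes CHO as more region-specific than FOOOF.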

      e) Line 248: The authors find an oscillatory signal in the hippocampus with a frequency at around 8 Hz, which they refer to as alpha. However, several researchers (including myself) may label this fast theta, according to the previous work showing the presence of fast and slow theta oscillations in the human hippocampus (https://pubmed.ncbi.nlm.nih.gov/21538660/, https://pubmed.ncbi.nlm.nih.gov/32424312/).

      Response: We replaced “alpha” with “fast theta” in the figure and text. We added a citation for Lega et al. 2012.

      f) Line 332: It could also be possible that the auditory alpha rhythms don’t show up in the EEG because a referencing method was used that was not ideal for picking it up. In general, re-referencing is an important preprocessing step that can make the EEG be more susceptible to deep or superficial sources and that should be taken into account when interpreting the data.

      Response: We re-referenced our signals using a common median reference (see Methods section). After close inspection of our results, we found that the EEG topography shown in Figure 6 did not show the auditory alpha oscillation because the alpha power of visual locations greatly exceeded that of those locations that reflect oscillations in the auditory cortex. Further, while our statistical analysis shows that CHO detected auditory alpha oscillations, this analysis also shows that CHO detected significantly more visual alpha oscillations.
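
      For readers unfamiliar with the re-referencing step, a common median reference can be sketched as below.  This assumes the usual definition (subtracting, at every time point, the median across all channels); the function name and the toy data are illustrative, and the exact procedure in the Methods section may differ in details such as channel selection.

```python
import statistics


def common_median_reference(data):
    """Subtract the per-sample median across channels from every channel.

    `data` is a list of channels, each a list of samples of equal length.
    """
    n_samples = len(data[0])
    medians = [statistics.median(ch[i] for ch in data) for i in range(n_samples)]
    return [[ch[i] - medians[i] for i in range(n_samples)] for ch in data]


# Three toy channels with three samples each.
raw = [
    [1, 2, 3],
    [2, 4, 6],
    [10, 0, -10],
]
rereferenced = common_median_reference(raw)
print(rereferenced)
```

      Compared with a common average reference, the median is less affected by a single channel with extreme values, such as the third channel in this toy example.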

      g) Line 463: It seems that the major limitation of the algorithm lies in its low sensitivity which is discussed by the authors. The authors seem to downplay this a bit by saying that the algorithm works just fine at SNRs that are comparable to alpha oscillations. However, alpha is the strongest signal in human EEG which may make the algorithm less suitable for picking up less prominent oscillatory signals, i.e. gamma, theta, ripples, etc. Is CHO only seeing the ‘tip of the iceberg’?

      Response: We performed the suggested analysis. For the theta band, this analysis generated convincing statistical results for ECoG signals (Figures 5, 6, and the results section). For theta oscillation detection, we found no statistical difference between CHO and FOOOF. Since FOOOF has a high sensitivity even under low SNRs (as shown in our simulation), our analysis suggests that CHO and FOOOF should perform equally well in the detection of theta oscillations, even when the theta oscillation amplitude is small.

      To validate the ability of CHO to detect oscillations in high-frequency bands (> 40 Hz), such as gamma oscillations and ripples, our follow-up study is applying CHO in the detection of high-frequency oscillations (HFOs) in electrocorticographic signals recorded during seizures. To this end, our follow-up study analyzed 26 seizures from six patients. In this analysis, CHO showed sensitivity and specificity similar to those of the epileptogenicity index (EI), which is the most commonly used method to detect seizure onset times and zones. The results of this follow-up study were presented at the American Epilepsy Society Meeting in December of 2023, and we are currently preparing a manuscript for submission to a peer-reviewed journal.

      In this study, we wanted to investigate the performance of CHO in detecting the most prominent neural oscillations (e.g., alpha and beta). Future studies will investigate the performance of CHO in detecting more difficult-to-observe oscillations (delta in sleep stages, theta in the hippocampus during memory tasks, and high-frequency oscillations or ripples in seizure or interictal data).

      h) Methods: The methods section, especially the one describing the CHO algorithm, is lacking a lot of detail that one usually would like to see in order to rebuild the algorithm themselves. I appreciate that the code is available freely, but that does not, in my opinion, relieve the authors of their duty to describe in detail how the algorithm works. This should be fixed before publishing.

      Response: We now present pseudo code to describe the algorithms within the new subsection on the hyper-parameterization of CHO.

      See Author response table 1.

      A new subsection titled “Tradeoffs in adjusting the hyper-parameters that govern the detection in CHO.”

      “The ability of CHO to detect neural oscillations and determine their fundamental frequency is governed by four principal hyper-parameters.  Adjusting these parameters requires understanding their effect on the sensitivity and specificity in the detection of neural oscillations. 

      The first hyper-parameter is the number of time windows (N in Line 5 in Algorithm 1) that is used to estimate the 1/f noise. In our performance assessment of CHO, we used four time windows, resulting in estimation periods of 250 ms in duration for each 1/f spectrum. A higher number of time windows results in smaller estimation periods and thus minimizes the likelihood of observing multiple neural oscillations within this time window, which otherwise could confound the 1/f estimation. However, a higher number of time windows and, thus, smaller time estimation periods may lead to unstable 1/f estimates.

      The second hyper-parameter defines the minimum number of cycles of a neural oscillation to be detected by CHO (see Line 23 in Algorithm 1). In our study, we specified this parameter to be two cycles. Increasing the number of cycles increases specificity, as it will reject spurious oscillations. However, increasing the number also decreases sensitivity, as it will reject short oscillations.

      The third hyper-parameter is the significance threshold that selects positive peaks within the auto-correlation of the signal.  The magnitude of the peaks in the auto-correlation indicates the periodicity of the oscillations (see Line 26 in Algorithm 1).  Referred to as "NumSTD," this parameter denotes the number of standard errors that a positive peak has to exceed to be selected to be a true oscillation.  For this study, we set the "NumSTD" value to 1 (the approximate 68% confidence bounds).  Increasing the "NumSTD" value increases specificity in the detection as it reduces the detection of spurious peaks in the auto-correlation.  However, increasing the "NumSTD" value also decreases the sensitivity in the detection of neural oscillations with varying instantaneous oscillatory frequencies. 

      The fourth hyper-parameter is the percentage of overlap between two bounding boxes that trigger their merger (see Line 31 in Algorithm 1).  In our study, we set this parameter to 75% overlap.  Increasing this threshold yields more fragmentation in the detection of oscillations, while decreasing this threshold may reduce the accuracy in determining the onset and offset of neural oscillations.”
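
      The four hyper-parameters described above can be collected in a small configuration object. The following is a minimal Python sketch, not the authors' implementation: the class and function names are hypothetical, the 1 s segment length is inferred from four windows of 250 ms, and the overlap measure (intersection area relative to the smaller bounding box) is an assumption, since the text does not define how overlap is computed.

```python
from dataclasses import dataclass
from typing import Tuple

# A bounding box in time-frequency space: (t_start_s, t_end_s, f_low_hz, f_high_hz).
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class ChoHyperParams:
    """Hypothetical container for the four hyper-parameters (defaults from the text)."""
    n_time_windows: int = 4       # windows used to estimate the 1/f noise
    min_cycles: float = 2.0       # minimum number of cycles of an oscillation
    num_std: float = 1.0          # "NumSTD": standard errors a peak must exceed
    merge_overlap: float = 0.75   # overlap fraction that triggers a merger


def estimation_period_s(segment_s: float, params: ChoHyperParams) -> float:
    """Duration of each 1/f estimation period for a segment of `segment_s` seconds."""
    return segment_s / params.n_time_windows


def min_duration_s(freq_hz: float, params: ChoHyperParams) -> float:
    """Shortest oscillation at `freq_hz` that spans the required number of cycles."""
    return params.min_cycles / freq_hz


def peak_is_significant(peak: float, std_err: float, params: ChoHyperParams) -> bool:
    """True if a positive auto-correlation peak exceeds NumSTD standard errors."""
    return peak > 0 and peak > params.num_std * std_err


def _area(box: Box) -> float:
    return (box[1] - box[0]) * (box[3] - box[2])


def overlap_fraction(a: Box, b: Box) -> float:
    """Intersection area divided by the smaller box area (assumed definition)."""
    dt = min(a[1], b[1]) - max(a[0], b[0])
    df = min(a[3], b[3]) - max(a[2], b[2])
    if dt <= 0 or df <= 0:
        return 0.0
    return (dt * df) / min(_area(a), _area(b))


def should_merge(a: Box, b: Box, params: ChoHyperParams) -> bool:
    """True if two bounding boxes overlap enough to be merged."""
    return overlap_fraction(a, b) >= params.merge_overlap
```

      Under these assumptions, raising `min_cycles` or `num_std` rejects more candidates (higher specificity, lower sensitivity), while raising `merge_overlap` makes mergers rarer and so yields more fragmented detections, which mirrors the tradeoffs stated in the subsection.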

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors investigate the tolerance of aminoglycosides in E. coli mutants deleted in the Krebs cycle and respiratory chain enzymes. The motivation for this study is unclear. Transport of aminoglycosides is pmf-dependent, as the authors correctly note, and knocking out energy-producing components leads to tolerance of aminoglycosides, this has been well established. In S. aureus, clinically relevant "small colony" strains selected for in the course of therapy with aminoglycosides acquire null mutations in the biosynthesis of heme or ubiquinone, and have been studied in detail. In E. coli, such knockouts have not been reported in clinical isolates, probably due to severe fitness costs.

      Response: We sincerely appreciate the time and consideration the reviewer dedicated to evaluating our manuscript. It's important to highlight that while the transport of aminoglycosides is PMF-dependent, recent studies underscore the potential role of metabolic mutations in antibiotic tolerance, a facet that warrants further investigation. For instance, the study by Heinemann’s and Michiels' groups explored genomic changes in E. coli strains (including uropathogenic UTI89 strains) subjected to daily antibiotic exposure (Van den Bergh et al., 2022). Notably, mutations predominantly occurred in genes of the nuo operon, a key component of E. coli energy metabolism, suggesting a link between metabolic adaptations and antibiotic tolerance. Furthermore, the research by Collins’ group revealed previously unrecognized genes related to central metabolism (e.g., icd, gltD, sucA) that contribute to antibiotic resistance in E. coli cells exposed to multiple antibiotics, including aminoglycosides (Lopatkin et al., 2021). These findings are corroborated by the presence of similar mutations in clinical E. coli pathogens, as evidenced by the analysis of a large library of 7243 E. coli genomes from NCBI Pathogen Detection (Lopatkin et al., 2021). The clinical relevance of metabolic mutations in antibiotic tolerance is increasingly recognized, yet their underlying mechanisms remain enigmatic. Therefore, elucidating the role of metabolic pathways in conferring antibiotic tolerance is highly critical. We have updated the introduction to clearly convey our motivation in this study (see page 4).

      At the same time, single-cell analysis has shown that individual cells with a decrease in the expression of Krebs cycle enzymes are tolerant of antibiotics and have lower ATP (Manuse et al., PLoS Biol 19: e3001194). The authors of the study under review report that knocking out ICD, isocitrate dehydrogenase that catalyzes the rate-limiting step in the Krebs cycle, has little effect on aminoglycoside tolerance and actually leads to an increase in the level of ATP over time. This observation does not seem to make much sense and contradicts previous reports, specifically that E. coli ICD is tolerant of antibiotics and, not surprisingly, produces less ATP (Kabir and Shimizu, Appl Microbiol Biotechnol. 2004; 65(1):84-96; Manuse et al., PLoS Biol 19: e3001194). Mutations in other Krebs cycle enzymes, unlike ICD, do lead to a dramatic increase in tolerance of aminoglycosides according to the paper under review. This is all very confusing.

      Response: Although our data cannot be directly compared to that of Kabir and Shimizu (Mohiuddin Kabir and Shimizu, 2004), due to the utilization of entirely different experimental procedures and measurement techniques, we can draw some parallels to the study conducted by Lewis’ group (Manuse et al., 2021), despite certain differences in experimental protocols. Furthermore, the reviewer has made strong assertions regarding our manuscript based on the findings of Lewis’ group. Thus, we believe it's pertinent to expand our response regarding that study.

      In the study of Lewis’ group, bacterial cells were inoculated at a ratio of 1:100 into LB medium from an overnight culture (approximately 16 hours). Subsequently, the cultures were incubated at 37°C for approximately 2 hours, and ATP levels were measured using the BacTiter Glo kit (Promega, Madison, WI, USA). ATP levels were then normalized to cell density, determined through optical density measurements, and represented on a linear diagram. As demonstrated in Supplementary Figure S1c of their paper, there was a 10-15% reduction in normalized ATP levels in the icd mutant compared to the wild type. In our experiments, cells were grown for 24 hours in overnight cultures, diluted 100-fold in fresh media, and ATP levels were measured at 3, 4, 5, and 6 hours using the same kit. ATP levels were normalized to cell counts quantified by flow cytometry. Upon analyzing our data of the icd mutant for around 3 hours (the time point closest to that of the study of Lewis’ group), we observed a reduction of approximately 15-20% (without statistical significance) in the icd mutant compared to the wild-type (see raw data, linear plot, and logarithmic plot below; Author response image 1), which aligns with the findings of Lewis’ group.
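
      The normalization and comparison described above reduce to simple arithmetic, sketched below in Python. The function names and all numeric readings are hypothetical and chosen only to illustrate the calculation; they are not data from either study.

```python
def normalized_atp(luminescence: float, cell_count: float) -> float:
    """ATP signal per cell: raw luminescence divided by the cell count."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return luminescence / cell_count


def percent_reduction(mutant: float, wild_type: float) -> float:
    """Reduction of the mutant value relative to the wild type, in percent."""
    if wild_type <= 0:
        raise ValueError("wild_type must be positive")
    return 100.0 * (wild_type - mutant) / wild_type


# Hypothetical readings (arbitrary luminescence units, cells counted by flow cytometry).
wt_atp = normalized_atp(luminescence=1.0e6, cell_count=2.0e7)
icd_atp = normalized_atp(luminescence=8.2e5, cell_count=2.0e7)

# With these illustrative numbers the reduction is approximately 18%.
reduction = percent_reduction(icd_atp, wt_atp)
```

      The same two steps apply whether the denominator is an optical density or a flow-cytometry cell count; only the normalization quantity differs between the two protocols compared above.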

      We further investigated the gentamicin tolerance of both wild-type and icd mutant strains of E. coli BW25113 (Author response image 2). Our findings indicate that the increased sensitivity of the icd mutant of the MG1655 strain to gentamicin is similar to the observation in the other E. coli strain.

      Author response image 1.

      ATP levels in the icd mutant. ATP levels of both the mutant and wild-type strains were measured at t=3 hours of cell growth and normalized to cell counts. The figure presents the raw data (a), linear plot (b), and logarithmic plot (c) of the same dataset. This data corresponds to the first panel of Figure 3B in the manuscript.

      Author response image 2.

      Gentamicin tolerance of wild-type and icd mutant strains of E. coli BW25113. Both wild type and mutant strains were treated with gentamicin (50 µg/ml) for 5 hours at the mid-exponential phase. Cells were plated before and after treatment for CFU/ml counts. The dashed line represents the limit of detection. CFU: Colony forming units.
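
      A survival fraction from CFU counts taken before and after treatment can be computed as in the following Python sketch. The function names, the plating parameters, and the choice to floor post-treatment counts at the limit of detection are assumptions made for illustration; the response does not describe how values below the detection limit were handled.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """CFU/ml from a colony count, the fold-dilution plated, and the plated volume."""
    if plated_volume_ml <= 0:
        raise ValueError("plated_volume_ml must be positive")
    return colonies * dilution_factor / plated_volume_ml


def survival_fraction(cfu_after: float, cfu_before: float, limit_of_detection: float) -> float:
    """Surviving fraction; counts below the detection limit are floored at it (assumed)."""
    if cfu_before <= 0:
        raise ValueError("cfu_before must be positive")
    return max(cfu_after, limit_of_detection) / cfu_before


# Hypothetical plate counts, for illustration only.
before = cfu_per_ml(colonies=50, dilution_factor=1e5, plated_volume_ml=0.1)
after = cfu_per_ml(colonies=20, dilution_factor=1e2, plated_volume_ml=0.1)
fraction = survival_fraction(after, before, limit_of_detection=10.0)
```

      Flooring at the detection limit keeps the fraction from being reported as zero when no colonies are recovered, which is one common way to respect the dashed limit-of-detection line on a logarithmic plot.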

      We think that there are two primary reasons why our study cannot contradict the findings of the Lewis group:

      Firstly, our study cannot be directly compared to theirs, as they did not comprehensively explore the impact of gene deletions on cell metabolism beyond the measurement of ATP levels at a single time point (Manuse et al., 2021). Our study encompasses various metabolic parameters such as cellular ATP, redox status, proton motive force (PMF), intracellular pH, and drug uptake throughout the exponential and/or early stationary phase. Additionally, we conducted proteomic analysis for five different strains including mutants and wild type. Moreover, we performed pathway enrichment analysis grounded in the statistical background of the entire genome, encompassing various functional pathway classification frameworks such as Gene Ontology annotations, KEGG pathways, and Uniprot keywords. The results of these pathway enrichment analyses are now available in the Supplementary File (see Supplementary Tables 11-17 in the current manuscript). Thus, we believe it is unjust to deem our study contradictory compared to the Lewis group's study, which does not have a comprehensive analysis of the metabolism of the mutant strains they investigated.

      Secondly, our study cannot be compared to that specific study (Manuse et al., 2021) due to the utilization of a distinct antibiotic (ciprofloxacin). Cell tolerance is heavily reliant on the mechanism of action of the antibiotic used. Therefore, the reviewer should have focused on studies closely related to aminoglycoside tolerance. Our study is not confusing or contradictory, as Lewis’ group also demonstrated that the tolerance of the icd mutant to gentamicin was significantly reduced while the tolerance of other TCA cycle mutant strains was increased in a different study (Shan et al., 2015). However, they did not delve into the metabolism of these mutant strains, as we did. We now mention this point in our manuscript (see pages 14-15).

      Apart from the confusing data, it is not clear what useful information may be obtained from the choice of the experimental system. The authors examine exponentially growing cells of E. coli for tolerance of aminoglycosides. The population at this stage of growth is highly susceptible to aminoglycosides, and only some rare persister cells can survive. However, the authors do not study persisters. A stationary population of E. coli is tolerant of aminoglycosides, and this is clinically relevant, but this is not the subject of the study.

      Response: Respectfully, we must express our disagreement with the reviewer's comments. Our experimental system is meticulously organized and logically structured. Mutant strains such as gltA, sucA, and nuoI deletions exhibit increased tolerance to all aminoglycosides tested, with their fractions clearly increasing around the mid-exponential phase between 3-4 hours (refer to Figure 2B in our manuscript). This surge in tolerance is evident at the population level as well (as depicted in Figure 1A in our manuscript, where certain mutant strains demonstrate complete survival to streptomycin, with survival fractions nearing 1). Given the pronounced increase observed around the mid-exponential phase, we primarily characterize the metabolism of these cells during this growth phase.

      It's essential to note that any investigation into antibiotic tolerance and/or resistance holds immense significance, regardless of the growth phase under scrutiny, as antibiotic tolerance/resistance poses a substantial healthcare challenge. Additionally, metabolic mutant strains do not necessarily entail severe fitness costs, as evidenced by Figure S2A published by the Lewis group (Manuse et al., 2021), a finding consistent with our study (see Figure 2B in our manuscript). This phenomenon could confer a survival advantage to bacterial cells, as they may acquire metabolic mutations to bolster their tolerance without incurring significant fitness costs. Furthermore, numerous studies suggest that bacterial cells may opt for the evolutionary pathway leading to increased tolerance before acquiring resistance mechanisms (Levin-Reisman et al., 2017; Santi et al., 2021). The presence of metabolic mutations in clinical E. coli pathogens has also been confirmed through the analysis of a large library of 7243 E. coli genomes from NCBI Pathogen Detection by Collins’ group (Lopatkin et al., 2021). Consequently, comprehending the tolerance mechanisms of metabolic mutations holds paramount importance.

      References

      Levin-Reisman I, Ronin I, Gefen O, Braniss I, Shoresh N, Balaban NQ. 2017. Antibiotic tolerance facilitates the evolution of resistance. Science 355:826–830. doi:10.1126/science.aaj2191

      Lopatkin AJ, Bening SC, Manson AL, Stokes JM, Kohanski MA, Badran AH, Earl AM, Cheney NJ, Yang JH, Collins JJ. 2021. Clinically relevant mutations in core metabolic genes confer antibiotic resistance. Science 371. doi:10.1126/science.aba0862

      Manuse S, Shan Y, Canas-Duarte SJ, Bakshi S, Sun WS, Mori H, Paulsson J, Lewis K. 2021. Bacterial persisters are a stochastically formed subpopulation of low-energy cells. PLoS Biol 19. doi:10.1371/journal.pbio.3001194

      Mohiuddin Kabir M, Shimizu K. 2004. Metabolic regulation analysis of icd-gene knockout Escherichia coli based on 2D electrophoresis with MALDI-TOF mass spectrometry and enzyme activity measurements. Appl Microbiol Biotechnol 65:84–96. doi:10.1007/s00253-004-1627-1

      Santi I, Manfredi P, Maffei E, Egli A, Jenal U. 2021. Evolution of Antibiotic Tolerance Shapes Resistance Development in Chronic Pseudomonas aeruginosa Infections. doi:10.1128/mBio.03482-20

      Shan Y, Lazinski D, Rowe S, Camilli A, Lewis K. 2015. Genetic basis of persister tolerance to aminoglycosides in Escherichia coli. mBio 6. doi:10.1128/mBio.00078-15

      Van den Bergh B, Schramke H, Michiels JE, Kimkes TEP, Radzikowski JL, Schimpf J, Vedelaar SR, Burschel S, Dewachter L, Lončar N, Schmidt A, Meijer T, Fauvart M, Friedrich T, Michiels J, Heinemann M. 2022. Mutations in respiratory complex I promote antibiotic persistence through alterations in intracellular acidity and protein synthesis. Nat Commun 13:546. doi:10.1038/s41467-022-28141-x

      Reviewer #2 (Public Review):

      Summary:

      This interesting study challenges a dogma regarding the link between bacterial metabolism decrease and tolerance to aminoglycosides (AG). The authors demonstrate that mutants well-known for being tolerant to AG, such as those of complexes I and II, are not so due to a decrease in the proton motive force (PMF) and thus antibiotic uptake, as previously reported in the literature.

      Strengths:

      This is a complete study. These results are surprising and are based on various read-outs, such as ATP levels, pH measurement, membrane potential, and the uptake of fluorophore-labeled gentamicin. Utilizing a proteomic approach, the authors show instead that in tolerant mutants, there is a decrease in the levels of proteins associated with ribosomes (targets of AG), causing tolerance.

      Response: We sincerely appreciate the reviewer for taking the time to read our manuscript and offer valuable suggestions.

      Weaknesses:

      The use of a single high concentration of aminoglycoside: my main comment on this study concerns the use of an AG concentration well above the MIC (50 µg/ml or 25 µg/ml for uptake experiments), which is 10 times higher than previously used concentrations (Kohanski, Taber) in studies showing a link with PMF. This significant difference may explain the discrepancies in results. Indeed, a high concentration of AG can mask the effects of a metabolic disruption and lead to less specific uptake. However, this concentration highlights a second molecular level of tolerance. Adding experiments using lower concentrations (we propose 5 µg/ml to compare with the literature) would provide a more comprehensive understanding of AG tolerance mechanisms during a decrease in metabolism.

      Another suggestion would be to test iron limitation (using an iron chelator such as DIP), which has been shown to induce AG tolerance. Can the authors demonstrate if this iron limitation leads to a decrease in ribosomal proteins? This experiment would validate their hypothesis in the case of a positive result. Otherwise, it would help distinguish two types of molecular mechanisms for AG tolerance during a metabolic disruption: (i) PMF and uptake at low concentrations, (ii) ribosomal proteins at high concentrations.

      Response: While we acknowledge the intriguing possibility of exploring whether iron limitation results in a reduction of ribosomal proteins, we believe that this topic falls slightly outside the scope of our current study. This area warrants independent investigation since our current research did not specifically focus on iron-limited environments (LB medium is iron-rich, as referenced (Abdul-tehrani et al., 1999; Rodríguez-Rojas et al., 2015)). However, we fully concur with the notion that experimental outcomes may be contingent upon the concentration of aminoglycosides (AG). Hence, we repeated the critical experiments using a lower concentration of gentamicin (5 µg/mL), as suggested by the reviewer. Before delving into a discussion of these results, we wish to emphasize two key points. Firstly, the majority of our metabolic measurements, including ATP levels, redox activities, intracellular pH, and metabolomics, were conducted in mutant and wild-type cells in the absence of drugs. Our objective was to elucidate the impact of genetic perturbations of the TCA cycle on cell metabolism. Secondly, it's important to emphasize that our study does not invalidate the hypothesis that AG uptake is proton motive force (PMF)-dependent. We observed similar drug uptake across the strains tested, which is reasonable considering that their energy metabolism and PMF are not significantly altered compared to the wild type (at least we did not observe a consistent trend in their metabolic levels). Consequently, our study does not necessarily contradict previous claims (Taber Harry W et al., 1987). We have now clarified this point in the manuscript (see pages 1 and 13).

      When we employed a lower gentamicin concentration, we still noted a significant elevation in tolerance among the gltA, sucA, and nuoI mutant strains compared to the wild type. Also, it remained evident that the observed tolerance in the mutant strains cannot be ascribed to differences in drug uptake or impaired PMF, as the levels of drug uptake and the disruption of PMF by gentamicin (at lower concentrations) in the mutant strains were comparable to those of the wild type. Moreover, since our metabolic measurements and proteomics analyses failed to reveal any notable alterations in energy metabolism in these strains, the consistency in drug uptake levels across both mutant and wild-type strains, even at lower concentrations, further bolsters the validity of our findings obtained at higher gentamicin concentrations. The new results have been incorporated into the Supplementary file (see Supplementary Figures S1, S5, S7, and S9) and discussed throughout the manuscript.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Line 120: Luria-Bertani (LB): use Lysogeny Broth.

      Line 180: "RSG dye can be reduced by bacterial reductases of PMF" to be reformulated.

      Response: The suggested corrections have been incorporated into the manuscript.

      References

      Abdul-tehrani H, Hudson AJ, Chang Y, Timms AR, Hawkins C, Williams JM, Harrison PM, Guest JR, Andrews SC. 1999. Ferritin Mutants of Escherichia coli Are Iron Deficient and Growth Impaired, and fur Mutants are Iron Deficient. Journal of Bacteriology.

      Rodríguez-Rojas A, Makarova O, Müller U, Rolff J. 2015. Cationic Peptides Facilitate Iron-induced Mutagenesis in Bacteria. PLoS Genet 11. doi:10.1371/journal.pgen.1005546

      Taber Harry W, Mueller JP, Miller PF, Arrow AS. 1987. Bacterial Uptake of Aminoglycoside Antibiotics. Microbiol Rev 51:439–457. doi:10.1128/mr.51.4.439-457.1987

    1. eLife assessment

      In this study, the authors found that a species of aphid that is a known agricultural pest salivated longer and produced more honeydew when feeding at night. The authors identified aphid genes with diurnal expression patterns, including potential saliva-related genes. Silencing these genes reduced aphid performance only on real plants, suggesting a specific role in plant feeding. While this study is valuable for understanding plant-insect interactions in agriculture, it is currently incomplete, as further research is needed to elucidate the function of the identified genes.

    2. Reviewer #1 (Public Review):

      Summary:

      This study presents valuable data on diurnal patterns in aphid (Rhopalosiphum padi) feeding behavior and transcriptome profiles. The authors measured honeydew production by the aphids on plants and artificial diet during the day and night and conducted a comprehensive feeding behavior study using EPG with many biological replicates at 6 time-points in 24 hours. They also conducted transcriptome analyses of three samples of 30 aphids each at these time points. Differentially expressed transcripts were grouped into four clusters with distinct expression patterns. The expression of two genes found to be diurnally rhythmic was knocked down with RNAi and these aphids did less well, especially at night. They also analyzed the differential expression of candidate effector genes and found rhythmic ones to be enriched for more expression in aphid heads versus bodies - this pattern is expected given that effectors are most likely expressed in the salivary glands. Knockdown of a known effector (C002) that is diurnally rhythmic, and a novel effector gene, was found to alter aphid feeding dynamics and performance.

      Strengths:

      The manuscript was highly accessible, with clear writing, and the figures provided were both comprehensive and of good quality. The datasets generated from this research are valuable to the research field, especially the findings for honeydew secretion, EPG analysis, and transcriptome experiments.

      The datasets generated in this study will be useful to scientists working on aphids and aphid-plant interactions and will inform similar studies on other insect species.

      Weaknesses:

      The weaknesses mainly relate to the (depth of) analyses and interpretation of the data. Also, some methods require more explanation, as follows:

      In Figure 1, data show that aphids produce more honeydew at night than during the day. This suggests that the aphids ingest more phloem (E2 phase). However, in Figure 1d the duration of the E2 phase does not show obvious differences among the time points in the 24 hours. The authors offer the explanation that the aphids may osmoregulate more during the night, leading to more honeydew secretion at night. This may be the case, but there could be other explanations. For example, the physiology, including regulation of water transport, of plants is known to change during night/day. The authors may focus this section more on the differences in the E1 phase, as this involves the delivery of aphid saliva and effectors into the plant phloem.

      Transcriptome data shown in Figure 2 (and the experimental procedure of Figure 5b) appears to be based on three biological replicates. However, these replicates appear to have been harvested at the same time in the experiment, and this makes them technical replicates, not biological replicates. The inclusion of true biological replicates that include samples from time series experiments done on different days should be considered.

      The authors conducted knockdown experiments targeting aquaporin 1 and gut sucrase 1 in aphids, resulting in reduced nymph production and decreased honeydew secretion. It is concluded that these results indicate significant roles of aquaporin 1 and gut sucrase 1 in diurnal regulation. However, it is essential to consider that these genes likely play crucial roles in aphid physiology beyond diurnal rhythms. Consequently, reduced expression would naturally impair aphid performance. The dsAQP1 and dsSUC1 aphids consistently produced less honeydew, regardless of the time of day, indicating a broader impact of gene knockdown. The observed increase of the phenotype at night may not be attributable to the specific roles of these genes in diurnal regulation but rather due to heightened aphid activity during that time (as evidenced by increased honeydew secretion) that could magnify the impact of the knockdown effect, making it easier to observe. Therefore, the knockdown of aquaporin 1 and gut sucrase 1 may exert a general negative influence on aphid fitness, independently of diurnal factors.

      To analyze the roles of genes in diurnal regulation, additional controls should be incorporated. This could involve the knockdown of genes with essential functions that are not influenced by diurnal rhythms, providing a baseline comparison. Furthermore, consider including genes known to be involved in diurnal regulation in other insects, as documented in the existing literature, in the experimental design.

      The same arguments as for aquaporin 1 and gut sucrase 1 above may be made for knockdown of effector genes (Figure 4). It has already been shown that knockdown of C002 impacts aphid performance, and the data herein may be explained by a general lower performance of aphids rather than a specific function of these effectors in diurnal regulation. It is also expected that knockdown of the effectors has less impact on aphids feeding from artificial diets. This does not necessarily indicate the role of the effectors in diurnal regulation.

      In the abstract and elsewhere, the authors assert priority by stating, "...the first evidence of...". However, it's important to note that priority claims are often challenging to verify across many fields. Instead of relying solely on claims of precedence, the evidence presented in the research could stand on its own merit.

      Conclusion:

      The study presents intriguing new findings, particularly in the realms of honeydew analysis, EPG, and transcriptome analysis. However, the interpretation of subsequent studies employing gene knockdowns needs further consideration.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors conducted a time-course of whole-body transcriptional analysis of a pest aphid, Rhopalosiphum padi, and identified four major clusters of the genes that show diurnal rhythmicity in transcription. In addition, they conducted the analysis of aphid feeding behaviour and showed that aphids salivate longer from the end of the day toward the beginning of the night while their phloem feeding time does not change throughout the day. The genes up-regulated at night time were enriched with the genes involved in metabolic activities, corroborating the results showing greater honeydew excretion at night. The authors identified the list of candidate salivary genes that show diurnal rhythmicity in the transcription and silenced a salivary gene C002 and the candidate salivary gene E8696. Silencing of these genes reduced aphid fecundity and survival rate on the host plant but not on the artificial diet.

      Strengths:

      The time-course transcription study and its analysis will be of interest to researchers studying diurnal rhythms in insect biology. Also, the analysis of aphid feeding behaviour at different times of day is interesting. This study provides valuable resources for those who study insect biology.

      Weaknesses:

      It is not clear to me which data was used to define the putative salivary effectors for R. padi, but the candidate salivary gene list made by Thorpe et al consists of the aphid genes encoding secreted proteins that are up-regulated in the head samples compared to the body samples. Although some proteins were confirmed to be secreted into the aphid saliva, many genes in the list are not confirmed to be expressed in the aphid salivary glands, and their products are not confirmed to be secreted into the saliva and the plant. Is E8696 expressed in the aphid salivary glands and secreted into its host plant? Without the data confirming the expression of the gene in the salivary glands and its secretion into the saliva and into the host plant, we cannot call the protein a salivary protein. Furthermore, without the observation that E8696 has some effect on plant biology, we cannot call it an aphid effector. Therefore, I cannot agree with the parts of the manuscript that refer to E8696 as an aphid salivary effector.

      It is interesting to know that some candidate salivary gene expression showed a diurnal rhythm. However, without the knowledge of the functions of the salivary effectors, especially their targets, it is not possible to conclude that the rhythmical expression is important for the aphid performance. In addition, I wonder whether the increase in gene expression is directly correlated with the increase of protein secretion into the saliva and the plant.

      Finally, the authors examined aphid survival, fecundity, and feeding behaviour. Those are important for overall aphid performance, but they do not "shape" aphid colonization. Aphid colonisation is shaped by the mechanisms by which aphids find and select their host plant and start to feed on it. Therefore, I do not agree with the title of this manuscript and some parts of the discussion.

      I would like the authors to elaborate on how knowledge of the diurnal rhythm of aphid feeding can contribute to optimising pest management. I see that there are some differences in aphid metabolism and feeding behaviour between day and night, but I would like to hear how such knowledge can improve pest management strategies.

    1. eLife assessment

      This fundamental study investigates the transcriptional changes in neurons that underlie loss of learning and memory with age in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. The presented evidence is compelling, utilizing a cutting-edge method to isolate neurons from worms for genomics that is clearly conveyed with a rigorous experimental approach. Overall, this study supports that older daf-2 worms maintain cognitive function via mechanisms that are unique from younger wild type worms, which will be of great interest to neuroscientists and researchers studying ageing.

    2. Reviewer #1 (Public Review):

      The authors perform RNA-seq on FACS isolated neurons from adult worms at days 1 and 8 of adulthood to profile the gene expression changes that occur with cognitive decline. Supporting data are included indicating that by day 7 of adulthood, learning and memory are reduced, indicating that this timepoint or after represents cognitively aged worms. Neuronal identity genes are reduced in expression within the cognitively aged worms, whereas genes involved in proteostasis, transcription/chromatin, and the stress response are elevated. A number of specific examples are provided, representing markers of specific neuronal subtypes, and correlating expression changes to the erosion of particular functions (e.g. motor neurons, chemosensory neurons, aversive learning neurons, etc).

      To investigate whether upregulation of genes in neurons with age is compensatory or deleterious, the authors reduced expression of a set of three significantly upregulated genes and performed behavioral assays in young adults. In each case, reduction of expression improved memory, consistent with a model in which age-associated increases impair neuronal function.

      The authors then characterize learning and memory in wild type, daf-2, and daf-2;daf-16 worms with age and find that daf-2 worms retain the ability to learn for approximately 10 days longer than wild types. This was daf-16 dependent. Memory was extended in daf-2 as well, and strikingly, daf-2;daf-16 had no short-term memory even at day 1. Transcriptomic analysis of FACS-sorted neurons was performed on the three groups at day 8. The authors focus their analysis on daf-2 vs. daf-2;daf-16 and present evidence that daf-2 neurons express a stress-resistance gene program. They also find small differences between the N2 and daf-2;daf-16 neurons, which correlate with the observed behavioral differences, though these differences are modest.

      The authors tested eight candidate genes that were more highly expressed in daf-2 neurons vs. daf-2;daf-16 and showed that reduction of 2 and 5 of these genes impaired learning and memory, respectively, in daf-2 worms. This finding implicates specific neuronal transcriptional targets of IIS in maintaining cognitive ability in daf-2 with age, which, importantly, are distinct from those in young wild type worms.

      Overall, this is a strong study with rigorously performed experiments. The authors achieved their aim of identifying transcriptional changes in neurons that underlie loss of learning and memory in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants.

    3. Reviewer #2 (Public Review):

      Weng et al. perform a comprehensive study of gene expression changes in young and old animals, in wild-type and daf-2 insulin receptor mutants, in the whole animal and specifically in the nervous system. Using this data, they identify gene families that are correlated with neuronal ageing, as well as a distinct set of genes that are upregulated in neurons of aged daf-2 mutants. This is particularly interesting as daf-2 mutants show both extended lifespan and healthier neurons in aged animals, reflected by better learning/memory in older animals compared with wild-type controls. Indeed, knockdown of several of these upregulated genes resulted in poorer learning and memory. In addition, the authors showed that several genes upregulated during ageing in wild-type neurons also contribute to learning and memory; specifically, knockdown of these genes in young animals resulted in improved memory. This indicates that (at least in this small number of cases), genes that show increased transcript levels with age in the nervous system somehow suppress memory, potentially by having damaging effects on neuronal health.

      Finally, from a resource perspective, the neuronal transcriptome provided here will be very useful for C. elegans researchers as it adds to other existing datasets by providing the transcriptome of older animals (animals at day 8 of adulthood) and demonstrating the benefits of performing tissue-specific RNAseq instead of whole-animal sequencing.

      The work presented here is of high quality and the authors present convincing evidence supporting their conclusions.

    4. Reviewer #3 (Public Review):

      Summary

      In this manuscript, Weng et al. identify the neuron-specific transcriptome that impacts age-dependent cognitive decline. The authors design a pipeline to profile neurons from wild type and long-lived insulin receptor/IGF-1 mutants using timepoints when memory functions are declining. They discover signatures unique to neurons, which validates their approach. The authors find that genes related to neuronal identity are lost with age in wild type worms. For example, old neurons reduce the expression of genes linked to synaptic function and neuropeptide signaling and increase the expression of chromatin regulators, insulin peptides and glycoproteins. Depletion of selected genes that are upregulated in old neurons (utx-1, ins-19 and nmgp-1) leads to improved short-term memory function. This indicates that some genes that increase with age have detrimental effects on learning and memory. The pipeline is then used to test neuronal profiles of long-lived insulin/IGF-1 daf-2 mutants. Genes related to stress response pathways are upregulated in long-lived daf-2 mutants (e.g. dod-24, F08H9.4), and those genes are required for improved neuron function.

      Strengths

      The manuscript is well written, and the experiments are well described. The authors take great care to explain their reasoning for performing experiments in a specific way and guide the reader through the interpretation of the results, which makes this manuscript an enjoyable and interesting read. The authors discover novel regulators of learning and memory using neuron-specific transcriptomic analysis in aged animals, which underlines the importance of cell-specific deep sequencing. The timepoints of the transcriptomic profiling are elegantly chosen, as they coincide with the loss of memory and can be used to specifically reveal gene expression profiles related to neuron function. The authors use the dod-24 example to illustrate how powerful this approach is: in daf-2 mutants, whole-body dod-24 expression differs from neuron-specific profiles, which underlines the importance of precise cell-specific approaches. This dataset provides a very useful resource for the C. elegans and aging community as it complements existing datasets with additional time points and neuron-specific deep profiling.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This fundamental study investigates the transcriptional changes in neurons that underlie loss of learning and memory with age in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. The presented evidence is compelling, utilizing a cutting-edge method to isolate neurons from worms for genomics that is clearly conveyed with a rigorous experimental approach. Overall, this study supports that older daf-2 worms maintain cognitive function via mechanisms that are unique from younger wild type worms, which will be of great interest to neuroscientists and researchers studying ageing.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors perform RNA-seq on FACS isolated neurons from adult worms at days 1 and 8 of adulthood to profile the gene expression changes that occur with cognitive decline. Supporting data are included indicating that by day 7 of adulthood, learning and memory are reduced, indicating that this timepoint or after represents cognitively aged worms. Neuronal identity genes are reduced in expression within the cognitively aged worms, whereas genes involved in proteostasis, transcription/chromatin, and the stress response are elevated. A number of specific examples are provided, representing markers of specific neuronal subtypes, and correlating expression changes to the erosion of particular functions (e.g. motor neurons, chemosensory neurons, aversive learning neurons, etc).

      To investigate whether upregulation of genes in neurons with age is compensatory or deleterious, the authors reduced expression of a set of three significantly upregulated genes and performed behavioral assays in young adults. In each case, reduction of expression improved memory, consistent with a model in which age-associated increases impair neuronal function.

      The authors then characterize learning and memory in wild type, daf-2, and daf-2;daf-16 worms with age and find that daf-2 worms retain the ability to learn for approximately 10 days longer than wild types. This was daf-16 dependent. Memory was extended in daf-2 as well, and strikingly, daf-2;daf-16 had no short-term memory even at day 1. Transcriptomic analysis of FACS-sorted neurons was performed on the three groups at day 8. The authors focus their analysis on daf-2 vs. daf-2;daf-16 and present evidence that daf-2 neurons express a stress-resistance gene program. They also find small differences between the N2 and daf-2;daf-16 neurons, which correlate with the observed behavioral differences, though these differences are modest.

      The authors tested eight candidate genes that were more highly expressed in daf-2 neurons vs. daf-2;daf-16 and showed that reduction of 2 and 5 of these genes impaired learning and memory, respectively, in daf-2 worms. This finding implicates specific neuronal transcriptional targets of IIS in maintaining cognitive ability in daf-2 with age, which, importantly, are distinct from those in young wild type worms.

      Overall, this is a strong study with rigorously performed experiments. The authors achieved their aim of identifying transcriptional changes in neurons that underlie loss of learning and memory in C. elegans, and how cognition is maintained in insulin/IGF-1-like signaling mutants. 

      We thank you for the evaluation and response.

      Reviewer #2 (Public Review):

      Weng et al. perform a comprehensive study of gene expression changes in young and old animals, in wild-type and daf-2 insulin receptor mutants, in the whole animal and specifically in the nervous system. Using this data, they identify gene families that are correlated with neuronal ageing, as well as a distinct set of genes that are upregulated in neurons of aged daf-2 mutants. This is particularly interesting as daf-2 mutants show both extended lifespan and healthier neurons in aged animals, reflected by better learning/memory in older animals compared with wild-type controls. Indeed, knockdown of several of these upregulated genes resulted in poorer learning and memory. In addition, the authors showed that several genes upregulated during ageing in wild-type neurons also contribute to learning and memory; specifically, knockdown of these genes in young animals resulted in improved memory. This indicates that (at least in this small number of cases), genes that show increased transcript levels with age in the nervous system somehow suppress memory, potentially by having damaging effects on neuronal health.

      Finally, from a resource perspective, the neuronal transcriptome provided here will be very useful for C. elegans researchers as it adds to other existing datasets by providing the transcriptome of older animals (animals at day 8 of adulthood) and demonstrating the benefits of performing tissue-specific RNAseq instead of whole-animal sequencing.

      The work presented here is of high quality and the authors present convincing evidence supporting their conclusions. I only have a few comments/suggestions:

      (1) Do the genes identified to decrease learning/memory capacity in daf-2 animals (Figure 4d/e) also impact neuronal health? daf-2 mutant worms show delayed onset of age-related changes to neuron structure (Tank et al., 2011, J Neurosci). Does knockdown of the genes shown to affect learning also affect neuron structure during ageing, potentially one mechanism through which they modulate learning/memory? 

      (2) The learning and memory assay data presented in this study use the butanone olfactory learning paradigm, which is well established by the same group. Have the authors tried other learning assays when testing for learning/memory changes after knockdown of candidate genes? Depending on the expression pattern of these genes, they may have more or less of an effect on olfactory learning versus, e.g., gustatory or mechanosensory-based learning.

      (3) A comment on the 'compensatory vs dysregulatory' model as stated by the authors on page 7 - I understand that this model presents the two main options, but perhaps this is slightly too simplistic: gene expression that rises during ageing may be detrimental for memory (= dysregulatory), but at the same time may also be beneficial for other physiological roles in other tissues (= compensatory). 

      Thank you for your original suggestions; we addressed them in the previous version of response to the reviewers.

      Comments on revised version:

      I am satisfied with how the authors have addressed all my comments/suggestions. 

      Thank you for your response!

      Reviewer #3 (Public Review):

      Summary

      In this manuscript, Weng et al. identify the neuron-specific transcriptome that impacts age-dependent cognitive decline. The authors design a pipeline to profile neurons from wild type and long-lived insulin receptor/IGF-1 mutants using timepoints when memory functions are declining. They discover signatures unique to neurons, which validates their approach. The authors find that genes related to neuronal identity are lost with age in wild type worms. For example, old neurons reduce the expression of genes linked to synaptic function and neuropeptide signaling and increase the expression of chromatin regulators, insulin peptides and glycoproteins. Depletion of selected genes that are upregulated in old neurons (utx-1, ins-19 and nmgp-1) leads to improved short-term memory function. This indicates that some genes that increase with age have detrimental effects on learning and memory. The pipeline is then used to test neuronal profiles of long-lived insulin/IGF-1 daf-2 mutants. Genes related to stress response pathways are upregulated in long-lived daf-2 mutants (e.g. dod-24, F08H9.4), and those genes are required for improved neuron function.

      Strengths

      The manuscript is well written, and the experiments are well described. The authors take great care to explain their reasoning for performing experiments in a specific way and guide the reader through the interpretation of the results, which makes this manuscript an enjoyable and interesting read. The authors discover novel regulators of learning and memory using neuron-specific transcriptomic analysis in aged animals, which underlines the importance of cell-specific deep sequencing. The timepoints of the transcriptomic profiling are elegantly chosen, as they coincide with the loss of memory and can be used to specifically reveal gene expression profiles related to neuron function. The authors use the dod-24 example to illustrate how powerful this approach is: in daf-2 mutants, whole-body dod-24 expression differs from neuron-specific profiles, which underlines the importance of precise cell-specific approaches. This dataset will provide a very useful resource for the C. elegans and aging community as it complements existing datasets with additional time points and neuron-specific deep profiling.

      Weakness

      This study nicely describes the neuron-specific profiles of aged long-lived daf-2 mutants. Selected neuronal genes that were upregulated in daf-2 mutants (e.g. F08H9.4, mtl-1, dod-24, alh-2, C44B7.5) decreased learning/memory when knocked down. However, the knockdown of these genes was not specific to neurons. The authors use a neuron-sensitive RNAi strain to address this concern and acknowledge this caveat in the text. While it is likely that the selected candidates act only in neurons, it is possible that other tissues participate as well.

      Thank you for pointing this caveat out. We have mentioned it in the figure legend.

    1. eLife assessment

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The study presents valuable insights that the location of synapses on dendritic branches, as well as synaptic plasticity of excitatory and inhibitory synapses, influences the ability of a neuron to discriminate combinations of sensory stimuli. However, the evidence presented is incomplete - the major conclusions are only partially supported by the data presented, and there are identified gaps between the supporting evidence and the major conclusions.

    2. Reviewer #1 (Public Review):

      Summary:

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." The proposed mechanisms result in moderate performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Given the high level of complexity of all components of the model, it is not clear which features of which components are most important for its performance. There is also room for improvement in the narrative structure of the manuscript and the organization of concepts and data.

      Strengths:

      The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.

      Weaknesses:

      The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation. That said, the fundamental concepts behind nonlinear feature binding in neurons with compartmentalized dendrites have been explored in previous work, so it is not clear how this study represents a significant conceptual advance. Finally, the presentation of the model, the motivation and justification of each design choice, and the interpretation of each result could be restructured for clarity to be better received by a wider audience.

    3. Reviewer #2 (Public Review):

      Summary:

      The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure the stability of synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanistic insights into complex computations at the single-neuron level.

      Strengths:

      The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.

      Weaknesses:

      I am concerned that the manuscript was submitted too hastily, as evidenced by the quality and logic of the writing and the presentation of the figures. These issues may compromise the integrity of the work. I would recommend a substantial revision of the manuscript to improve the clarity of the writing, incorporate more experiments, and better define the goals of the study.

      Major Points:

      (1) Quality of Scientific Writing: The current draft does not meet the expected standards. Key issues include:

      i. Mathematical and Implementation Details: The manuscript lacks comprehensive mathematical descriptions and implementation details for the plasticity models (LTP/LTD/Meta) and the SPN model. Given the complexity of the biophysically detailed multicompartment model and the associated learning rules, the inclusion of only nine abstract equations (Eq. 1-9) in the Methods section is insufficient. I was surprised to find no supplementary material providing these crucial details. What parameters were used for the SPN model? What are the mathematical specifics for the extra-synaptic NMDA receptors utilized in this study? For instance, Eq. 3 references [Ca2+]: does this refer to calcium ions influenced by extra-synaptic NMDARs, or does it apply to other standard NMDARs? I also suggest the authors provide pseudocode for the entire learning process to further clarify the learning rules.

      ii. Figure quality. The authors do not appear to have typeset the images carefully, resulting in overcrowding and varying font sizes in the figures. Some of the fonts are too small and hard to read. The text in many of the diagrams is confusing. For example, in Panel A of Figure 3, two flattened images are combined, leading to small, distorted font sizes. In Panels C and D of Figure 7, the inconsistent use of terminology such as "kernels" further reduces the clarity of the presentation. I recommend that the authors thoroughly review all figures and accompanying text to ensure they meet the expected standards of clarity and quality.

      iii. Writing clarity. The manuscript often includes excessive and irrelevant details, particularly in the mathematical discussions. On page 24, within the "Metaplasticity" section, the authors introduce the biological background to support the proposed metaplasticity equation (Eq. 5). However, much of this biological detail is hypothesized rather than experimentally verified. For instance, the claim that "a pause in dopamine triggers a shift towards higher calcium concentrations while a peak in dopamine pushes the LTP kernel in the opposite direction" lacks cited experimental evidence. If evidence exists, it should be clearly referenced; otherwise, these assertions should be presented as theoretical hypotheses. Generally, Eq. 5 and related discussions should be described more concisely, with only a loose connection to dopamine effects until more experimental findings are available.

      (2) Goals of the Study: The authors need to clearly define the primary objective of their research. Is it to showcase the computational advantages of the local learning rule, or to elucidate biological functions?

      i. Computational Advantage: If the intent is to demonstrate computational advantages, the current experimental results appear inadequate. The learning rule introduced in this work can only solve for four features, whereas previous research (e.g., Bicknell and Hausser, 2021) has shown capability with over 100 features. It is crucial for the authors to extend their demonstrations to prove that their learning rule can handle more than just three features. Furthermore, the requirement to fine-tune the midpoint of the synapse function indicates that the rule modifies the "activation function" of the synapses, as opposed to merely adjusting synaptic weights. In machine learning, modifying weights directly is typically more efficient than altering activation functions during learning tasks. This might account for why the current learning rule is restricted to a limited number of tasks. The authors should critically evaluate whether the proposed local learning rule, including meta-plasticity, actually offers any computational advantage. This evaluation is essential to understand the practical implications and effectiveness of the proposed learning rule.

      ii. Biological Significance: If the goal is to interpret biological functions, the authors should dig deeper into the model behaviors to uncover their biological significance. This exploration should aim to link the observed computational features of the model more directly with biological mechanisms and outcomes.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." The proposed mechanisms result in moderate performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Given the high level of complexity of all components of the model, it is not clear which features of which components are most important for its performance. There is also room for improvement in the narrative structure of the manuscript and the organization of concepts and data.

      To begin with, we will better explain the goal of the study in the introduction and explain that it relies on earlier theoretical work. The goal of the study was to investigate whether and how detailed neuron models with biologically-based morphologies, membrane properties, ion channels, dendritic nonlinearities, and biologically plausible learning rules can quantitatively account for the theoretical results obtained with more abstract models.

      We will further evaluate and clarify the roles of several components in our model regarding their impact on the results. These include a) the role of sufficiently robust and supralinear plateau potentials in computing the NFBP; and b) the importance of metaplasticity for individual synapses, allowing them to start or stop responding to relevant or irrelevant stimuli, respectively, over the training period.

      Strengths:

      The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.

      Weaknesses:

      The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation. That said, the fundamental concepts behind nonlinear feature binding in neurons with compartmentalized dendrites have been explored in previous work, so it is not clear how this study represents a significant conceptual advance. Finally, the presentation of the model, the motivation and justification of each design choice, and the interpretation of each result could be restructured for clarity to be better received by a wider audience.

      To achieve the goal of the study as described above, we chose to use a biophysically and morphologically detailed neuron model to see if it could quantitatively account for the theoretically-based nonlinear computations, for instance, those discussed in Tran-Van-Minh, A. et al. (2015).

      We will explain the role of each component of the learning rule, as well as the dendritic nonlinearities, for the performance on the NFBP.

      Reviewer #2 (Public Review):

      Summary:

      The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure stability for synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanical insights into complex computations at the single-neuron level.

      Strengths:

      The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.

      Indeed, the learning rule is local and reward-based, and we will highlight better in the paper that it is “always on”, i.e. there are no separate training and testing phases.

      Weaknesses:

      I am concerned that the manuscript was submitted too hastily, as evidenced by the quality and logic of the writing and the presentation of the figures. These issues may compromise the integrity of the work. I would recommend a substantial revision of the manuscript to improve the clarity of the writing, incorporate more experiments, and better define the goals of the study.

      We will revise the manuscript thoroughly to better present the figures and writing (more detailed below). We will also show supplementary figures showcasing the role of the different components of the learning rule.

      Major Points:

      (1) Quality of Scientific Writing: The current draft does not meet the expected standards. Key issues include:

      i. Mathematical and Implementation Details: The manuscript lacks comprehensive mathematical descriptions and implementation details for the plasticity models (LTP/LTD/Meta) and the SPN model. Given the complexity of the biophysically detailed multicompartment model and the associated learning rules, the inclusion of only nine abstract equations (Eq. 1-9) in the Methods section is insufficient. I was surprised to find no supplementary material providing these crucial details. What parameters were used for the SPN model? What are the mathematical specifics for the extra-synaptic NMDA receptors utilized in this study? For instance, Eq. 3 references [Ca2+]: does this refer to calcium ions influenced by extra-synaptic NMDARs, or does it apply to other standard NMDARs? I also suggest the authors provide pseudocode for the entire learning process to further clarify the learning rules.

      The detailed setup of the model is described in the referenced papers, including equations and parameter values. The model is downloadable on github. For this reason we did not repeat the information here. That said, we will go through the manuscript and clarify all details, and provide supplemental figures and a GitHub link where necessary for reproducing the results.

      ii. Figure quality. The authors seem not to carefully typeset the images, resulting in overcrowding and varying font sizes in the figures. Some of the fonts are too small and hard to read. The text in many of the diagrams is confusing. For example, in Panel A of Figure 3, two flattened images are combined, leading to small, distorted font sizes. In Panels C and D of Figure 7, the inconsistent use of terminology such as "kernels" further complicates the clarity of the presentation. I recommend that the authors thoroughly review all figures and accompanying text to ensure they meet the expected standards of clarity and quality.

      We will revise the figures for consistency and clarity.

      iii. Writing clarity. The manuscript often includes excessive and irrelevant details, particularly in the mathematical discussions. On page 24, within the "Metaplasticity" section, the authors introduce the biological background to support the proposed metaplasticity equation (Eq. 5). However, much of this biological detail is hypothesized rather than experimentally verified. For instance, the claim that "a pause in dopamine triggers a shift towards higher calcium concentrations while a peak in dopamine pushes the LTP kernel in the opposite direction" lacks cited experimental evidence. If evidence exists, it should be clearly referenced; otherwise, these assertions should be presented as theoretical hypotheses. Generally, Eq. 5 and related discussions should be described more concisely, with only a loose connection to dopamine effects until more experimental findings are available.

      The reviewer is correct; the cited text does not present experimental facts but rather illustrates how the learning rule operates. We will revise the section on the construction of learning rules to clarify which aspects are explicit assumptions and which are experimentally verified. In particular, we will provide a more detailed description and motivation for metaplasticity.

      (2) Goals of the Study: The authors need to clearly define the primary objective of their research. Is it to showcase the computational advantages of the local learning rule, or to elucidate biological functions?

      Briefly, the goal of the study was to investigate whether earlier theoretical results with more abstract models can be quantitatively recapitulated in morphologically and biophysically detailed neuron models with dendritic nonlinearities and with biologically based learning rules (see also our response to the Summary and Weaknesses of Reviewer #1). We will update the introduction with this information.

      i. Computational Advantage: If the intent is to demonstrate computational advantages, the current experimental results appear inadequate. The learning rule introduced in this work can only solve for four features, whereas previous research (e.g., Bicknell and Hausser, 2021) has shown capability with over 100 features. It is crucial for the authors to extend their demonstrations to prove that their learning rule can handle more than just three features. Furthermore, the requirement to fine-tune the midpoint of the synapse function indicates that the rule modifies the "activation function" of the synapses, as opposed to merely adjusting synaptic weights. In machine learning, modifying weights directly is typically more efficient than altering activation functions during learning tasks. This might account for why the current learning rule is restricted to a limited number of tasks. The authors should critically evaluate whether the proposed local learning rule, including meta-plasticity, actually offers any computational advantage. This evaluation is essential to understand the practical implications and effectiveness of the proposed learning rule.

      As mentioned above, our intent is not to demonstrate the computational advantages of the proposed learning rule but to investigate and illustrate how biophysically detailed neuron models that also display dendritic plateau potential mechanisms, together with biologically-based learning rules, can support the theoretically predicted computational requirements for complex neuronal processing (e.g., Tran-Van-Minh, A. et al., 2015), as well as the results obtained with more abstract neuron models and plateau potential mechanisms (e.g., Schiess et al., 2016; Legenstein and Maass, 2011).

      In the revised manuscript, we will also discuss the differences between the supervised learning rule in Bicknell and Hausser (2021) and our local and reward-based learning rule. We will also show a critical evaluation of how our local learning rule and metaplasticity affect the synaptic weights and why the different components of the rule are needed.

      ii. Biological Significance: If the goal is to interpret biological functions, the authors should dig deeper into the model behaviors to uncover their biological significance. This exploration should aim to link the observed computational features of the model more directly with biological mechanisms and outcomes.

      We will make an attempt to better link the learning rule and dendritic supra-linearities and interpret their biological function.

    1. eLife assessment

      The authors study how cells with lower levels of the conserved steroid hormone signaling component Taiman (tai) are out-competed by neighboring wild-type cells with higher fitness in Drosophila imaginal discs. The findings are useful since they uncover an unexpected link between tai and Wingless signaling in cell competition. The evidence however is incomplete, since the tai loss-of-clone phenotype is based on one allele and the mechanism involved in cell competition through Dlp and Wg lacks adequate supporting data.

    2. Reviewer #1 (Public Review):

      Summary:

      Schweibenz et al are investigating how cells with lower levels of Tai are out-competed by neighboring wild-type (WT) cells. They show that clones homozygous for a tai hypomorphic mutation are disadvantaged and are killed by apoptosis. But tai-low clones are partially rescued when generated in a background that is heterozygous for mutations in apoptotic genes, in the Hippo pathway component warts, or for the Wg/Wnt pathway negative regulator Apc. They then follow up on the link between tai LOF and Wg. The story then shifts away from clones and into experiments that have Tai RNAi depletion or Tai over-expression in the posterior compartment of the wing disc, using the anterior compartment as a control. These non-clonal experiments show that depletion of Tai in the posterior compartment of wing discs results in less Wg in this compartment. This is shown to be due to a reduction in the glypican Dally-like protein (Dlp). The fact that long-range Wg is reduced in tai-depleted discs that also show a reduction in Dlp, suggests that Tai somehow positively promotes Wg distribution. There is some data in the supplementary materials suggesting that Tai promotes dlp mRNA expression but this was not compelling. In fact, the compelling data was that Dlp protein in tai mutant clones is not abundantly on the cell surface, but instead somehow retained in the mutant cell. The authors don't further examine Dlp protein in tai clones. The final figure (Figure 8) shows that there is less Wg at the DV margin in wing discs when tai is depleted from wg-producing cells. In sum, the authors have uncovered some interesting results, but the story has some unresolved issues that, if addressed, could boost its impact. Additionally, the preprint seems to have 2 stories, one about tai and cell competition and the other about tai and Wg distribution. It would be helpful to reorder the figures and improve the narrative so that these are better integrated with each other.

      Strengths:

      The authors are studying competition between tai-low clones and their fitter WT neighbors, and have uncovered an interesting connection to Wg.

      Weaknesses:

      (1) It would be good to know whether the authors can rescue tai-low clones by over-expression UAS-Dlp.

      (2) The data about tai-promoting dlp (Figure S4) is not compelling as there are no biological replicates and no statistical analyses.

      (3) The data on Wg distribution seems disjointed from the data about cell competition. The authors could refocus the paper to emphasize the cell competition story. The role of Dlp in Wg distribution is well established, so the authors could remove or condense these results. The story really could be Figs 1, 2, 3 and 7 and keep the paper focused on cell competition. The authors could then discuss Dlp as needed for Wg signaling transduction, which is already established in the literature.

      (4) The model of tai controlling dlp mRNA and Dlp protein distribution is confusing. In fact, the data for the former is weak, while the data for the latter is strong. I suggest that the authors focus on the altered Dlp protein distribution on tai-low clones. It would also be helpful to prove the Wg signaling is impeded in tai clones (see #5 below).

      (5) I don't know if the Fz3-RFP reported for Wg signaling works in imaginal discs, but if it does then the authors could make clones in this background to prove that cell-autonomous Wg signaling is reduced in tai-low clones.

    3. Reviewer #2 (Public Review):

      The authors investigate the properties of the transcriptional co-activator Taiman in regulating tissue growth. In previously published work they had shown that cells that overexpress Taiman in the pupal wing can cause thoracic cells adjacent to the wing tip to die and thus allow the wing to invade the thorax. This was mediated by the secretion of Spz ligands. Here, they investigate the properties of cells that are homozygous for a hypomorphic allele of taiman (tai). They show that homozygous mutant clones are much smaller than their wild-type twin spots and that cells in the clones are dying by apoptosis which is inferred from elevated levels of anti-Dcp1 staining (Figure 1).

      By generating clones during eye development, the authors screen for dominant modifiers that increase the representation of homozygous tai tissue in the adult eye (Figure 2). They find that reducing the levels of hid, the entire rpr/hid/grim locus and Apc (and/or Apc2) each increase the representation of tai clones. They then show that the survival of tissue to the adult stage correlates with the size of clones in the third-instar larval wing disc (Figure 3). The rest of the study derives from the modification of the phenotype by Apc and investigates the interaction between Wnt signaling and tai clone survival.

      The authors then investigate interactions between tai and the wingless (wg) pathway. First, they show that increasing tai expression increases the expression of a wg reporter (nkd-lacZ) while reducing tai levels decreases its expression (Figure 4) indicating that wg signaling is likely reduced when tai levels are decreased. This finding is strengthened by examining wg-lacZ expression since the expression of this reporter is normally restricted to the D/V boundary in the wing disc by feedback inhibition via Wg signaling. Expression of the reporter is increased when tai expression is reduced and decreased when tai expression is increased (Figure 5).

      The authors then look at Wg protein away from the DV boundary. They find increased levels when tai expression is increased and decreased levels when tai is decreased. They conclude that tai activity increased Wg protein in cells (Figure 6). They suggest that this could be the result of the regulation of expression of Dally-like protein (Dlp). Consistent with this idea, increasing tai expression increases Dlp levels, and decreasing tai decreases Dlp levels (Figure 7). They then show that increasing Dlp levels when tai is reduced increases Wg levels which presumably means that Dlp is epistatic to tai. Puzzlingly, increasing both tai and Dlp decreases Wg.

      The authors then examine the effect of reducing Dlp in the cells that secrete Wg. They find that increasing tai results in the diffusion of Wg further from its source while reducing tai reduces its spread (Figure 8). They then show that in clones with reduced tai, there is increased cytoplasmic Dlp (Figure 9). They therefore propose that tai clones fail to survive because they do not secrete enough Dlp which results in reduced capture of the Wg for those cells and hence decreased Wg signaling.

      Evaluation

      While the authors present good evidence in support of most of their conclusions, there are alternative explanations in many cases that have not been excluded.

      From the results in Figure 1 (and Figure 3), the authors conclude that "The data indicate the existence of an extracellular competition mechanism that allows normal tai[wt] cells to kill tai[k15101] neighbors" (line 127). However, the experiments have been done with a single allele, and these experiments do not exclude the possibility that there is another mutation on the same chromosome arm that is responsible for the observed phenotype. Since the authors have a UAS-tai stock, they could strengthen their results using a MARCM experiment where they could test whether the expression of UAS-tai rescues the elimination of tai mutant clones. Alternatively, they could use a second (independent) allele to demonstrate that the phenotype can be attributed to a reduction in tai activity.

      By screening for dominant modifiers of a phenotype one would not expect to identify all interacting genes - only those that are haploinsufficient in this situation. The authors have screened a total of 21 chromosomes for modification and have not really explained which alleles are nulls and which are hypomorphs. The nature of each of the alleles screened needs to be explained better. Also, the absence of a dominant modification does not necessarily exclude a function of that gene or pathway in the process. This is especially relevant for the Spz/Toll pathway which the authors have previously implicated in the ability of tai-overexpressing cells to kill wild-type cells. The most important discovery from this screen is the modification by the Apc alleles. This part of the paper would be strengthened by testing for modification by other components of the Wingless pathway. The authors show modification by Apc[MI01007] and the double mutant Apc[Q8] Apc2[N175A]. Without showing the Apc[Q8] and Apc2[N175A] alleles separately, it is hard to know if the effect of the double mutant is due to Apc, Apc2, or the combination.

      RNAi of tai seems to block the formation of the Wg gradient. If so, one might expect a reduction in wing size. Indeed, this could explain why the wings of tai/Df flies are smaller. The authors mention briefly that the posterior compartment size is reduced when tai-RNAi is expressed in that compartment. However, this observation merits more emphasis since it could explain why tai/Df flies are smaller (Are their wings smaller?).

      In Figure 7, the authors show the effect of manipulating Tai levels alone or in combination with increasing Dlp levels. However, they do not include images of Wg protein distribution upon increasing Dlp levels alone.

      In Figure 8, there is more Wg protein both at the DV boundary and spreading when tai is overexpressed in the source cells using bbg-Gal4. However, in an earlier experiment (Figure 5C) they show that the wg-lacZ reporter is downregulated at the DV boundary when tai is overexpressed using en-Gal4. They therefore conclude that wg is not transcriptionally upregulated but is instead secreted at higher levels when tai is expressed in the source cells. Wg protein is reduced in the DV stripe when tai is overexpressed using the en-Gal4 driver (Figure 6B') and is increased at the same location when tai is overexpressed with the bbg-Gal4 driver (Figure 8). I don't know how to reconcile these observations.

      In Figure 9, the tai-low clones have elevated levels of Dlp. How can this be reconciled with the tai-RNAi knockdown shown in Figure 7C' where reducing tai levels causes a strong reduction in Dlp levels?

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, Schweibenz et al., identify the transcriptional coactivator, Taiman (Tai), as a factor that determines the fitness level of epithelial cells by regulating Wingless (Wg), which is an important determinant of cellular fitness. Taiman determines cellular fitness level by regulating levels of cell-surface glypican Dally-like protein (Dlp), which regulates extracellular Wingless (Wg) distribution. Thus, by affecting levels of Wg via glypican regulation, Tai participates in determining cellular fitness, and cells with low Tai levels are eliminated as they are deprived of adequate Wg levels.

      Strengths:

      (1) The authors make a strong case for the effect of tai on Dlp and Wg levels in experiments where a relatively large group of cells have reduced tai levels.

      (2) The claim that tai-low clones are competitively eliminated is supported by experiments that show cell death in them, and their elimination at different time points.

      (3) The manuscript is well written.

      Weaknesses:

      (1) The study has relatively weak evidence for the mechanism of cell competition mediated by Dlp and Wg.

      (2) More evidence is required to support the claim that dlp transcription or endocytosis is affected in tai clones.

      Other comments:

      (1) The authors put the study in the context of cell competition, and the first figure indeed is convincing in this regard. However, most of the rest of the study is not in the clonal context, and mainly relies on RNAi KD of tai in the posterior compartment, which is a relatively large group of cells. I understand why the authors chose a different approach to investigate the role of tai in cell competition. However because ubiquitous loss of tai results in smaller organs, it is important to determine to what extent reducing levels of tai in the entire posterior compartment compares with clonal elimination i.e. cell competition. This is important in order to determine to what extent the paradigm of Tai-mediated regulation of Dlp levels and by extension, Wg availability, can be extended as a general mechanism underlying competitive elimination of tai-low clones. If the authors want to make a case for mechanisms involved in the competitive elimination of tai clones, then they need to show that the KD of tai in the posterior compartment shows hallmarks of cell competition. Is there cell death along the A/P boundary? Or is the compartment smaller because those cells are growing slower? Are the levels of Myc/DIAP1, proteins required for fitness, affected in en>tai RNAi cells?

      (2) The authors do not have direct/strong evidence of changes in dlp mRNA levels or intracellular trafficking. To back these claims, the authors should look for dlp mRNA levels and provide more evidence for Dlp endocytosis like an antibody uptake assay or at the very least, a higher resolution image analysis showing a change in the number of intracellular Dlp positive punctae. Also, do the authors think that loss of tai increases Dlp endocytosis, making it less available on the cell surface for maintaining adequate extracellular Wg levels?

      (3) The data shown in the last figure is at odds with the model (I think) the authors are trying to establish: When cells have lower Tai levels, this reduces Dlp levels (S2) presumably either by reducing dlp transcription and/or increasing (?) Dlp endocytosis. This in turn reduces Wg (availability) in cells away from source cells (Figure 6). The reduced Wg availability makes them less fit, targeting them for competitive elimination. But in tai clones, I do not see any change in cell-surface Dlp (9B) (I would have expected them to be down based on the proposed model). The authors also see more total Dlp (9A) (which is at odds with S2 assuming data in S2 were done under permeabilizing conditions.).

      As a side note, because Dlp is GPI-anchored, the authors should consider the possibility that the 'total' Dlp staining observed in 9A may not actually be total Dlp (and is possibly mostly intracellular Dlp, since permeabilizing membranes with detergent will cause some (most?) Dlp molecules to be lost), and how this might be affecting the interpretation of the data. I think one way to address this would be to process the permeabilized and non-permeabilized samples simultaneously and then image them at the same settings and compare what membrane staining in these two conditions looks like. If membrane staining in the permeabilized condition is decreased compared to non-permeabilized conditions, and the signal intensity of Dlp in permeabilized conditions remains high, then the authors will have evidence to support increased endocytosis in tai clones. Of course, these data will still need to be reconciled with what is shown in S2.

    5. Author response:

      eLife assessment

      “…The evidence however is incomplete, since the tai loss-of-clone phenotype is based on one allele and the mechanism involved in cell competition through Dlp and Wg lacks adequate supporting data.”

      We agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by Crispr.

      We also agree that additional functional data would help demonstrate that differences in Dlp levels are required for the mechanism of Tai cell competition. Experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      Reviewer #1:

      Overall Statements:

      “There is some data in the supplementary materials suggesting that Tai promotes dlp mRNA expression, but this was not compelling.”

      We are currently testing effects of Tai on dlp and dally transcription using qPCR and reporter transgenes. As noted below, the effects of Tai on Dlp trafficking are ‘strong’, so resolving effects on Dlp transcription will complement this localization data.

      “The authors don't further examine Dlp protein in tai clones.”

      As noted by the Reviewer, we do examine Dlp levels and localization in tai-low clones (see Figure 9), but these experiments are challenging due to their very small size and the hypomorphic nature of the tai allele (tai[k15101]) that was used. Experiments are in progress to examine the effect of our Crispr null allele of tai on Dlp levels and localization in wing clones.

      “In sum, the authors have uncovered some interesting results, but the story has some unresolved issues that, if addressed, could boost its impact. Additionally, the preprint seems to have 2 stories, one about tai and cell competition and the other about tai and Wg distribution. It would be helpful to reorder the figures and improve the narrative so that these are better integrated with each other.”

      We agree. The results of our modifier screen required that we first understand how Tai regulates the Wg pathway before we could apply this to understanding the competitive mechanism. Thus, the paper is composed of three sections: 1. the screen, 2. the Tai-Dlp-Wg connection in the absence of competition, and 3. the contribution of Dlp-Wg to the tai[low] ‘loser’ phenotype. These sections use different techniques (e.g., clonal mosaics with genomic alleles, Gal4/UAS and RNAi to define the effect of Tai loss on Wg and Dlp). Ongoing experiments return to clonal mosaics to test whether elevating Dlp can rescue tai lof clones in the same manner as Apc/Apc2 alleles (see Figs. 2-3), which elevate Wg pathway activity.

      Specifics:

      “It would be good to know whether the authors can rescue tai-low clones by over-expression UAS-Dlp.”

      As noted above, experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      “The data on Wg distribution seems disjointed from the data about cell competition. The authors could refocus the paper to emphasize the cell competition story. The role of Dlp in Wg distribution is well established, so the authors could remove or condense these results. The story really could be Figs 1, 2, 3 and 7 and keep the paper focused on cell competition. The authors could then discuss Dlp as needed for Wg signaling transduction, which is already established in the literature.”

      We appreciate the suggestion to reorganize the figures to focus the first part of the story on competition, and then follow with the role of Tai in controlling Dlp. We will consider this approach pending the results of ongoing experiments.  

      “The model of tai controlling dlp mRNA and Dlp protein distribution is confusing. In fact, the data for the former is weak, while the data for the latter is strong. I suggest that the authors focus on the altered Dlp protein distribution on tai-low clones. It would also be helpful to prove the Wg signaling is impeded in tai clones (see #5 below).”

      We agree but are currently testing how dlp reporters and mRNA respond to Tai in order to rigorously test a Dlp transcriptional mechanism. To complement the ‘strong’ evidence that Tai regulates Dlp distribution, we are testing Dlp in clones of our Tai Crispr null. Since submission, we have also assessed the effect of blocking the endocytic factor shibire/dynamin on Dlp distribution in Tai deficient cells to complement the data on Pentagone that is already in the paper (see Fig. S3).

      “I don't know if the Fz3-RFP reported for Wg signaling works in imaginal discs, but if it does then the authors could make clones in this background to prove that cell-autonomous Wg signaling is reduced in tai-low clones.”

      We thank the reviewer for this suggestion, which we are now testing.

      Reviewer #2

      Overall Comments:

      “While the authors present good evidence in support of most of their conclusions, there are alternative explanations in many cases that have not been excluded.”

      We appreciate this point and are conducting experiments for a revised submission that will help test alternative mechanisms and clarify our conclusions.

      Specifics:

      “However, the experiments have been done with a single allele, and these experiments do not exclude the possibility that there is another mutation on the same chromosome arm that is responsible for the observed phenotype. Since the authors have a UAS-tai stock, they could strengthen their results using a MARCM experiment where they could test whether the expression of UAS-tai rescues the elimination of tai mutant clones. Alternatively, they could use a second (independent) allele to demonstrate that the phenotype can be attributed to a reduction in tai activity.”

      As noted above, we agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by Crispr.

      The tai[k15101] allele acts as a tai hypomorph and has been shown to produce weaker phenotypes than the 61G1 strong lof in a number of papers (Bai et al., 2000; König et al., 2011; Luo et al., 2019; Zhang et al., 2015). We agree that rescue of tai[k15101] with a UAS-Tai transgene would help rule out effects of second site mutations. We are currently pursuing the reviewer’s second suggestion of phenocopy with a different allele, our new tai Crispr lof.

      “The authors have screened a total of 21 chromosomes for modification and have not really explained which alleles are nulls and which are hypomorphs. The nature of each of the alleles screened needs to be explained better.”

      We will update the text to better reflect what type of alleles were chosen. In most cases we preferred amorphs or null alleles over hypomorphs, however when the amorph option was not available, we used hypomorphs.

      “Also, the absence of a dominant modification does not necessarily exclude a function of that gene or pathway in the process. This is especially relevant for the Spz/Toll pathway which the authors have previously implicated in the ability of tai-overexpressing cells to kill wild-type cells.”

      We thank the reviewer for this completely accurate point. The dominant screen does not rule out effects of other pathways such as Spz/Toll. Indeed, we were surprised by the lack of dominant effects by Spz/Toll alleles on tai[low] competition given our prior work. The reciprocally clear dominant effect of Apc/Apc2 led us to consider that Wg signaling plays a role in this phenomenon, which then became the starting point of this study.

      “The most important discovery from this screen is the modification by the Apc alleles. This part of the paper would be strengthened by testing for modification by other components of the Wingless pathway. The authors show modification by Apc[MI01007] and the double mutant Apc[Q8] Apc2[N175A]. Without showing the Apc[Q8] and Apc2[N175A] alleles separately, it is hard to know if the effect of the double mutant is due to Apc, Apc2, or the combination.”

      We agree that testing for modification with other components of the Wg pathway would be helpful to strengthen the connection between Tai low clonal elimination and Wg pathway biology. We also agree that separating Apc [Q8] and Apc2 [N175A] would be a good idea to check if both Apc proteins are equally important for rescuing Tai low cell death, and future experiments for the lab could investigate this distinction.

      “RNAi of tai seems to block the formation of the Wg gradient. If so, one might expect a reduction in wing size. Indeed, this could explain why the wings of tai/Df flies are smaller. The authors mention briefly that the posterior compartment size is reduced when tai-RNAi is expressed in that compartment. However, this observation merits more emphasis since it could explain why tai/Df flies are smaller (Are their wings smaller?).”

      We agree that this is an exciting possibility. Growth effects of Tai linked to interactions with Yorkie and EcR could be due to a distinct role in promoting Wg activity. Alternatively, Tai may cooperate with Yorkie or EcR to control the Wg pathway. These are exciting possibilities that we are pursuing in future work.

      With regard to the “small size” effect of reducing Tai, we have previously shown that RNAi of Tai using engrailed-Gal4 causes the posterior compartment to shrink (Zhang et al. 2015, Figure 1C-F, H). In that paper, we also showed that tai[k15101]/Df animals are proportionally smaller than wildtype animals and quantified this by measuring 2D wing size (Zhang et al. 2015, Figure 1A and 1B).

      “In Figure 7, the authors show the effect of manipulating Tai levels alone or in combination with increasing Dlp levels. However, they do not include images of Wg protein distribution upon increasing Dlp levels alone.”

      We thank the reviewer for this reminder and have already generated these control images to include in a revised submission.

      “In Figure 8, there is more Wg protein both at the DV boundary and spreading when tai is overexpressed in the source cells using bbg-Gal4. However, in an earlier experiment (Figure 5C) they show that the wg-lacZ reporter is downregulated at the DV boundary when tai is overexpressed using en-Gal4. They therefore conclude that wg is not transcriptionally upregulated but is, instead, secreted at higher levels when tai is expressed in the source cells. Wg protein is reduced in the DV stripe when tai is overexpressed using the en-Gal4 driver (Figure 6B') and is increased at the same location when tai is overexpressed with the bbg-Gal4 driver (Figure 8). I don't know how to reconcile these observations.”

      We thank the reviewer for pressing us to develop an overall model explaining our results and how we envision Tai regulating Dlp and Wg. We are preparing a graphical abstract that illustrates this model and will include it in our revision.

      Briefly, we favor a model in which Tai controls the rate of Wg spread via Dlp, without a significant effect on wg transcription. For example, the induction of Dlp across the ‘engrailed’ domain of en>Tai discs (Fig 7B-B”) allows Wg to spread rapidly across the flanks and moderately depletes it from the DV margin (Fig 6B-B”), as noted by the reviewer. Adding a UAS-Dlp transgene in the en>Tai background dramatically accelerates Wg spread and causes it to be depleted from the DV margin and build up at the far end of the gradient adjacent to the dorsal and ventral hinge. Significantly, blocking endocytosis of Wg in en>Tai discs with a dominant negative shibire transgene also causes Wg to build up in the same location (new data to be added in a revision), consistent with enhanced spreading. The difference in the bbg-Gal4 experiment is that Tai is only overexpressed in DV margin cells, which constrains and concentrates Wg within this restricted domain; we are in the process of testing whether this effect on Wg is blocked by RNAi of Dlp in bbg>Tai discs.

      “In Figure 9, the tai-low clones have elevated levels of Dlp. How can this be reconciled with the tai-RNAi knockdown shown in Figure 7C' where reducing tai levels causes a strong reduction in Dlp levels?”

      We apologize for not explaining these data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and is thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces overall Dlp levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or by two roles, one in trafficking and one in Dlp expression. As noted, we are investigating both possibilities using dlp reporter lines and our new tai null CRISPR allele.

      Reviewer #3:

      Overall Weaknesses:

      “The study has relatively weak evidence for the mechanism of cell competition mediated by Dlp and Wg.”

      The screen and middle section of the paper provide genetic evidence that elevating Wg pathway activity rescues Tai[low] loser cells and that Tai controls levels/localization of Dlp and distribution of Wg in the developing wing disc. Our current work is focused on linking these two findings together in Tai “loser” clones.

      “More evidence is required to support the claim that dlp transcription or endocytosis is affected in tai clones.”

      As noted above, we are testing whether normalizing Dlp levels across clonal boundaries rescues tai[low] loser clones and assessing effects of Tai on dlp transcription and Dlp trafficking.

      Specifics:

      “Most of the rest of the study is not in the clonal context, and mainly relies on RNAi KD of tai in the posterior compartment, which is a relatively large group of cells. I understand why the authors chose a different approach to investigate the role of tai in cell competition. However because ubiquitous loss of tai results in smaller organs, it is important to determine to what extent reducing levels of tai in the entire posterior compartment compares with clonal elimination i.e. cell competition. This is important in order to determine to what extent the paradigm of Tai-mediated regulation of Dlp levels and by extension, Wg availability, can be extended as a general mechanism underlying competitive elimination of tai-low clones. If the authors want to make a case for mechanisms involved in the competitive elimination of tai clones, then they need to show that the KD of tai in the posterior compartment shows hallmarks of cell competition. Is there cell death along the A/P boundary? Or is the compartment smaller because those cells are growing slower?”

      Based on data that cell competition does not occur over compartment boundaries (e.g., see review by L.A. Johnston, Science, 2009), we chose not to use UAS-Gal4 to assess competition, but rather to investigate underlying biology occurring between Tai, Wg, and Dlp.

      “Are the levels of Myc/DIAP1, proteins required for fitness, affected in en>tai RNAi cells?”

      This is, of course, an interesting question given that Myc is a well-studied competition factor and is proposed to be downstream of the Tai-interacting protein Yki. We are not currently focused on Myc, but plan to test its role in the Tai-Dlp-Wg pathway in future work.

      “The authors do not have direct/strong evidence of changes in dlp mRNA levels or intracellular trafficking. To back these claims, the authors should look for dlp mRNA levels and provide more evidence for Dlp endocytosis like an antibody uptake assay or at the very least, a higher resolution image analysis showing a change in the number of intracellular Dlp positive punctae. Also, do the authors think that loss of tai increases Dlp endocytosis, making it less available on the cell surface for maintaining adequate extracellular Wg levels?”

      As noted above, we have added experiments using a dominant-negative shibire/dynamin allele to test whether Tai controls Dlp endocytosis. These data will be added to a revised manuscript. We have also gathered reagents to test effects of Tai gain/loss on Dlp secretion.

      “The data shown in the last figure is at odds with the model (I think) the authors are trying to establish: When cells have lower Tai levels, this reduces Dlp levels (S2) presumably either by reducing dlp transcription and/or increasing (?) Dlp endocytosis. This in turn reduces Wg (availability) in cells away from source cells (Figure 6). The reduced Wg availability makes them less fit, targeting them for competitive elimination. But in tai clones, I do not see any change in cell-surface Dlp (9B) (I would have expected them to be down based on the proposed model). The authors also see more total Dlp (9A) (which is at odds with S2 assuming data in S2 were done under permeabilizing conditions.).”

      As noted above (under Rev #2 comments), we apologize for not explaining these data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and is thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces overall Dlp levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or by two roles, one in trafficking and one in Dlp expression. We are investigating both possibilities using dlp reporter lines and our new tai null CRISPR allele.

      “As a side note, because Dlp is GPI-anchored, the authors should consider the possibility that the 'total' Dlp staining observed in 9A may not be actually total Dlp (and possibly mostly intracellular Dlp), since permeabilizing membranes with detergent will cause some (most?) Dlp molecules to be lost, and how this might be affecting the interpretation of the data. I think one way to address this would be to process the permeabilized and non-permeabilized samples simultaneously and then image them at the same settings and compare what membrane staining in these two conditions looks like. If membrane staining in the permeabilized condition is decreased compared to non-permeabilized conditions, and the signal intensity of Dlp in permeabilized conditions remains high, then the authors will have evidence to support increased endocytosis in tai clones. Of course, these data will still need to be reconciled with what is shown in S2.”

      We thank the reviewer for this excellent suggestion and are generating mosaic discs to test the proposed approach of synchronous analysis of total vs. intracellular Dlp.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      (1) A problem with in vitro work is that homogeneous cell lines/cultures are, by nature, absent from the rest of the microenvironment. The authors need to discuss this. 

      We have added two sentences to the second paragraph of the Discussion section in which we now acknowledge this concern, but also point out that in vitro models of this sort also provide an experimental advantage in that they facilitate a deconvolution of the extensive complexity resident within the intact animal. Nevertheless, we acknowledge that this deconvolution requires ultimate validation of findings obtained within an in vitro model system to ensure they accurately recapitulate functions that occur in the intact animal in vivo.

      (2) What are n's/replicates for each study? Were the same or different samples used to generate the data for RNA sequencing, methylation beadchip analysis, and EM-seq? This clarification is important because if the same cultures were used, this would allow comparisons and correlations within samples.  

      Additional text has been added in the Methods section to indicate that all samples involving cell culture models, which include iPSCs and PGCLCs, came from a single XY iPS cell line aliquoted into replicates, and that all primary cultures, which included Sertoli and granulosa cells, were generated from pooled tissue preps from mice and then aliquoted into replicates. Finally, all experiments in the study were performed on three replicates. Because this experimental design did indeed allow for comparisons among samples, we have added a new Supplement Figure 9, which displays PCA plots showing clustering among control and treatment datasets, respectively, as well as distinctions between each cluster representing each experimental condition.

      (3) In Figure 1, it is interesting that the 50 uM BPS dose mainly resulted in hypermethylation whereas 100 uM appears to be mainly hypomethylation. (This is based on the subjective appearance of graphs). The authors should discuss and/or present these data more quantitatively. For example, what percentage of changes were hypo/hypermethylation for each treatment? How many DMRs did each dose induce? For the RNA-seq results, again, what were the number of up/down-regulated genes for each dose?  

      The experiment shown in Figure 1 was designed to 1) serve as proof of principle that cells maintained in culture could be susceptible to EDC-induced epimutagenesis at all, 2) determine if any response observed would be dose-dependent, and 3) identify a minimally effective dose of BPS to be used for the remaining experiments in this study (which we identified as 1 μM). We agree that it is interesting that the 50 µM dose of BPS induced predominantly hypermethylation changes whereas the 1 µM and 100 µM doses induced predominantly hypomethylation changes, but are not in a position to offer a mechanistic explanation for this outcome at this time. As the results shown satisfied our primary objectives of demonstrating that exposure of cells in culture to BPS could indeed induce DNA methylation epimutations, that this occurs in a dose-dependent manner, and that a dose of as low as 1 µM of BPS was sufficient to induce epimutagenesis, the data obtained satisfied all of the initial objectives of this experiment. That said, in response to the reviewer’s request we have now added text on pages 6-7 alluding to new Supplemental tables 1-3 indicating the total number of DMCs and DMRs, as well as the number of DEGs, detected in response to exposure to each dose of BPS shown in Figure 1, as well as stratifying those results to indicate the numbers of hyper- and hypomethylation epimutations and up- and down-regulated DEGs induced in response to each dose of BPS. While, as noted above, investigating the mechanistic basis for the difference in responses induced by the 50 µM versus 1 and 100 µM doses of BPS was beyond the scope of the study presented in this manuscript, we do find this result reminiscent of the “U-shaped” response curves often observed in toxicology studies. Importantly, this result does demonstrate the elevated resolution and specificity of analysis facilitated by our in vitro cell culture model system.

      (4) Also in Figure 1, were there DMRs or genes in common across the doses? How did DMRs relate to gene expression results? This would be informative in verifying or refuting expectations that greater methylation is often associated with decreased gene expression.  

      In general, we observed a coincidence between changes in DNA methylation and changes in gene expression (Supplement Tables 1-3). Pertaining directly to the reviewer’s question about the extent to which we observed common DMRs and DEGs across all doses, while we only found 3 overlapping DMRs conserved across all doses tested, we did find an average of 51.25% overlap in DMCs and an average of 80.45% overlap in DEGs across iPSCs exposed to the different doses of BPS shown in Figure 1. In addition, within each dose of BPS tested in iPSCs, we also found that there was an overlap between DMCs and the promoters or gene bodies of many DEGs (Supplement Table 4). Specifically within gene promoters, we observed a correlation between hypermethylated DMCs and decreased gene expression and hypomethylated DMCs and increased gene expression, respectively (Supplement Figure 2).

      (5) In Figure 2, was there an overlap in the hypo- and/or hyper-methylated DMCs? Please also add more description of the data in 2b to the legend including what the dot sizes/colors mean, etc. Some readers (including me) may not be familiar with this type of data presentation. Some of this comes up in Figure 4, so perhaps allude to this earlier on, or show these data earlier.  

      We observed an average of 11.05% overlapping DMCs between different pairs of cell types, but we did not observe any DMCs that were shared among all four cell types. Indeed, this limited overlap of DMCs among different cell types exposed to BPS was the primary motivation for the analysis described in Figure 2. Thus, instead of focusing solely on direct overlap between specific DMCs, we examined similarities among the different cell types tested in the occurrence of epimutations within different annotated genomic regions. To better describe this, we have now added additional text to page 9. We have also added more detail to the legend for Figure 2 on page 8 to more clearly explain the significance of the dot sizes and colors, explaining that the dot sizes are indicative of the relative number of differentially methylated probes that were detected within each specific annotated genomic region, and that the dot colors are indicative of the calculated enrichment score reflecting the relative abundance of epimutations occurring within a specific annotated genomic region. The relative score is calculated by iterating down the list of DMCs and increasing a running-sum statistic when encountering a DMC within the specific annotated genomic region of interest and decreasing the sum when the epimutation is not in that annotated region. The magnitude of the increment depends upon the relative occurrence of DMCs within a specific annotated genomic region.
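
The running-sum statistic described above can be illustrated with a minimal sketch. This is not the authors' implementation: the step sizes (1/number of in-region DMCs for a hit, 1/number of out-of-region DMCs for a miss) and the use of the largest deviation from zero as the score are assumptions borrowed from common unweighted enrichment statistics, and the function name is hypothetical.

```python
from typing import Sequence


def running_sum_enrichment(in_region: Sequence[bool]) -> float:
    """Walk down a ranked DMC list and return the largest deviation from zero.

    in_region[i] is True when the i-th DMC lies in the annotated region of
    interest. Step sizes are an assumption: hits add 1/n_hits, misses
    subtract 1/n_miss, so the walk always ends at zero.
    """
    n_hits = sum(in_region)
    n_miss = len(in_region) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0  # no contrast between region and background

    step_up = 1.0 / n_hits
    step_down = 1.0 / n_miss
    running = 0.0
    best = 0.0
    for hit in in_region:
        running += step_up if hit else -step_down
        if abs(running) > abs(best):
            best = running
    return best


# DMCs in the region concentrated at the top of the list give a positive score.
print(running_sum_enrichment([True, True, False, False]))
```

Under these assumptions, a score near +1 means in-region DMCs cluster at the top of the ranked list, and a score near -1 means they cluster at the bottom.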

      (6) iPSCs were derived from male mice MEFs, and subsequently used to differentiate into PGCLCs. The only cell type from an XX female is the granulosa cells. This might be important, and should be mentioned and its potential significance discussed (briefly).  

      We have added a new paragraph just before the final paragraph of the Discussion section in which we acknowledge that most of the cell types analyzed during our study were XY-bearing “male” cells and that the manner in which XX-bearing “female” cells might respond to similar exposures could differ from the responses we observed in XY cells. However, we also noted that our assessment of XX-bearing granulosa cells yielded results very similar to those seen in XY Sertoli cells suggesting that, at least for differentiated somatic cell types, there does not appear to be a significant sex-specific difference in response to exposure to a similar dose of the same EDC. That said, we also acknowledged that in cell types in which dosage compensation based on X-chromosome inactivation is not in place, differences between XY- and XX-bearing cells could accrue.

      (7) EREs are only one type of hormone response element. The authors make the point that other mechanisms of BPS action are independent of canonical endocrine signaling. Would authors please briefly speculate on the possibility that other endocrine pathways including those utilizing AREs or other HREs may play a role? In other words, it may not be endocrine signaling independent. The statement that the differences between PGCLCs and other cells are largely due to the absence of ERs is overly simplistic.  

      Previous reports have indicated that BPS does not have the capacity to bind the androgen receptor (Pelch et al., 2019; Yang et al., 2024). However, there have been reports indicating that BPS can interact with other endocrine receptors, including PPARγ and RXRα, which play a role in lipid accumulation and have the potential to be linked to obesity phenotypes (Gao et al., 2020; Sharma et al., 2018). To address the reviewer’s comment, we assessed the expression of a panel of hormone receptors including PPARγ, RXRα, and AR in each of the cell types examined in our study, and these results are now shown in a new Supplement Figure 4. We show that in addition to not expressing either estrogen receptor (ERa or ERb), germ cells also do not express any of the other endocrine receptors we tested, including AR, PPARγ, and RXRα. Thus, we now note that these results support our suggestion that the induction of epimutations we observed in germ cells in response to exposure to BPS appears to reflect disruption of non-canonical endocrine signaling. We also note that non-canonical endocrine signaling is well established (Brenker et al., 2018; Ozgyin et al., 2015; Song et al., 2011; Thomas and Dong, 2006). Thus, we feel the suggestion that the effects of BPS exposure could conceivably reflect disruption of either canonical or non-canonical signaling in any cell type is well justified, and that our data suggest that both of these effects appear to have accrued in the cells examined in our study, as suggested in the text of our manuscript.

      (8) Interpretation of data from the GO analysis is similarly overly simplistic. The pathways identified and discussed (e.g. PI3K/AKT and ubiquitin-like protease pathways) are involved in numerous functions, both endocrine and non-endocrine. Also, are the data shown in Figure 6a from all 4 cell types? I am confused by the heatmap in 6c, which genes were significantly affected by treatment in which cell types?  

      Per the reviewer’s request, we have added text to indicate that Figure 6a is indeed data from all four cell types examined. We have also modified the text to further clarify that Figure 6c displays the expression of other G protein-coupled receptors which are expressed at similar, if not higher, levels than either ER in all cell types examined, and that these have been shown to have the potential to bind to either 17β-estradiol or BPA in rat models. As alluded to by the reviewer, this is indicative of a wide variety of distinct pathways and/or functions that can potentially be impacted by exposure to an EDC such as BPS. Thus, we have attempted to acknowledge the reviewer’s primary point that BPS may interact with a variety of receptors or other factors involved with a wide variety of different pathways and functions. Importantly, this illustrates the strength of our model system in that it can be used to identify potentially impacted target pathways that can then be pursued further as deemed appropriate.

      (9) In Figure 7, what were the 138 genes? Any commonalities among them? 

      We have now added a new supplemental Excel file that lists the 138 overlapping conserved DEGs that did not become reprogrammed/corrected during the transition from iPSCs to PGCLCs. In addition, we have added new text on page 22 and a new Supplemental Figure 8 which displays KEGG analysis of pathways associated with these 138 retained DEGs. We find that these genes are primarily involved with cell cycle and apoptosis pathways which, interestingly, have the potential to be linked to cancer development which is often linked to disruptions in chromatin architecture.

      (10) The Introduction is very long. The last paragraph, beginning line 105, is a long summary of results and interpretations that better fit in a Discussion section.

      We have now significantly reduced the length and scope of the final paragraph of the Introduction per the reviewer’s recommendation.

      (11) Provide some details on husbandry: e.g. were they bred on-site? What food was given, and how was water treated? These questions are to get at efforts to minimize exposure to other chemicals.  

      We have added text detailing that all mice used in the project were bred onsite, that water was non-autoclaved conventional RO water, and that the mice used in this study were given 5V5R extruded feed, which is highly controlled for the presence of isoflavones and has been certified for use in estrogen-sensitive animal protocols.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript uses cell lines representative of germ line cells, somatic cells, and pluripotent cells to address the question of how the endocrine-disrupting compound BPS affects these various cells with respect to gene expression and DNA methylation. They find a relationship between the presence of estrogen receptor gene expression and the number of DNA methylation and gene expression changes. Notably, PGCLCs do not express estrogen receptors and although they do have fewer changes, changes are nevertheless detected, suggesting a noncanonical pathway for BPS-induced perturbations. Additionally, there was a significant increase in the occurrence of BPS-induced epimutations near EREs in somatic and pluripotent cell types compared to germ cells. Epimutations in the somatic and pluripotent cell types were predominantly in enhancer regions whereas those in the germ cell type were predominantly in gene promoters.

      Strengths:

      The strengths of the paper include the use of various cell types to address the sensitivity of the lineages to BPS as well as the observed relationship between the presence of estrogen receptors and changes in gene expression and DNA methylation.

      Weaknesses:

      The weaknesses include the lack of reporting of replicates, superficial bioinformatic analysis, and the fact that exposures are more complicated in a whole organism than in an isolated cell line.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Overall, this is an intriguing paper but more transparency in the replicates and methods and a more rigorous bioinformatic treatment of the data are required.

      Specific comments:

      (1) End of abstract "These results suggest a unique mechanism by which an EDC-induced epimutated state may be propagated transgenerationally following a single exposure to the causative EDC." This is overly speculative for an abstract. There is only epigenetic inheritance following mitosis or differentiation presented in this study. There is no meiosis and therefore no ability to assess multi- or transgenerational inheritance. 

      We have modified the text at the end of the abstract to more precisely reflect our intended conclusions based on our data. In our view, the ability of induced epimutations to transcend meiosis per se is not as relevant to the mechanism of transgenerational inheritance as their ability to transcend major waves of epigenetic reprogramming that normally occur during development of the germ line. In this regard, the transition from pluripotent iPSCs to germline PGCLCs has been shown to recapitulate at least the first portion of normal germline reprogramming, and now our data provide novel insight into the fate of induced epimutations during this process. Specifically, we show that a prevalence of epimutations was conserved during the iPSC → germ cell transition but that very few (< 5%) of the specific epimutations present in the BPS-exposed iPSCs were retained when those cells were induced to form PGCLCs. Rather, we observed apparent correction of a large majority of the initially induced epimutations during this transition, but this was accompanied by the apparent de novo generation of novel epimutations in the PGCLCs. We suggest, based on other recent reports in the literature, that this is a result of the BPS exposure inducing changes in the chromatin architecture in the exposed iPSCs such that when the normal germline reprogramming mechanism is imposed on this disrupted chromatin template there is both correction of many existing epimutations and the genesis of many novel epimutations. This observation has the potential to explain the long-standing question of why the prevalence of epimutations persists across multiple generations despite the occurrence of epigenetic reprogramming during each generation. Nevertheless, as noted above, we have modified the text at the end of the abstract to temper this interpretation given that it is still somewhat speculative at this point.

      (2) Doses used in the experiments. One needs to be careful when stating that the dose used is "below FDA's suggested safe environmental level established for BPA" because a different bisphenol is being used here (BPA vs BPS) and the safe level is that which the entire organism experiences. It is likely that cell lines experience a higher effective dose.  

      We have now made a point of noting that our reference to an EPA-recommended “safe dose” of BPA was for humans and/or intact animals. Changes to this effect have been made in the second and sixth paragraphs of the Introduction section. In addition, we have added text at the end of the fourth paragraph of the Discussion section acknowledging that, as the reviewer suggests, the same dose of an EDC could exert greater effects on cells in a homogeneous culture than on the same cell type within an intact animal given the potential for mitigating metabolic effects in the latter. However, we also note that the ability we demonstrated to quantify the effects of such exposures on the basis of numbers of epimutations (DMCs or DMRs) induced could potentially be used in future studies to study this question by assessing the effects of a specific dose of a specific EDC on a specific cell type when exposed either within a homogeneous culture or within an intact animal.

      (3) Figure 1: In the dose response, what was the overlap in DMCs and DEGs among the 3 doses? Are the responses additive, synergistic, or completely non-overlapping? This is an important point that should be addressed. 

      Please see our response to Reviewer 1 critique #4 above where we address similar concerns. While we do find overlap among different cell types with respect to the DMCs, DMRs, and DEGs displayed in Figure 1, we found the effect to be only partially additive as opposed to synergistic in any apparent manner. The fold increase in DMCs, DMRs, and DEGs resulting from exposure to a dose of 50 μM versus 1 μM ranged from 2.5x to 4.4x, which was well below the 50x increase that would have been expected from a strictly additive effect, and the effect increased even less, if at all, in response to exposure to doses of 100 μM versus 50 μM BPS. Finally, as now noted in the Discussion section on page 25, our conclusion is that these results display a limited dose-dependent effect that was partially additive but also plateaued at the highest doses tested.
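
The comparison between observed and strictly additive responses can be made explicit with a small worked calculation. This is only an illustrative sketch: it treats "strictly additive" as a response proportional to dose, as the 50x figure in the text implies, and the function name is hypothetical.

```python
def fraction_of_proportional(observed_fold: float,
                             low_dose_um: float,
                             high_dose_um: float) -> float:
    """Observed fold change as a fraction of the fold change expected if the
    response scaled in direct proportion to dose (assumed meaning of
    'strictly additive')."""
    expected_fold = high_dose_um / low_dose_um
    return observed_fold / expected_fold


# Reported range of fold increases between 1 uM and 50 uM BPS: 2.5x to 4.4x.
for fold in (2.5, 4.4):
    share = fraction_of_proportional(fold, low_dose_um=1.0, high_dose_um=50.0)
    print(f"{fold}x observed = {share:.1%} of the 50x proportional expectation")
```

On this reading, the reported increases amount to roughly 5% to 9% of what a response proportional to dose would produce.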

      (4) Methods: How many times was each exposure performed on a given cell type? This information should be in the figure legends and methods. In the case of multiple exposures for a given line, do the biological replicates agree? 

      Please see our response to Reviewer 1 critique #2 where we address similar concerns with newly added text and analysis. We now note repeatedly on pages 39-45 that each analysis was conducted on three replicate samples, and we display the similarity among those replicates graphically in a new Supplement Figure 9.

      (5) DNA methylation analyses. Very little analysis is presented on the BeadChip array other than hypermethylated/hypomethylated and genomic regions of DMCs. What is the range of methylation changes? Does it vary between hypo vs. hyper DMCs? How many array experiments were performed (biological replicates) and what stats were used to determine the DMCs? Are there DMCs in common among the various cell types? As an example of a more meaningful analysis, one can plot the %5mC over a given array for comparisons between control and treated cell types. For more granularity, the %5mC can be presented according to the element type (enhancers vs promoters). 

      Please see our response to Reviewer 1 critique #2 above where we address similar concerns regarding the number of biological replicates used in this study. DMCs on the Infinium array are identified using mixed linear models. This general supervised learning framework identifies CpG loci at which differential methylation is associated with known control vs. treated co-variates. CpG probes on the array were defined as differentially methylated if they met both p-value and FDR (≤ 0.05) significance thresholds between treatment and control samples for each cell type analyzed. The range of medians across all samples was 0.0278 to 0.0059 for hypermethylated beta values and -0.0179 to -0.0033 for hypomethylated beta values. As noted above, we did observe an overlap in DMCs between cell types. Specifically, we observed an average of 11.05% overlapping DMCs between two or more cell types, but we did not observe any DMCs shared between all four cell types. We have added additional text on page 9 and new Supplement Tables 1-4 and Supplement Figure 1 to now more clearly describe that this limited similarity in direct overlap of DMCs was the underlying motivation for the analysis described in Figure 2. Finally, the enrichment dot plots shown in Figure 2 provide the information the reviewer requested regarding the %5mC observed at different annotated genomic element types.
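
      The dual-threshold filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the response does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment is an assumption, and the function names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_dmcs(p_values, alpha=0.05):
    """Indices of probes passing both the raw p-value and the FDR threshold."""
    q_values = benjamini_hochberg(p_values)
    return [
        i
        for i, (p, q) in enumerate(zip(p_values, q_values))
        if p <= alpha and q <= alpha
    ]


# Hypothetical per-probe p-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
significant = call_dmcs(example_p)
print(significant)
```

      In this toy input, two probes with raw p-values below 0.05 fail the FDR threshold after adjustment, which is why requiring both criteria is stricter than the raw p-value alone.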

      (6) The investigators correlate the number of DMCs in a given cell type with the presence of estrogen receptors. Does the correlation extend to the methylation difference (delta beta) at the statistically different probes?

      We have added a new Supplement Figure 3 in which we provide data addressing this question. In brief, we find that the delta betas of probes enriched at enhancer regions and associated with relative proximity to ERE elements in Sertoli cells, granulosa cells, and iPSCs appear very similar to those associated with DMCs not located within these enriched regions. However, when we compared the similarity of the two data sets with goodness of fit tests, we found these relatively small differences were, in fact, statistically significant based on a two-sample Kolmogorov-Smirnov test. These observed significant differences appear to indicate that there is higher variability among the delta betas associated with hypomethylation, but not hypermethylation, changes occurring at DMCs associated with enhancers, potentially suggesting a greater tendency for exposure to BPS to induce hypomethylation rather than hypermethylation changes, at least in these specific regions.
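
      The two-sample Kolmogorov-Smirnov comparison mentioned above can be sketched as follows. This is a generic illustration of the statistic and its large-sample critical value, not the authors' analysis code; the function names and sample values are hypothetical.

```python
from bisect import bisect_right
from math import log, sqrt


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions, so the maximum gap is
    # reached at one of the observed data points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value; reject equality if the statistic exceeds it."""
    c_alpha = sqrt(-log(alpha / 2.0) / 2.0)
    return c_alpha * sqrt((n + m) / (n * m))


# Hypothetical delta-beta values, for illustration only.
inside_enriched = [0.1, 0.2, 0.3, 0.4]
outside_enriched = [0.25, 0.35, 0.45, 0.55]
print(ks_statistic(inside_enriched, outside_enriched))
print(ks_critical_value(10_000, 10_000))
```

      Because the critical value shrinks as the number of probes grows, distributions that look very similar by eye can still differ significantly when many DMCs are compared, which fits the pattern described in the response.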

      (7) Methylation changes relative to EREs are presented in multiple figures. Are other sequences enriched in the DMCs? 

      We profiled the genomic sequence within 500 bp of cell type-specific enriched DMCs that were either associated with enhancer regions in Sertoli, granulosa, or iPS cells or transcription factor binding sites in PGCLCs for the identification of higher abundance motif sequences. We then compared any motifs identified with the JASPAR database to potentially find transcription factors that could be binding to these regions. Interestingly, we found that the two most common motifs across all cell types were associated with either the chromatin remodeling transcription factor HMG1A or the pluripotency factor KLF4.

      (8) Please present a correlation plot between the methylation differences and the adjacent DEGs. Again, the absence of consideration of the absolute changes in methylation and gene expression minimizes the impact of the data. 

      We analyzed the relationship between DMCs at DEG promoter regions and the corresponding change in expression of that DEG. Our data support a relationship between up-regulated genes showing decreased methylation in promoter regions and down-regulated genes showing increased methylation at promoter regions, although there were some exceptions to this relationship.

      (9) EM-Seq is mentioned in Figure 7 and in the material and methods. Where is it used in this study? 

      We now note in the text on page 22 that EM-seq was used during experiments assessing the propagation of BPS-induced epimutations during the iPSC → EpiLC → PGCLC cell state transitions to gather higher resolution data of changes to DNA methylation differences at the whole-epigenome level.

      References

      Brenker C, Rehfeld A, Schiffer C, Kierzek M, Kaupp UB, Skakkebæk NE, Strünker T. 2018. Synergistic activation of CatSper Ca2+ channels in human sperm by oviductal ligands and endocrine disrupting chemicals. Hum Reprod 33:1915–1923. doi:10.1093/humrep/dey275

      Gao P, Wang L, Yang N, Wen J, Zhao M, Su G, Zhang J, Weng D. 2020. Peroxisome proliferator-activated receptor gamma (PPARγ) activation and metabolism disturbance induced by bisphenol A and its replacement analog bisphenol S using in vitro macrophages and in vivo mouse models. Environ Int 134. doi:10.1016/J.ENVINT.2019.105328

      Ozgyin L, Erdos E, Bojcsuk D, Balint BL. 2015. Nuclear receptors in transgenerational epigenetic inheritance. Prog Biophys Mol Biol. doi:10.1016/j.pbiomolbio.2015.02.012

      Pelch KE, Li Y, Perera L, Thayer KA, Korach KS. 2019. Characterization of Estrogenic and Androgenic Activities for Bisphenol A-like Chemicals (BPs): In Vitro Estrogen and Androgen Receptors Transcriptional Activation, Gene Regulation, and Binding Profiles. Toxicol Sci 172:23–37. doi:10.1093/TOXSCI/KFZ173

      Sharma S, Ahmad S, Khan MF, Parvez S, Raisuddin S. 2018. In silico molecular interaction of bisphenol analogues with human nuclear receptors reveals their stronger affinity vs. classical bisphenol A. Toxicol Mech Methods 28:660–669. doi:10.1080/15376516.2018.1491663

      Song K-H, Lee K, Choi H-S. 2011. Endocrine Disrupter Bisphenol A Induces Orphan Nuclear Receptor Nur77 Gene Expression and Steroidogenesis in Mouse Testicular Leydig Cells. Endocrinology 143:2208–2215. doi:10.1210/endo.143.6.8847

      Thomas P, Dong J. 2006. Binding and activation of the seven-transmembrane estrogen receptor GPR30 by environmental estrogens: A potential novel mechanism of endocrine disruption. J Steroid Biochem Mol Biol 102:175–179. doi:10.1016/j.jsbmb.2006.09.017

      Yang Z, Wang L, Yang Y, Pang X, Sun Y, Liang Y, Cao H. 2024. Screening of the Antagonistic Activity of Potential Bisphenol A Alternatives toward the Androgen Receptor Using Machine Learning and Molecular Dynamics Simulation. Environ Sci Technol 58:2817–2829. doi:10.1021/acs.est.3c09779

    2. eLife assessment

      The findings are valuable and supported by compelling evidence from deep sequencing and bioinformatics analyses. The strength of this work lies in the comprehensive analysis of different cells representing various life stages, exposing vulnerabilities to EDCs and relating epimutations to specific genomic regulatory regions. Despite the small sample size, the results make a major contribution to the field and provide novel insight into the emergence and correction of epimutations during epigenetic programming and into the processes underlying multigenerational effects of EDCs.

    3. Reviewer #1 (Public Review):

      In this revised manuscript, authors have conducted epigenetic and transcriptomic profiling to understand how environmental chemicals such as BPS can cause epimutations that can propagate to future generations. They used isolated somatic cells from mice (Sertoli, granulosa), pluripotent cells to model preimplantation embryos (iPSCs) and cells to model the germline (PGCLCs). This enabled them to model sequential steps in germline development, and when/how epimutations occur. The major findings were that BPS induced unique epimutations in each cell type, albeit with qualitative and quantitative cell-specific differences; that these epimutations are prevalent in regions associated with estrogen-response elements (EREs); and that epimutations induced in iPSCs are corrected as they differentiate into PGCLCs, concomitant with the emergence of de novo epimutations. This study will be useful in understanding multigenerational effects of EDCs, and underlying mechanisms.

      Strengths include:

      (1) Using different cell types representing life stages of epigenetic programming and during which exposures to EDCs have different effects. This progression revealed information both about correction of epimutations and the emergence of new ones in PGCLCs.

      (2) Work conducted by exposing iPSCs to BPS or vehicle, then differentiating to PGCLCs, revealed that novel epimutations emerged.

      (3) Relating epimutations to promoter and enhancer regions.

      A few weaknesses remain: Authors need to discuss the limitations of the small sample size. The supplemental data, while extremely helpful, requires better organization.

    4. Reviewer #2 (Public Review):

      Summary:

      This manuscript uses cell lines representative of germ line cells, somatic cells and pluripotent cells to address the question of how the endocrine disrupting compound BPS affects these various cells with respect to gene expression and DNA methylation. They find a relationship between the presence of estrogen receptor gene expression and the number of DNA methylation and gene expression changes. Notably, PGCLCs do not express estrogen receptors and although they do have fewer changes, changes are nevertheless detected, suggesting a noncanonical pathway for BPS-induced perturbations. Additionally, there was a significant increase in the occurrence of BPS-induced epimutations near EREs in somatic and pluripotent cell types compared to germ cells. Epimutations in the somatic and pluripotent cell types were predominantly in enhancer regions whereas those in the germ cell type were predominantly in gene promoters.

      Strengths:

      The strengths of the paper include the use of various cell types to address sensitivity of the lineages to BPS as well as the observed relationship between the presence of estrogen receptors and changes in gene expression and DNA methylation.

      Weaknesses:

      The weakness includes the fact that exposures are more complicated in a whole organism than in an isolated cell line.

    1. Author response:

      eLife assessment

      This manuscript reports an important finding that the transcription factor Scleraxis regulates regenerative myogenesis by controlling the proliferation and differentiation of muscle stem cells. The evidence presented is compelling and supports the conclusions and the mechanisms by which this gene regulates satellite cell function. These data will be of interest to developmental, transcriptional, and stem cell biologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      This manuscript by Bai et al concerns the expression of Scleraxis (Scx) by muscle satellite cells (SCs) and the role of that gene in regenerative myogenesis. The authors report the expression of this gene associated with tendon development in satellite cells. Genetic deletion of Scx in SCs impairs muscle regeneration, and the authors provide evidence that SCs deficient in Scx are impaired in terms of population growth and cellular differentiation. Overall, this report provides evidence of the role of this gene, unexpectedly, in SC function and adult regenerative myogenesis.

      We appreciate the comments and thank the reviewer for the support of our manuscript.

      There are a few minor points of concern.

      (1) From the data in Figure 1, it appears that all of the SCs, assessed both in vitro and in vivo, express Scx. The authors refer to a scRNA-seq dataset from their lab and one report from mdx mouse muscle that also reveals this unexpected gene expression pattern. Has this been observed in many other scRNA-seq datasets? If not, it would be important to discuss potential explanations as to why this has not been reported previously.

      Thanks for this question regarding data in Figure 1. We did initially use immunofluorescence staining of Pax7 and GFP on muscle sections and primary myoblast cultures prepared from Tg-ScxGFP mice to conclude that Scx was expressed in satellite cells (SCs). In addition to the cited mdx RNA-seq data, we have included a re-analysis of a published scRNA-seq data set in Figure 2E (Dell'Orso, Juan et al., Development, 2019), and our own scRNA-seq data (Figure S5D, F). We have also re-examined an additional scRNA-seq data set of TA muscles at various regeneration time points (De Micheli et al., Cell Rep. 2020), in which Scx expression was detected in MuSC progenitors and mature muscle cells (in addition to tenocytes). Thus, our immunostaining results are consistent with scRNA-seq data from our and two other independent scRNA-seq data sets.

      We think that Scx expression in the adult myogenic lineage was not previously reported mainly because its expression level was low and might have been dismissed as spurious detection. Additionally, detecting such low expression levels requires sophisticated detection methods with high capture efficiency. Previous studies have noted limitations in transcript capture or transcription factor dropout in 10x Genomics-based datasets (Lambert et al., Cell, 2018; Pokhilko et al., Genome Res., 2021). Alternatively, Scx may simply not have been a focus in prior studies amid other genes of interest. Our specific focus on Scx has led us to evaluate its expression in these data sets. We will add the above cited scRNA-seq data set (De Micheli et al., Cell Rep. 2020) and provide a discussion in the revised version.

      (2) A major point of the paper, as illustrated in Fig. 3, is that Scx-neg SCs fail to produce normal myofibers and renewed SCs following injury/regeneration. They mention in the text that there was no increased PCD by Caspase staining at 5 DPI. A failure of cell survival during the process of SC activation, proliferation, and cell fate determination (differentiation versus self-renewal) would explain most of the in vivo data. As such, this conclusion would seem to warrant a more detailed analysis in terms of at least one or two other time points and an independent method for detecting dead/dying cells (the in vitro data in Fig. 4F is also based on an assessment of activated Caspase to assess cell death). The in vitro data presented later in Fig. S4G, H do suggest an increase in cell loss during proliferative expansion of Scx-neg SCs. To what extent does cell loss (by whatever mechanism of cell death) explain both the in vivo findings of impaired regeneration and even the in vitro studies showing slower population expansion in the absence of Scx?

      We appreciate these constructive suggestions. Additional methods and different time points should be helpful in investigating SC cell loss in ScxcKO. Based on the number of available cKO animals, we will carefully choose additional time point(s) to assess PCD, using anti-active Caspase-3 immunostaining and another independent method (e.g., TUNEL). Although the outcomes are uncertain, we will endeavor to obtain meaningful data from these experiments.

      (3) I'm not sure I understand the description of the data or the conclusions in the section titled "Basement membrane-myofiber interaction in control and Scx cKO mice". Is there something specific to the regeneration from Scx-neg myogenic progenitors, or would these findings be expected in any experimental condition in which myogenesis was significantly delayed, with much smaller fibers in the experimental group at 5 DPI?

      We very much appreciate this comment. We agree that there is unlikely anything specific about the regeneration from Scx-negative myogenic progenitors. Unfilled or empty ghost fibers (basement membrane remnant) are to be expected due to the small fiber and poor regeneration in the ScxcKO mice at 5 dpi. We will correct the subtitle and content accordingly.

      (4) The data presented in Fig. 4B showing differences in the purity of SC populations isolated by FACS depending on the reporter used are interesting and important for the field. The authors offer the explanation of exosomal transfer of Tdt from SCs to non-SCs. The data are consistent with this explanation, but no data are presented to support this. Are there any other explanations that the authors have considered and that could be readily tested?

      Thanks for highlighting this phenomenon. We struggled with the SC purity issue for a long time. The project started with the R26RtdT reporter because tdT's strong fluorescence, which is resistant to paraformaldehyde fixation, aids visualization in vivo. Later, when we used the tdT signal to purify SCs by FACS, we found that only 80% of sorted tdT+ cells were Pax7+. We then switched to the R26RYFP reporter, from which we achieved much higher purity (95%) of SCs (Pax7+) by FACS. As such, we also repeated and confirmed many in vivo experimental results using the R26RYFP reporter (included in the manuscript). Due to the low purity of tdT+ SCs by FACS, we discontinued that mouse colony after we confirmed the superior utility of the R26RYFP reporter for SC isolation.

      We sincerely apologize for not being able to conduct further testable experiments on this intriguing phenomenon. However, this issue has since been addressed and published by Murach et al., iScience, (2021). Like us, they found non-satellite mononuclear cells with tdT fluorescence after TMX treatment when SCs were isolated via FACS. To rule out off-target recombination or a technical artifact from tissue processing, they conducted extensive analyses. They found that the tdT+ mononuclear cells included fibrogenic cells (fibroblasts and FAPs), immune cells/macrophages, and endothelial cells. Additionally, they confirmed the significant potential of extracellular vesicle (EV)-mediated cargo transfer, which facilitates the transfer of full-length tdT transcript from lineage-marked Pax7+ cells to those mononuclear cells. We will modify our text to include and acknowledge their contribution to this important point.

      (5) The Cut&Run data of Fig. 6 certainly provide evidence of direct Scx targets, especially since the authors used a novel knock-in strain for analyses. The enrichment of E-box motifs provides support for the 207 intersecting genes (scRNA-seq and Cut&Run) being direct targets. However, the rationale elaborated in the final paragraph of the Results section proposing how 4 of these genes account for the phenotypes on the Scx-neg cells and tissues is just speculation, however reasonable. These are not data, and these considerations would be more appropriate in the Discussion in the absence of any validation studies.

      We agree with this comment and will move this speculation into the discussion.

      Reviewer #2 (Public Review):

      Summary:

      Scx is a well-established marker for tenocytes, but the expression in myogenic-lineage cells was unexplored. In this study, the authors performed lineage-trace and scRNA-seq analyses and demonstrated that Scx is expressed in activated SCs. Further, the authors showed that Scx is essential for muscle regeneration using conditional KO mice and identified the target genes of Scx in myogenic cells, which differ from those of tendons.

      Strengths:

      Sometimes, lineage-trace experiments cause mis-expression and do not reflect the endogenous expression of the target gene. In this study, the authors carefully analyzed the unexpected expression of Scx in myogenic cells using some mouse lines and scRNA-seq data.

      We appreciate the comments and thank the reviewer for noting the strengths of our manuscript.

      Weaknesses:

      Scx protein expression has not been verified.

      We are aware of this weakness. We had previously performed Western blotting (WB) on cultured SCs from control and ScxcKO mice, but did not detect endogenous Scx protein in the control. Hence, we used ScxCreERT2 lineage-tracing, Tg-ScxGFP expression, and the ScxTy1 knock-in allele as complementary, even though indirect, ways to address this issue. Following the reviewer’s comment, we will purchase new anti-Scx antibodies and re-perform WB using cultured SCs. If the new antibodies fail to detect endogenous Scx by WB, we will then use immunofluorescence staining to detect endogenous Scx protein.

    2. eLife assessment

      This manuscript reports an important finding that the transcription factor Scleraxis regulates regenerative myogenesis by controlling the proliferation and differentiation of muscle stem cells. The evidence presented is compelling and supports the conclusions and the mechanisms by which this gene regulates satellite cell function. These data will be of interest to developmental, transcriptional, and stem cell biologists.

    3. Reviewer #1 (Public Review):

      This manuscript by Bai et al concerns the expression of Scleraxis (Scx) by muscle satellite cells (SCs) and the role of that gene in regenerative myogenesis. The authors report the expression of this gene associated with tendon development in satellite cells. Genetic deletion of Scx in SCs impairs muscle regeneration, and the authors provide evidence that SCs deficient in Scx are impaired in terms of population growth and cellular differentiation. Overall, this report provides evidence of the role of this gene, unexpectedly, in SC function and adult regenerative myogenesis.

      There are a few minor points of concern.

      (1) From the data in Figure 1, it appears that all of the SCs, assessed both in vitro and in vivo, express Scx. The authors refer to a scRNA-seq dataset from their lab and one report from mdx mouse muscle that also reveals this unexpected gene expression pattern. Has this been observed in many other scRNA-seq datasets? If not, it would be important to discuss potential explanations as to why this has not been reported previously.

      (2) A major point of the paper, as illustrated in Fig. 3, is that Scx-neg SCs fail to produce normal myofibers and renewed SCs following injury/regeneration. They mention in the text that there was no increased PCD by Caspase staining at 5 DPI. A failure of cell survival during the process of SC activation, proliferation, and cell fate determination (differentiation versus self-renewal) would explain most of the in vivo data. As such, this conclusion would seem to warrant a more detailed analysis in terms of at least one or two other time points and an independent method for detecting dead/dying cells (the in vitro data in Fig. 4F is also based on an assessment of activated Caspase to assess cell death). The in vitro data presented later in Fig. S4G, H do suggest an increase in cell loss during proliferative expansion of Scx-neg SCs. To what extent does cell loss (by whatever mechanism of cell death) explain both the in vivo findings of impaired regeneration and even the in vitro studies showing slower population expansion in the absence of Scx?

      (3) I'm not sure I understand the description of the data or the conclusions in the section titled "Basement membrane-myofiber interaction in control and Scx cKO mice". Is there something specific to the regeneration from Scx-neg myogenic progenitors, or would these findings be expected in any experimental condition in which myogenesis was significantly delayed, with much smaller fibers in the experimental group at 5 DPI?

      (4) The data presented in Fig. 4B showing differences in the purity of SC populations isolated by FACS depending on the reporter used are interesting and important for the field. The authors offer the explanation of exosomal transfer of Tdt from SCs to non-SCs. The data are consistent with this explanation, but no data are presented to support this. Are there any other explanations that the authors have considered and that could be readily tested?

      (5) The Cut&Run data of Fig. 6 certainly provide evidence of direct Scx targets, especially since the authors used a novel knock-in strain for analyses. The enrichment of E-box motifs provides support for the 207 intersecting genes (scRNA-seq and Cut&Run) being direct targets. However, the rationale elaborated in the final paragraph of the Results section proposing how 4 of these genes account for the phenotypes on the Scx-neg cells and tissues is just speculation, however reasonable. These are not data, and these considerations would be more appropriate in the Discussion in the absence of any validation studies.

    4. Reviewer #2 (Public Review):

      Summary:

      Scx is a well-established marker for tenocytes, but the expression in myogenic-lineage cells was unexplored. In this study, the authors performed lineage-trace and scRNA-seq analyses and demonstrated that Scx is expressed in activated SCs. Further, the authors showed that Scx is essential for muscle regeneration using conditional KO mice and identified the target genes of Scx in myogenic cells, which differ from those of tendons.

      Strengths:

      Sometimes, lineage-trace experiments cause mis-expression and do not reflect the endogenous expression of the target gene. In this study, the authors carefully analyzed the unexpected expression of Scx in myogenic cells using some mouse lines and scRNA-seq data.

      Weaknesses:

      Scx protein expression has not been verified.

    1. eLife assessment

      This article describes a novel mechanism allowing the insect Drosophila to combat pathogenic enteric pathogens while preserving the beneficial indigenous microbiota. The authors provide compelling evidence that oral infection of Drosophila larvae by pathogenic bacteria activate a valve that traps the intruders in the anterior midgut, allowing them to be killed by antimicrobial peptides. This important work substantially advances our understanding of pathogen clearance in the insect gut.

    2. Reviewer #1 (Public Review):

      Tleiss et al. demonstrate that while commensal Lactiplantibacillus plantarum freely circulate within the intestinal lumen, pathogenic strains such as Erwinia carotovora or Bacillus thuringiensis are blocked in the anterior midgut where they are rapidly eliminated by antimicrobial peptides. This sequestration of pathogenic bacteria in the anterior midgut requires the Duox enzyme in enterocytes, and both TrpA1 and Dh31 in enteroendocrine cells. This effect induces muscle contraction, which is marked by the formation of TARM structures (thoracic alary-related muscles). This contraction-related blocking happens early after infection (15 min). In contrast, the clearance of bacteria is carried out by the IMD pathway, possibly through antimicrobial peptide production, although this pathway is dispensable for the blockage. Genetic manipulations impairing bacterial compartmentalization result in abnormal colonization of posterior midgut regions by pathogenic bacteria. Despite a functional IMD pathway, this ectopic colonization leads to bacterial proliferation and larval death, demonstrating the critical role of bacteria anterior sequestration in larval defense.

      This important work substantially advances our understanding of the process of pathogen clearance by identifying a new mode of pathogen eradication from the insect gut. The evidence supporting the authors' claims is solid and would benefit from more rigorous experiments.

      (1) The authors performed the experiments on Drosophila larvae. I wonder whether this model could extend to adult flies since they have shown that the ROS/TRPA1/Dh31 axis is important for gut muscle contraction in adult flies. If not, how would the authors explain the discrepancy between larvae and adults?

      (2) The authors performed their experiments and proposed the models based on two pathogenic bacteria and one commensal bacterium at a relatively high bacterial dose. They showed that feeding Bt at 2 × 10^10 or Ecc15 at 4 × 10^8 did not induce a blockage phenotype. I wonder whether larvae die under conditions of enteric infection with low concentrations of pathogenic bacteria. If larvae do not show mortality, what is the mechanism for resisting low concentrations of pathogenic bacteria? Why is this model only applied to high-dose infections?

      (3) The authors claim that the blockage of bacteria happens at 15 minutes while killing by AMPs happens 6-8 hours later. What happened during this period? More importantly, is IMD activity induced in the anterior region of the larval gut in both Ecc15 and Bt infection at 6 hours after infection? Are they mostly expressed in the anterior midgut in both bacterial infections? Several papers have shown quite different IMD activity patterns in the Drosophila gut. Zhai et al. have shown that in adult Drosophila, IMD activity was mostly absent in the R2 region as indicated by dpt-lacZ. Vodovar et al. have shown that the expression of dpt-lacZ is observable in proventriculus while Pe is not in the same region. Tzou et al. showed that Ecc15 infection induced IMD activity in the anterior midgut 24 hours after infection. Using TrpA1 and Dh31 mutant, the authors found both Ecc15 and Bt in the posterior midgut. Why are they not evenly distributed along the gut? Last but not least, does the ROS/TrpA1/Dh31 axis affect AMP expression?

      (4) The TARM structure part is quite interesting. However, the authors did not show its relevance in their model. Is this structure the key driving force for the blocking phenotype and killing phenotype? Is the ROS/TrpA1/Dh31 axis required to form this structure?

    3. Reviewer #2 (Public Review):

      This article describes a novel mechanism of host defense in the gut of Drosophila larvae. Pathogenic bacteria trigger the activation of a valve that blocks them in the anterior midgut where they are subjected to the action of antimicrobial peptides. In contrast, beneficial symbiotic bacteria do not activate the contraction of this sphincter, and can access the posterior midgut, a compartment more favorable to bacterial growth.

      Strengths:

      The authors decipher the underlying mechanism of sphincter contraction, revealing that ROS production by Duox activates the release of DH31 by enteroendocrine cells that stimulate visceral muscle contractions. The use of mutations affecting the Imd pathway or lacking antimicrobial peptides reveals their contribution to pathogen elimination in the anterior midgut.

      Weaknesses:

      - The mechanism allowing the discrimination between commensal and pathogenic bacteria remains unclear.

      - The use of only two pathogens and one symbiotic species may not be sufficient to draw a conclusion on the difference in treatment between pathogenic and symbiotic species.

      - We can also wonder how the process of sphincter contraction is affected by the procedure used in this study, where larvae are starved. Does the sphincter contraction occur in continuous feeding conditions? Since larvae are continuously feeding, is this process physiologically relevant?

    1. eLife assessment

      This study represents valuable findings on the asymmetric connectivity pattern of two different types of CA3 pyramidal cell types showing that while athorny cells receive strong inputs from all other cell types, thorny cells receive weaker inputs from athorny neurons. Computational modeling is used to evaluate the impact of this connectivity scheme on the sequential activation of different cell types during sharp wave ripples. The experimental evidence supporting the authors' claims is solid, although improvements to the modelling aspect of the study would strengthen the study.

    2. Reviewer #1 (Public Review):

      Summary:

      Sammons, Masserini et al. examine the connectivity of different types of CA3 pyramidal cells ("thorny" and "athorny"), and how their connectivity putatively contributes to their relative timing in sharp-wave-like activity. First, using patch-clamp recordings, they characterize the degree of connectivity within and between athorny and thorny cells. Based upon these experimental results, they compute a synaptic product matrix, and use this to inform a computational model of CA3 activity. This model finds that this differential connectivity between these populations, augmented by two different types of inhibitory neurons, can account for the relative timing of activity observed in sharp waves in vivo.

      Strengths:

      The patch-clamp experiments are exceptionally thorough and well done. These are very challenging experiments and the authors should be commended for their in-depth characterization of CA3 connectivity.

      Weaknesses:

      (1) The computational elements of this study feel underdeveloped. Whereas the authors do a thorough job experimentally characterizing connections between excitatory neurons, the inhibitory neurons used in the model seem to be effectively "fit neurons" and appear to have been tuned to produce the emergent properties of CA3 sharp wave-like activity. Although I appreciate the goal was to implicate CA3 connectivity contributions to activity timing, a stronger relationship seems like it could be examined. For example, did the authors try to "break" their model? It would be informative if they attempted different synaptic product matrices (say, the juxtaposition of their experimental product matrix) and saw whether experimentally-derived sequential activity could not be elicited. It seems as though this spirit of analysis was examined in Figure 4C, but only insofar as individual connectivity parameters were changed in isolation.

      (2) Additional explanations of how parameters for interneurons were incorporated in the model would be very helpful. As it stands, it is difficult to understand the degree to which the parameters of these neurons are biologically constrained versus used as fit parameters to produce different time windows of activity in types of CA3 pyramidal cells.

    3. Reviewer #2 (Public Review):

      Sharp wave ripples are transient oscillations occurring in the hippocampus that are thought to play an important role in organising temporal sequences during the reactivation of neuronal activity. This study addresses the mechanism by which these temporal sequences are generated in the CA3 region focusing on two different subtypes of pyramidal neurons, thorny and athorny. Using high-quality electrophysiological recordings from up to 8 pyramidal neurons at a time the authors measure the connectivity rates between these pyramidal cell subtypes in a large dataset of 348 cells. This is a significant achievement and provides important data. The most striking finding is how similar connection characteristics are between cell types. There are no differences in synaptic strength or failure rates and some small differences in connectivity rates and short-term plasticity. Using model simulations, the authors explore the implications of the differences in connectivity rates for the temporal specificity of pyramidal cell firing within sharp-wave ripple events. The simulations show that the experimentally observed connectivity rates may contribute to the previously observed temporal sequence of pyramidal cell firing during sharp wave ripples.

      The conclusions drawn from the simulations are not experimentally tested so remain theoretical. In the simple network model, the authors include basket cell and anti-SWR interneurons but the connectivity of these cell types is not measured experimentally and variations in interneuron parameters may also influence temporal specificity of firing. In addition, the influence of short-term plasticity measured in their experiments is not tested in the model. Interestingly, the experimental data reveal a large variability in many of the measured parameters. This may strongly influence the firing of pyramidal cells during SWRs but it is not represented within the model which uses the averaged data.

    4. Reviewer #3 (Public Review):

      Summary:

      The hippocampal CA3 region is generally considered to be the primary site of initiation of sharp wave ripples, highly synchronous population events involved in learning and memory, although the precise mechanism remains elusive. A recent study revealed that CA3 comprises two distinct pyramidal cell populations: thorny cells that receive mossy fiber input from the dentate gyrus, and athorny cells that do not. That study also showed that it is athorny cells in particular that play a key role in sharp wave initiation. In the present work, Sammons, Masserini, and colleagues expand on this by examining the connectivity probabilities among and between thorny and athorny cells. First, using whole-cell patch clamp recordings, they find an asymmetrical connectivity pattern, with athorny cells receiving the most synaptic connections from both athorny and thorny cells, and thorny cells receiving fewer. They then demonstrate in spiking neural network simulations how this asymmetrical connectivity may underlie the preferential role of athorny cells in sharp wave initiation.

      Strengths:

      The authors provide independent validation of some of the findings by Hunt et al. (2018) concerning the distinction between thorny and athorny pyramidal cells in CA3 and advance our understanding of their differential integration in CA3 microcircuits. The properties of excitatory connections among and between thorny and athorny cells described by the authors will be key in understanding CA3 functions including, but not limited to, sharp wave initiation.

      As stated in the paper, the modeling results lend support to the idea that the increased excitatory connectivity towards athorny cells plays a key role in causing them to fire before thorny cells in sharp waves. More generally, the model adds to an expanding pool of models of sharp wave ripples which should prove useful in guiding and interpreting experimental research.

      Weaknesses:

      The mechanism by which athorny cells initiate sharp waves in the model is somewhat confusingly described. As far as I understood, random fluctuations in the activities of A and B neurons provide windows of opportunity for pyramidal cells to fire if they have additionally recovered from adaptive currents. Thorny and athorny pyramidal cells are then set in a winner-takes-all competition which is quickly won by the athorny cells. The main thesis of the paper seems to be that athorny cells win this competition because they receive more inputs both from themselves and from thorny cells, hence, the connectivity "underlies the sequential activation". However, it is also stated that athorny cells activate first due to their lower rheobase and steeper f-I curve, and it is also indicated in the methods that athorny (but not thorny) cells fire in bursts. It seems that it is primarily these features that make them fire first, something which apparently happens even when the A to A connectivity is set to 0, albeit with a very small lag. Perhaps the authors could further clarify the differential role of single cell and network parameters in determining the sequential activation of athorny and thorny cells. Is the role of asymmetric excitatory connectivity only to enhance the initial intrinsic advantage of athorny cells? If so, could this advantage also be enhanced in other ways?

      Although a clear effort has been made to constrain the model with biological data, too many degrees of freedom remain that allow the modeler to make arbitrary decisions. This is not a problem in itself, but perhaps the authors could explain more of their reasoning and expand upon the differences between their modeling choices and those of others. For example, what are the conceptual or practical advantages of using adaptation in pyramidal neurons as opposed to short-term synaptic plasticity as in the model by Hunt et al.? Relatedly, what experimental observations could validate or falsify the proposed mechanisms?

      In the data by Hunt et al., thorny cells have a higher baseline (non-SPW) firing rate, and it is claimed that it is actually stochastic correlations in their firing that are amplified by athorny cells to initiate sharp waves. However, in the current model, the firing of both types of pyramidal cells outside of ripples appears to be essentially zero. Can the model handle more realistic firing rates as described by Hunt et al., or as produced by e.g., walking around an environment tiled with place cells, or would that trigger SPWs continuously?

    1. eLife assessment

      This important work describes the activation of astrocytes via the nuclear translocation of PKM2 in an animal model of multiple sclerosis. This study provides solid evidence of the interaction between TRIM21 and PKM2 as the crucial molecular event leading to the translocation of PKM2 and a metabolic shift towards glycolysis dominance, fostering proliferation in stimulated astrocytes. This finding is significant as it underscores the potential of targeting glycolytic metabolism to mitigate neurological diseases mediated by astrocytes, offering a strong rationale for potential therapeutic interventions. However, control experiments and imaging analyses with higher magnification images should be performed to better support the main claims of the study.

    1. eLife assessment

      This study reports an important discovery highlighting the essential role of the putative ion channel, TMC7, in acrosome formation during sperm development and thus male fertility. The evidence for the requirement of TMC7 in acrosome biogenesis and sperm function is convincing, although its function as an ion channel remains to be further determined. Overall, this work will be of great interest to developmental biologists and ion channel physiologists alike.

    2. Reviewer #1 (Public Review):

      Summary:

      TMC7 knockout mice were generated by the authors and the phenotype was analyzed. They found that Tmc7 is localized to Golgi and is needed for acrosome biogenesis.

      Strengths:

      The phenotype of infertility is clear, and the results of TMC7 localization and the failed acrosome formation are highly reliable. In this respect, they made a significant discovery regarding spermatogenesis.

      In the original version, I pointed out the gap between their pH/calcium imaging data and the hypothesis of ion channel function of TMC7 in the Golgi. Now the author agrees and has changed the description to be reasonable. Additional experiments were also performed, and I can say that they have answered my concern adequately.

      It would be good to add any presumed mechanism for the observed changes in pH and calcium concentration in the cytoplasm.

    3. Reviewer #2 (Public Review):

      Summary:

      This study presents a significant finding that enhances our understanding of spermatogenesis. TMC7 belongs to a family of transmembrane channel-like proteins (TMC1-8), primarily known for their role in the ear. Mutations to TMC1/2 are linked to deafness in humans and mice and were originally characterized as auditory mechanosensitive ion channels. However, the function of the other TMC family members remains poorly characterized. In this study, the authors begin to elucidate the function of TMC7 in acrosome biogenesis during spermatogenesis. Through analysis of transcriptomics datasets, they identify elevated levels of TMC7 in round spermatids in both mouse and human testis. They then generate Tmc7-/- mice and find that male mice exhibit smaller testes and complete infertility. Examination of different developmental stages reveals spermatogenesis defects, including reduced sperm count, elongated spermatids and large vacuoles. Additionally, abnormal acrosome morphology is observed beginning at the early-stage Golgi phase, indicating TMC7's involvement in proacrosomal vesicle trafficking and fusion. They observed localization of TMC7 in the cis-Golgi and suggest that its presence is required for maintaining Golgi integrity, with Tmc7-/- leading to reduced intracellular Ca2+, elevated pH and increased ROS levels, likely resulting in spermatid apoptosis. Overall, the work delineates a new function of TMC7 in spermatogenesis and the authors propose that its ion channel and/or scramblase activity is likely important for Golgi homeostasis. This work is of significant interest to the community and is of high quality.

      Strengths:

      The biggest strength of the paper is the phenotypic characterization of the TMC7-/- mouse model, which has clear acrosome biogenesis/spermatogenesis defects. This is the main claim of the paper and it is supported with the data that are presented.

      Weaknesses:

      It isn't clear whether TMC7 functions as an ion channel from the current data presented in this paper, but the authors are careful in their interpretation and present this merely as a hypothesis supporting this idea.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, Wang et al. have demonstrated that TMC7, a testis-enriched multipass transmembrane protein, is essential for male reproduction in mice. Tmc7 KO male mice are sterile due to reduced sperm count and abnormal sperm morphology. TMC7 co-localizes with GM130, a cis-Golgi marker, in round spermatids. The absence of TMC7 results in reduced levels of Golgi proteins, elevated abundance of ER stress markers, as well as changes of Ca2+ and pH levels in the KO testis. However, further confirmation is required because the analyses were performed with whole testis samples in spite of the differences in the germ cell composition in WT and KO testis. In addition, the causal relationships between the reported anomalies await thorough interrogation.

      Strengths:

      By using PD21 testes, the revised assays have consolidated that depletion of TMC7 leads to a reduced level of Ca2+ and an elevated level of ROS in the male germ cells. The immunohistochemistry analyses have clearly indicated the reduced abundance of GM130, P115, and GRASP65 in the knockout testis.

      Weaknesses:

      The Discussion section contains sentences reiterating the Introduction and Results of this manuscript (e.g., Lines 79-85 and 231-236; Lines 175-179 and 259-263). These read as repetitive and can be removed.

      Future studies are required to decipher how TMC7 stabilizes Golgi structure, coordinates vesicle transport, and maintains the germ cell homeostasis.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      TMC7 knockout mice were generated by the authors and the phenotype was analyzed. They found that Tmc7 is localized to Golgi and is needed for acrosome biogenesis.

      Strengths:

      The phenotype of infertility is clear, and the results of TMC7 localization and the failed acrosome formation are highly reliable. In this respect, they made a significant discovery regarding spermatogenesis.

      Weaknesses:

      There are also some concerns, which are mainly related to the molecular function of TMC7 and Figure 5.

      (1) It is understandable that TMC7 exhibits some channel activity in the Golgi and somehow affects luminal pH or Ca2+, leading to the failure of acrosome formation. On the other hand, since they are conducting the pH and calcium imaging from the cytoplasm, I do not think that the effect of TMC7 channel function in Golgi is detectable with their methods.

      We agree with the reviewer that there is no direct evidence showing the effect of TMC7 channel function in Golgi. We have changed the description in the revised manuscript.

      (2) Rather, it is more likely that they are detecting apoptotic cells that have no longer normal ion homeostasis.

      We thank the reviewer for raising this concern. We apologize for not labeling the postnatal stage in original Figure 5. We measured intracellular Ca2+, pH and ROS in PD30 testes (revised Fig. S6a-c); no apoptotic cells were observed at this stage (revised Fig. S6e, f). Apoptotic cells were found in the seminiferous tubules and cauda epididymis of 9-week-old Tmc7–/– mice (revised Fig. 5e-f). We have included TUNEL data from testes of PD21, PD30 and 9-week-old mice (revised Fig. 5e, f and Fig. S6e, f). In accordance with our findings, Tmc1 mutation has also been shown to result in reduced Ca2+ permeability, thus triggering hair cell apoptosis (Fettiplace, R, PNAS. 2022) [1].

      (3) Another concern is that n is only 3 for these imaging experiments.

      As suggested by the reviewer, more replicates were included in imaging experiments.

      Reviewer #2 (Public Review):

      Summary:

      This study presents a significant finding that enhances our understanding of spermatogenesis. TMC7 belongs to a family of transmembrane channel-like proteins (TMC1-8), primarily known for their role in the ear. Mutations to TMC1/2 are linked to deafness in humans and mice and were originally characterized as auditory mechanosensitive ion channels. However, the function of the other TMC family members remains poorly characterized. In this study, the authors begin to elucidate the function of TMC7 in acrosome biogenesis during spermatogenesis. Through analysis of transcriptomics datasets, they identify TMC7 as a transmembrane channel-like protein with elevated transcript levels in round spermatids in both mouse and human testis. They then generate Tmc7-/- mice and find that male mice exhibit smaller testes and complete infertility. Examination of different developmental stages reveals spermatogenesis defects, including reduced sperm count, elongated spermatids, and large vacuoles. Additionally, abnormal acrosome morphology is observed beginning at the early-stage Golgi phase, indicating TMC7's involvement in proacrosomal vesicle trafficking and fusion. They observed localization of TMC7 in the cis-Golgi and suggest that its presence is required for maintaining Golgi integrity, with Tmc7-/- leading to reduced intracellular Ca2+, elevated pH, and increased ROS levels, likely resulting in spermatid apoptosis. Overall, the work delineates a new function of TMC7 in spermatogenesis and the authors suggest that its ion channel activity is likely important for Golgi homeostasis. This work is of significant interest to the community and is of high quality.

      Strengths:

      The biggest strength of the paper is the phenotypic characterization of the TMC7-/- mouse model, which has clear acrosome biogenesis/spermatogenesis defects. This is the main claim of the paper and it is supported by the data that are presented.

      Weaknesses:

      The claim is that TMC7 functions as an ion channel. It is reasonable to assume this given what has been previously published on the more well-characterized TMCs (TMC1/2), but the data supporting this is preliminary here, and more needs to be done to solidify this hypothesis. The authors are careful in their interpretation and present this merely as a hypothesis supporting this idea.

      We appreciate the insightful comment. It is indeed a limitation of our study that we lack strong evidence to support that TMC7 functions as an ion channel. We planned to conduct cellular electrophysiology in GC-1 cells heterologously expressing TMC7. However, TMC7 was trapped in the endoplasmic reticulum like TMC1 and TMC2 (Yu X, PNAS. 2020)[2], and failed to localize to the Golgi. According to the reviewer’s suggestion, we have provided a more careful and detailed interpretation of the molecular function of TMC7 in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      In this study, Wang et al. have demonstrated that TMC7, a testis-enriched multipass transmembrane protein, is essential for male reproduction in mice. Tmc7 KO male mice are sterile due to reduced sperm count and abnormal sperm morphology. TMC7 co-localizes with GM130, a cis-Golgi marker, in round spermatids. The absence of TMC7 results in reduced levels of Golgi proteins, elevated abundance of ER stress markers, as well as changes of Ca2+ and pH levels in the KO testis. However, further confirmation is required because the analyses were performed with whole testis samples in spite of the differences in the germ cell composition in WT and KO testis. In addition, the causal relationships between the reported anomalies await thorough interrogation.

      Strengths:

      The microscopic images are of great quality, all figures are properly arranged, and the entire manuscript is very easy to follow.

      Weaknesses:

      (1) Tmc7 KO male mice show multiple anomalies in sperm production and morphogenesis, such as reduced sperm count, abnormal sperm head, and deformed midpiece. Thus, it is confusing that the authors focused solely on impaired acrosome biogenesis.

      We are grateful for your comments and suggestions. We agree and have added these defects in spermiogenesis of Tmc7–/– mice to the abstract and discussion sections of the revised manuscript.

      (2) Further investigations are warranted to determine whether the abnormalities reported in this manuscript (e.g., changes in protein, Ca2+, and pH levels) are directly associated with the molecular function of TMC7 or are the byproducts of partially arrested spermiogenesis. Please find additional comments in "Recommendations for the authors".

      Thank you for raising this concern. Per your comments, we have included data of intracellular Ca2+, pH and ROS in PD21 testes. The intracellular homeostasis was impaired as early as PD21, indicating TMC7 depletion impairs cellular homeostasis which in turn results in arrested spermiogenesis.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      As noted by all three reviewers, current flow cytometry data does not necessarily support the 'ion channel' hypothesis, thus the phenotypic analysis is compelling but the molecular mechanism of how TMC7 facilitates acrosome biogenesis remains incomplete. It is highly recommended for the authors to at least discuss or test alternative hypotheses (as reviewer #2 suggested) such as the possibility of acting as 'lipid scramblase'. Also, the authors need to provide further explanation for other morphological defects if TMC7 is truly a functional ion channel in Golgi (and thus later at acrosome), which is also related to the key question of whether TMC7 is a functional ion channel.

      We thank the reviewing editor for the comments and suggestions. We agree that our study lacks strong evidence to support that TMC7 functions as an ion channel. We have discussed the possibility of TMC7 acting as 'lipid scramblase' as suggested. We have also included data of intracellular Ca2+, pH and ROS in PD21 and PD30 testes.

      Indeed, Tmc7–/– mice exhibit other defects, including abnormal head morphology and disorganized mitochondrial sheaths. TMC7 is localized to the cis-Golgi apparatus and is required for maintaining Golgi integrity. Previous studies of Golgi-localized proteins, including GOPC (Yao R, PNAS. 2002)[2], HRB (Kang-Decker N. Science. 2001)[3] and PICK1 (Xiao N, JCI. 2009)[4], reported spermiogenesis defects similar to those in Tmc7–/– mice. It is possible that the morphological defects in Tmc7–/– mice are due to impaired Golgi function.

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors should provide more details about the imaging experiments using FACS. Since they only describe catalog numbers (Beyotime, S1056, S1006, S0033S) for imaging reagents, it is not immediately clear what reagents they actually used. Since they used Fluo3, BCECF, and DCFH, it would be better to mention their names.

      Thanks. We have provided more detailed antibody information as suggested.

      (2) I am also concerned that in the FACS there is no information at all about laser wavelength and filter properties. This is especially important for BCECF because the wavelength spectrum changes with pH. Also, if there are any positive controls for these imaging reagents, such as ionophores, it would be more convincing to include them.

      Thank you for your comment. The excitation wavelength was 488 nm for detecting Ca2+, pH and ROS in FACS. BCECF is the most popular pH probe to monitor cellular pH and the reagent from Beyotime (S1006) has been used by other studies (Chen S, Blood. 2016)[5], (Liu H, Cell Death Dis. 2022)[6]. To make the results more reliable, we have repeated these experiments in PD21 testes (revised Figure 5a-c). No positive controls for these reagents were used in our experiments.

      (3) As noted above, it is better to avoid directly linking the cell's abnormal ion homeostasis to TMC7 ion channel function in the text. The discussion should be changed to emphasize that the TMC7-deficient cells are apoptotic and that these physiological phenomena are occurring as a side effect of this apoptosis.

      Thank you for raising this concern. We agree with the reviewer that there is no direct evidence showing the effect of TMC7 channel function in Golgi and we have changed the description in the revised manuscript.

      We performed a new experiment to measure apoptosis and intracellular Ca2+, pH and ROS in PD21 testes. No apoptotic cells were observed at this stage. However, impaired cellular homeostasis was still found in testis of PD21 Tmc7-/- mice. These data suggest that TMC7 depletion impairs cellular homeostasis and hence induces spermatid apoptosis.

      (4) While I understand that it appears to be difficult to experimentally verify the ion channel function of TMC7, it may be supportive to compare its amino acid sequence and/or 3D predicted structure with that of TMC1/2. Including a supplemental figure for this purpose would emphasize the possibility that TMC7 functions as an ion channel.

      We thank the reviewer for making this great suggestion. We compared the amino acid sequence and structure of TMC1 and TMC2 with those of TMC7, respectively. TMC1 had 81% sequence similarity with TMC7 and the RMSD (Root Mean Square Deviation) was 3.079. TMC2 had 82% sequence similarity with TMC7, and the RMSD was 2.176. These data suggest that TMC7 has an amino acid sequence and predicted structure similar to those of TMC1/2 and might function as an ion channel. We have included the predicted structures in revised Fig. S7.

      Author response image 1.

      Reviewer #2 (Recommendations For The Authors):

      I do not have any experimental comments or concerns to address, but I do ask that the authors consider an alternative hypothesis. Based on prior data demonstrating that TMC1 is a mechanosensitive ion channel, the authors reasonably assume that TMC7 may also function as an ion channel. Although the authors observe alterations in cytosolic Ca2+ and pH upon loss of TMC7 by flow cytometry, which begins to support this hypothesis, these data do not directly demonstrate ion channel activity.

      I was wondering if the authors had considered whether TMC7 could also function as a lipid scramblase. TMC1 has also been proposed to function as a Ca2+-inhibited scramblase, where knockout of TMC1 leads to a loss of phosphatidylserine (PS) exposure and membrane blebbing at the apical region of hair cells (Ballesteros, A. and Swartz, K., Science Advances, 2022). Furthermore, TMC proteins are structurally related to the Anoctamin/TMEM16 family of chloride channels and lipid scramblases, where TMEM16A-B are bona fide Ca2+-activated chloride channels, and TMEM16C-H are characterized as Ca2+-dependent scramblases. Based on their structural similarity and the observation that TMC1 may also exhibit lipid scrambling properties based on the PS exposure, I wonder if the authors may have data that support a TMC7 scramblase hypothesis. I was intrigued by this idea, especially given the authors' observations of large vacuoles in the seminiferous tubules and cauda epididymis and the vesicle accumulation phenotype in their TEM data. Incorporating this hypothesis into the discussion section, at minimum, could provide a valuable perspective, and this line of thought may lead to interesting data interpretation throughout the paper.

      We thank the reviewer for the valuable suggestion. We have discussed the possibility of TMC7 acting as 'lipid scramblase' as suggested.

      Reviewer #3 (Recommendations For The Authors):

      (1) Gene symbols should be italicized, and protein symbols should be capitalized.

      Thanks. We have made changes to the manuscript as recommended.

      (2) Tmc7 KO males show reduced sperm count, which alters the germ cell composition in the testis (Figure 2g). Thus, it is inappropriate to compare protein levels using whole testis lysates (Figure 3e, 4h, 5d, 5f). Instead, the same immunoblotting analyses could be done with purified round spermatids or 3-wk-old testis. Likewise, the significance of the intracellular Ca2+ and pH measurements is potentially diminished by the differences in the germ cell composition in WT and KO mice.

      We appreciate this constructive suggestion. We agree with the reviewer that whole testis lysates diminished the differences between WT and Tmc7-/- mice. However, we are unable to purify round spermatids due to the lack of specific markers.

      (3) Figures 2i, 2j: How sperm motility was measured should be specified in the Methods.

      Thank you for this important reminder. We have added the sperm motility assessment to the Methods section.

      (4) Figure 4g: It does not make sense to compare the fluorescence intensity of these proteins without making sure that the seminiferous tubules are in the same stage. As shown in Figures S5a and S5b, TMC7 exhibits varied abundance in spermatids at different steps.

      We thank the reviewer for the insightful comment. We have replaced the images with ones from seminiferous tubules at the same stage and compared the fluorescence intensity of the new images as suggested.

      (5) Figure 4h: How were the band intensities measured? The third band from the left is visually stronger than the first one, but it does not seem to be so according to the column graph. The reviewer measured the intensity of GRASP65 bands relative to alpha-tubulin by ImageJ and obtained relative intensities of 0.35, 0.87, 0.6, and 0.08 for the bands from left to right. Additional replicates of the western blots should be included in the supplementary figures.

      Thank you for this insightful comment. The density and size of the blots were quantified by ImageJ. We have checked the first band from the left of GRASP65 and it seems that the protein was not fully transferred onto the PVDF membrane. We have performed new experiments and replaced the original bands (Revised Fig. 4h). Additional replicates of the western blots have been included in revised Fig. S8.

      (6) Figures 5a, 5b: Based on the observation of abnormal intracellular Ca2+ and pH levels in the KO germ cells, the authors concluded that TMC7 maintains the homeostasis of Golgi pH and ion (Lines 223-224, 263-264). However, intracellular Ca2+ and pH levels do not directly reflect those in the Golgi apparatus.

      We thank the reviewer for this important comment. We agree and have changed “Golgi” to “intracellular” as suggested.

      (7) Figure 5c: ROS is produced during apoptosis. Thus, it is not appropriate to conclude that the increased ROS levels in Tmc7 KO germ cells lead to apoptosis.

      According to the reviewer’s comment, we measured ROS and apoptosis in testis of PD21 and PD30 mice. ROS levels were increased, but no apoptotic cells were observed in testis of PD21 and PD30 Tmc7–/– mice. Apoptotic cells were observed in testis of 9-week-old Tmc7–/– mice (Revised Fig. 5e-f). These data suggest that TMC7 depletion results in the accumulation of ROS, thereby leading to apoptosis.

      (1) Fettiplace, R., D.N. Furness, and M. Beurg, The conductance and organization of the TMC1-containing mechanotransducer channel complex in auditory hair cells. Proc Natl Acad Sci U S A, 2022. 119(41): p. e2210849119.

      (2) Yu, X., et al., Deafness mutation D572N of TMC1 destabilizes TMC1 expression by disrupting LHFPL5 binding. Proc Natl Acad Sci U S A, 2020. 117(47): p. 29894-29903.

      (3) Kang-Decker, N., et al., Lack of acrosome formation in Hrb-deficient mice. Science, 2001. 294(5546): p. 1531-3.

      (4) Xiao, N., et al., PICK1 deficiency causes male infertility in mice by disrupting acrosome formation. J Clin Invest, 2009. 119(4): p. 802-12.

      (5) Chen, S., et al., Sympathetic stimulation facilitates thrombopoiesis by promoting megakaryocyte adhesion, migration, and proplatelet formation. Blood, 2016. 127(8): p. 1024-35.

      (6) Liu, H., et al., PRMT5 critically mediates TMAO-induced inflammatory response in vascular smooth muscle cells. Cell Death Dis, 2022. 13(4): p. 299.

    1. eLife assessment

      This manuscript reports valuable findings on the role of the Srs2 protein in turning off the DNA damage signaling response initiated by Mec1 (human ATR) kinase. The data provide solid evidence that Srs2 interaction with PCNA and ensuing SUMO modification is required for checkpoint downregulation. However, experimental evidence with regard to the model that Srs2 acts at gaps after camptothecin-induced DNA damage is currently lacking. The work will be of interest to cell biologists studying genome integrity but would be strengthened by considering the possible role of Rad51 and its removal.

    2. Reviewer #1 (Public Review):

      Overall, the data presented in this manuscript is of good quality. Understanding how cells control RPA loading on ssDNA is crucial to understanding DNA damage responses and genome maintenance mechanisms. The authors used genetic approaches to show that disrupting PCNA binding and SUMOylation of Srs2 can rescue the CPT sensitivity of rfa1 mutants with reduced affinity for ssDNA. In addition, the authors find that SUMOylation of Srs2 depends on binding to PCNA and the presence of Mec1.

      Noted weaknesses include the lack of evidence supporting that Srs2 binding to PCNA and its SUMOylation occur at ssDNA gaps, as proposed by the authors. Also, the mutants of Srs2 with impaired binding to PCNA or impaired SUMOylation showed no clear defects in checkpoint dampening, and in some contexts, even resulted in decreased Rad53 activation. Therefore, key parts of the paper would benefit from further experimentation and/or clarification.

      Major Comments

      (1) The central model proposed by the authors relies on the loading of PCNA at the 3' junction of an ssDNA gap, which then mediates Srs2 recruitment and RPA removal. While several aspects of the model are consistent with the data, the evidence that it is occurring at ssDNA gaps is not strong. The experiments mainly used CPT, which generates mostly DSBs. The few experiments using MMS, which mostly generates ssDNA gaps, show that Srs2 mutants lead to weaker rescue in this context (Figure S1). How do the authors explain this discrepancy? In the context of DSBs, are the authors proposing that Srs2 is engaging at later steps of HR-driven DSB repair where PCNA gets loaded to promote fill-in synthesis? If so, is RPA removal at that step important for checkpoint dampening? These issues need to be addressed and the final model adjusted.

      (2) The data in Figure 3 showing that Srs2 mutants reduce Rad53 activation in the rfa1-zm2 mutant are confusing, especially given the claim of an anti-checkpoint function for Srs2 (in which case Srs2 mutants should result in increased Rad53 activation). The authors propose that Rad53 is hyperactivated in rfa1-zm2 mutant because of compromised ssDNA protection and consequential DNA lesions, however, the effects sharply contrast with the central model. Are the authors proposing that in the rfa1-zm2 mutant, the compromised protection of ssDNA supersedes the checkpoint-dampening effect? Perhaps a schematic should be included in Figure 3 to depict these complexities and help the reader. The schematic could also include the compensatory dampening mechanisms like Slx4 (on that note, why not move Figure S2 to a main figure?... and even expand experiments to better characterize the compensatory mechanisms, which seem important to help understand the lack of checkpoint dampening effect in the Srs2 mutants)

      (3) The authors should demarcate the region used for quantifying the G1 population in Figure 3B and explain the following discrepancy: By inspection of the cell cycle graph, all mutants have lower G1 peak height compared to WT (CPT 2h). However, in the quantification bar graph at the bottom, ΔPIM has higher G1 population than the WT.

    3. Reviewer #2 (Public Review):

      Summary:

      This is an interesting paper that delves into the post-translational modifications of the yeast Srs2 helicase and proteins with which it interacts in coping with DNA damage. The authors use mutants in some interaction domains with RPA and Srs2 to argue for a model in which there is a balance between RPA binding to ssDNA and Srs2's removal of RPA. The idea that a checkpoint is being regulated is based on observing Rad53 and Rad9 phosphorylation (so there are the attributes of a checkpoint), but evidence of cell cycle arrest is lacking. The only apparent delay in the cell cycle is the re-entry into the second S phase (but it could be an exit from G2/M); but in any case, the wild-type cells enter the next cell cycle most rapidly. No direct measurement of RPA residence is presented.

      Strengths:

      Data concern viability assays in the presence of camptothecin and in the post-translational modifications of Srs2 and other proteins.

      Weaknesses:

      There are a couple of overriding questions about the results, which appear technically excellent. Clearly, there is an Srs2-dependent repair process here, in the presence of camptothecin, but is it a consequence of replication fork stalling or chromosome breakage? Is repair Rad51-dependent, and if so, is Srs2 displacing RPA or removing Rad51 or both? If RPA is removed quickly what takes its place, and will the removal of RPA result in lower DDC1-MEC1 signaling?

      Moreover, It is worth noting that in single-strand annealing, which is ostensibly Rad51 independent, a defect in completing repair and assuring viability is Srs2-dependent, but this defect is suppressed by deleting Rad51. Does deleting Rad51 have an effect here?

      Neither this paper nor the preceding one makes clear what really is the consequence of having a weaker-binding Rfa1 mutant. Is DSB repair altered? Neither CPT nor MMS are necessarily good substitutes for some true DSB assay.

      With camptothecin, in the absence of site-specific damage, it is difficult to test these questions directly. (Perhaps there is a way to assess the total amount of RPA bound, but ongoing replication may obscure such a measurement). It should be possible to assess how CPT treatment in various genetic backgrounds affects the duration of Mec1/Rad53-dependent checkpoint arrest, but more than a FACS profile would be required.

      It is also notable that MMS treatment does not seem to yield similar results (Fig. S1).

    4. Reviewer #3 (Public Review):

      The superfamily I 3'-5' DNA helicase Srs2 is well known for its role as an anti-recombinase, stripping Rad51 from ssDNA, as well as an anti-crossover factor, dissociating extended D-loops and favoring non-crossover outcome during recombination. In addition, Srs2 plays a key role in ribonucleotide excision repair. Besides DNA repair defects, srs2 mutants also show a reduced recovery after DNA damage that is related to its role in downregulating the DNA damage signaling or checkpoint response. Recent work from the Zhao laboratory (PMID: 33602817) identified a role of Srs2 in downregulating the DNA damage signaling response by removing RPA from ssDNA. This manuscript reports further mechanistic insights into the signaling downregulation function of Srs2.

      Using the genetic interaction with mutations in RPA1, mainly rfa1-zm2, the authors test a panel of mutations in Srs2 that affect CDK sites (srs2-7AV), potential Mec1 sites (srs2-2SA), known sumoylation sites (srs2-3KR), Rad51 binding (delta 875-902), PCNA interaction (delta 1159-1163), and SUMO interaction (srs2-SIMmut). All mutants were generated by genomic replacement and the expression level of the mutant proteins was found to be unchanged. This alleviates some concern about the use of deletion mutants compared to point mutations. The double mutant analysis identified that PCNA interaction and SUMO sites were required for the Srs2 checkpoint dampening function, at least in the context of the rfa1-zm2 mutant. There was no effect of these mutants in a RFA1 wild-type background. This latter result is likely explained by the activity of the parallel pathway of checkpoint dampening mediated by Slx4, and genetic data with an Slx4 point mutation affecting Rtt107 interaction and checkpoint downregulation support this notion. Further analysis of Srs2 sumoylation showed that Srs2 sumoylation depended on PCNA interaction, suggesting sequential events of Srs2 recruitment by PCNA and subsequent sumoylation. Kinetic analysis showed that sumoylation peaks after maximal Mec1 induction by DNA damage (using the Top1 poison camptothecin (CPT)) and depended on Mec1. These data are consistent with a model that Mec1 hyperactivation is ultimately leading to signaling downregulation by Srs2 through Srs2 sumoylation. Mec1-S1964 phosphorylation, a marker for Mec1 hyperactivation and a site found to be needed for checkpoint downregulation after DSB induction did not appear to be involved in checkpoint downregulation after CPT damage. The data are in support of the model that Mec1 hyperactivation when targeted to RPA-covered ssDNA by its Ddc2 (human ATRIP) targeting factor, favors Srs2 sumoylation after Srs2 recruitment to PCNA to disrupt the RPA-Ddc2-Mec1 signaling complex. Presumably, this allows gap filling and disappearance of long-lived ssDNA as the initiator of checkpoint signaling, although the study does not extend to this step.

      Strengths

      (1) The manuscript focuses on the novel function of Srs2 to downregulate the DNA damage signaling response and provide new mechanistic insights.

      (2) The conclusions that PCNA interaction and ensuing Srs2-sumoylation are involved in checkpoint downregulation are well supported by the data.

      Weaknesses

      (1) Additional mutants of interest could have been tested, such as the recently reported Pin mutant, srs2-Y775A (PMID: 38065943), and the Rad51 interaction point mutant, srs2-F891A (PMID: 31142613).

      (2) The use of deletion mutants for PCNA and RAD51 interaction is inferior to using specific point mutants, as done for the SUMO interaction and the sites for post-translational modifications.

      (3) Figure 4D and Figure 5A report data with standard deviations, which is unusual for n=2. Maybe the individual data points could be plotted with a color for each independent experiment to allow the reader to evaluate the reproducibility of the results.
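
      The concern about standard deviations at n=2 can be made concrete. For two values, the sample standard deviation is simply |a − b|/√2, so an error bar conveys nothing beyond the two points themselves. The short Python sketch below checks this identity; the pair of values is hypothetical and is not taken from the manuscript.

```python
import math
import statistics


def sample_sd_of_pair(a, b):
    """Closed form of the sample standard deviation (n - 1 denominator) for n = 2."""
    return abs(a - b) / math.sqrt(2)


# Hypothetical pair of replicate measurements (arbitrary units).
pair = (1.0, 3.0)

# Library computation of the sample standard deviation for the same pair.
sd_from_library = statistics.stdev(pair)

# Closed-form value; it depends only on the gap between the two points.
sd_from_formula = sample_sd_of_pair(*pair)
```

      Because the two quantities agree, plotting the two individual points per condition shows the reader everything the error bar would, plus which experiment each point came from.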

    5. Author response:

      eLife assessment:

      This manuscript reports valuable findings on the role of the Srs2 protein in turning off the DNA damage signaling response initiated by Mec1 (human ATR) kinase. The data provide solid evidence that Srs2 interaction with PCNA and ensuing SUMO modification is required for checkpoint downregulation. However, experimental evidence with regard to the model that Srs2 acts at gaps after camptothecin-induced DNA damage is currently lacking. The work will be of interest to cell biologists studying genome integrity but would be strengthened by considering the possible role of Rad51 and its removal. 

      We thank the editors and the reviewers for their evaluation and helpful comments. As detailed below, we plan to adjust the writing and figures to address the points raised by the reviewers. We believe that these changes will improve the clarity of the work. Below is a summary of our plan to address the two main criticisms.

      (1) Regarding the sites of Srs2 action, our data support the conclusion that Srs2 removal of RPA is favored at a subset of ssDNA regions that have proximal PCNA, but not at sites lacking PCNA. A logical supposition for the former type of ssDNA region includes ssDNA gaps and tails generated during DNA repair or replication, wherein PCNA can be loaded at the ssDNA-dsDNA junction with a 3’ DNA end. Examples of the latter type of ssDNA region, without proximal PCNA, can form within negatively supercoiled regions or intact R-loops, both of which lack a 3’ DNA end for PCNA loading. While we have stated this conclusion in the text, we highlighted ssDNA gaps as sites of Srs2 action in the Discussion and in the model figure, which could be misleading. We will clarify our model, that is, Srs2 distinguishes among different types of ssDNA regions using PCNA proximity as a guide for RPA removal, and state that the precise nature of Srs2 action sites remains to be determined. Regardless, the feature of Srs2 revealed in this work provides a rationale for how it can remove RPA at subsets of ssDNA regions without unnecessary stripping of RPA at other sites.

      (2) While Rad51 removal is an important facet of Srs2 functions, it is not relevant to our current study based on the following observations and rationales.

      First, we have provided several lines of evidence to support the conclusion that Rad51 removal by Srs2 is separable from the Srs2-RPA antagonism (Dhingra et al., 2021). For example, while rad51∆ rescues the hyper-recombination phenotype of srs2∆ cells, it does not affect the hyper-checkpoint phenotype of srs2∆. Strikingly, rfa1-zm1/zm2 have the opposite effect. The differential effects of rad51∆ and rfa1-zm1/zm2 were also seen for the srs2 ATPase-dead allele (srs2-K41A). For example, rfa1-zm2 rescued the hyper-checkpoint defect and the CPT sensitivity of srs2-K41A, while rad51∆ had neither effect.

      These and other data described in Dhingra et al suggest that Srs2’s effects on checkpoint vs. recombination are separable and that the Srs2-RPA antagonism during the DNA damage checkpoint is independent of Rad51.

      Second, our current work addresses which Srs2 features affect the Srs2-RPA antagonism during the DNA damage response and its implications. Given that this antagonism is separable from Srs2 removal of Rad51, including Rad51 regulation would distract from the main points of this work.

      Third, in the current work, we began by examining all known regulatory and protein-protein interaction features of Srs2, including the Rad51 binding domain. Consistent with our conclusion summarized above based on the Dhingra et al. study, deleting the Rad51 binding domain in Srs2 (srs2-∆Rad51BD) has no effect on the rfa1-zm2 phenotype in CPT (Figure 2D). This is in sharp contrast to mutating the PCNA binding and the sumoylation sites of Srs2, which suppressed the CPT sensitivity and checkpoint abnormalities of rfa1-zm2 (Figure 2C). These data provide further evidence that Srs2 regulation of Rad51 is separable from the Srs2-RPA antagonism.

      In summary, our work provides a foundation for future examination of how Srs2 regulates RPA and Rad51 in different manners, how these two facets of Srs2 function affect genome integrity in different capacities, and whether there is crosstalk between them during certain DNA metabolism processes.

      Public Reviews:

      Reviewer #1:

      Overall, the data presented in this manuscript is of good quality. Understanding how cells control RPA loading on ssDNA is crucial to understanding DNA damage responses and genome maintenance mechanisms. The authors used genetic approaches to show that disrupting PCNA binding and SUMOylation of Srs2 can rescue the CPT sensitivity of rfa1 mutants with reduced affinity for ssDNA. In addition, the authors find that SUMOylation of Srs2 depends on binding to PCNA and the presence of Mec1. Noted weaknesses include the lack of evidence supporting that Srs2 binding to PCNA and its SUMOylation occur at ssDNA gaps, as proposed by the authors. Also, the mutants of Srs2 with impaired binding to PCNA or impaired SUMOylation showed no clear defects in checkpoint dampening, and in some contexts, even resulted in decreased Rad53 activation. Therefore, key parts of the paper would benefit from further experimentation and/or clarification.  

      We thank the reviewer for the positive comments on this work and address her/his remark regarding ssDNA gaps below in Major Comment #1. In addition, we detail below our data and rationale for suggesting that the checkpoint dampening phenotype of srs2-∆PIM and -3KR (deficient for PCNA binding and sumoylation, respectively) is masked by redundant pathways. We further describe our plan to enhance the clarity of both text and model to address these points from the reviewer.

      Major Comments 

      (1) The central model proposed by the authors relies on the loading of PCNA at the 3' junction of an ssDNA gap, which then mediates Srs2 recruitment and RPA removal. While several aspects of the model are consistent with the data, the evidence that it is occurring at ssDNA gaps is not strong. The experiments mainly used CPT, which generates mostly DSBs. The few experiments using MMS, which mostly generates ssDNA gaps, show that Srs2 mutants lead to weaker rescue in this context (Figure S1). How do the authors explain this discrepancy? In the context of DSBs, are the authors proposing that Srs2 is engaging at later steps of HR-driven DSB repair where PCNA gets loaded to promote fill-in synthesis? If so, is RPA removal at that step important for checkpoint dampening? These issues need to be addressed and the final model adjusted.

      We appreciate the reviewer’s concern. Our conclusion is that Srs2 can be guided by PCNA to a subset of ssDNA regions for RPA removal, and that this Srs2 action is not favored at ssDNA regions with no proximal PCNA. It is important to note that CPT can produce both types of ssDNA regions. Besides ssDNA generated via DSB-associated recombinational repair, CPT can also lead to ssDNA gap formation upon excision repair and DNA-protein crosslink repair of trapped Top1 (Sun et al., 2020). ssDNA regions generated during these DNA repair processes often contain a 3’ DNA end for PCNA loading, and thus they can favor Srs2 removal of RPA. Another facet of CPT’s effects (besides DNA lesions) is depletion of the functional pool of Top1, which causes topological stress and consequently increased levels of DNA supercoiling and R-loops (Koster et al., 2007, Petermann et al., 2022). ssDNA formed within negatively supercoiled regions and in R-loops lacks a 3’ DNA end unless it is cleaved by nucleases, so these sites would be disfavored for Srs2 removal of RPA due to the lack of PCNA loading. Our conclusion that ssDNA regions with nearby PCNA are preferred sites for Srs2 action provides a rationale for how Srs2 can remove RPA at certain ssDNA regions while minimizing unnecessary stripping of RPA from other sites.

      We will clarify in the Discussion that CPT can generate two types of ssDNA regions as stated above, and that Srs2 could distinguish between them using PCNA proximity as a guide for RPA removal. While this conclusion was described in the text, we emphasized the ssDNA gap as a Srs2 action site in the model. We will clarify that while this is a logical supposition, other types of ssDNA with proximal PCNA could also be targeted by Srs2, and that our work paves the way to determine the precise nature of the ssDNA regions at which Srs2 acts.

      The reasons for the less potent growth suppression of rfa1 mutants by srs2 alleles in MMS compared with CPT are unclear, but multiple possibilities should be considered, given that MMS and CPT affect checkpoint responses differently and that RPA and Srs2 affect growth in multiple ways. For example, while CPT only activates the DNA damage checkpoint, MMS additionally induces the DNA replication checkpoint (Menin et al., 2018, Redon et al., 2003). It is thus possible that the Srs2-RPA antagonism is relatively more important for the DNA damage checkpoint than for the DNA replication checkpoint. Further investigation of this possibility, among others, will shed light on the differential suppressive effects seen in this work. We will include this discussion in the revised text.

      (2) The data in Figure 3 showing that Srs2 mutants reduce Rad53 activation in the rfa1-zm2 mutant are confusing, especially given the claim of an anti-checkpoint function for Srs2 (in which case Srs2 mutants should result in increased Rad53 activation). The authors propose that Rad53 is hyperactivated in rfa1-zm2 mutant because of compromised ssDNA protection and consequential DNA lesions, however, the effects sharply contrast with the central model. Are the authors proposing that in the rfa1-zm2 mutant, the compromised protection of ssDNA supersedes the checkpoint-dampening effect? Perhaps a schematic should be included in Figure 3 to depict these complexities and help the reader. The schematic could also include the compensatory dampening mechanisms like Slx4 (on that note, why not move Figure S2 to a main figure?... and even expand experiments to better characterize the compensatory mechanisms, which seem important to help understand the lack of checkpoint dampening effect in the Srs2 mutants) 

      Genetic interactions that involve partially defective alleles, multi-functional proteins, and redundant pathways are complex to comprehend. For example, a phenotype seen for the null allele may not be seen for partially defective alleles. In the context of this study, while srs2 null increased Rad53 activation (Dhingra et al., 2021), srs2-∆PIM and -3KR did not (Figure 3A-3B). However, srs2-∆PIM enhanced Rad53 activation when combined with another checkpoint dampening mutant, slx4RIM, suggesting that defects of srs2-∆PIM can be compensated by Slx4 (Figure S2). Importantly, srs2-∆PIM and -3KR rescued rfa1-zm2’s checkpoint abnormality (Figure 3A-3B), suggesting that Srs2 binding to PCNA and its sumoylation contribute to the Srs2-RPA antagonism in the DNA damage checkpoint response.

      A partially defective allele that impairs a specific function of a protein can be a powerful genetic tool even when it lacks a particular phenotype on its own. For example, a partially defective allele of the checkpoint protein Rad9 impairing its binding to gamma-H2A (rad9-K1088M) does not affect the G2/M checkpoint nor cause DNA damage sensitivity due to the compensation of other checkpoint factors (Hammet et al., 2007); however, rad9-K1088M rescues the DNA damage sensitivity and persistent G2/M checkpoint of rtt107 and slx4 mutants, providing one line of evidence supporting a role of the Slx4-Rtt107 axis in removal of Rad9 from chromatin (via competing with Rad9 for gamma-H2A binding) (Ohouo et al., 2013).

      In order to highlight the checkpoint recovery process, the model in Figure 6 did not depict another consequence of the Srs2-RPA antagonism. In the presence of Srs2, DNA binding rfa1 mutants can lead to increased levels of DNA lesions and checkpoint signaling, and these defects are rescued by lessening Srs2’s ability to strip RPA from DNA (Dhingra et al., 2021). We will modify the model in Figure 6 and its legend to clarify that the model depicts just one of the consequences of the Srs2-RPA antagonism, with a focus on checkpoint recovery. We will also state these points more clearly in the Discussion. Further, a new schematic in Figure 3, as suggested by the reviewer, will be added to outline the genetic relationship and interpretation. We will also follow the reviewer’s suggestion to move Figure S2 to the main figures. Better characterizing the compensatory mechanisms among different checkpoint dampening pathways is very interesting but requires a substantial amount of work. While it is beyond the scope of the current study, it could be pursued in the future.

      (3) The authors should demarcate the region used for quantifying the G1 population in Figure 3B and explain the following discrepancy: By inspection of the cell cycle graph, all mutants have lower G1 peak height compared to WT (CPT 2h). However, in the quantification bar graph at the bottom, ΔPIM has higher G1 population than the WT. 

      We have added a description of how the G1 region of the FACS histogram was selected to derive the percentage of G1 cells in Figure 3B. Briefly, for samples collected for a particular strain, the G1 region of the “G1 sample” was used to demarcate the G1 region of the “CPT 2h” sample. Upon re-checking the included FACS profiles, we realized that a mutant panel and its datapoint were mistakenly put in the place of the wild-type. We will correct this mistake. The conclusion remains that srs2-∆PIM and srs2-3KR improved rfa1-zm2 cells’ ability to exit G2/M, while they themselves do not show a difference from the wild-type control in the percentage of G1 cells after 2hr CPT treatment. We will add statistics in the figures to reflect this conclusion and adjust the order of strains shown in panels A and B to be consistent with each other.
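
      The gating rule described above (a G1 region defined on the “G1 sample” and then applied to the “CPT 2h” sample of the same strain) can be sketched in Python as follows. This is an illustration only, not the authors' analysis code. The percentile bounds, the function names, and the DNA-content values are all hypothetical assumptions.

```python
def gate_from_reference(reference, low_q=0.05, high_q=0.95):
    """Derive a (low, high) DNA-content gate from a G1-arrested reference sample.

    The 5th/95th percentile bounds are an assumed convention, not the authors'.
    """
    ordered = sorted(reference)
    last = len(ordered) - 1
    return ordered[int(low_q * last)], ordered[int(high_q * last)]


def percent_in_gate(values, gate):
    """Percentage of events whose DNA content falls inside the gate."""
    low, high = gate
    inside = sum(low <= value <= high for value in values)
    return 100.0 * inside / len(values)


# Hypothetical DNA-content readings (arbitrary units; 1C is about 1.0, 2C about 2.0).
g1_sample = [0.9, 0.95, 1.0, 1.0, 1.05, 1.1]
cpt_2h_sample = [1.0, 1.02, 1.5, 1.9, 2.0, 2.0, 2.1, 1.95]

# Define the gate on the G1 sample, then apply it unchanged to the CPT 2h sample.
g1_gate = gate_from_reference(g1_sample)
g1_percent = percent_in_gate(cpt_2h_sample, g1_gate)
```

      Defining the gate once per strain and reusing it keeps the G1 percentages comparable across time points, which is the property the demarcation request is after.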

      Reviewer #2:

      This is an interesting paper that delves into the post-translational modifications of the yeast Srs2 helicase and proteins with which it interacts in coping with DNA damage. The authors use mutants in some interaction domains with RPA and Srs2 to argue for a model in which there is a balance between RPA binding to ssDNA and Srs2's removal of RPA. The idea that a checkpoint is being regulated is based on observing Rad53 and Rad9 phosphorylation (so there are the attributes of a checkpoint), but evidence of cell cycle arrest is lacking. The only apparent delay in the cell cycle is the re-entry into the second S phase (but it could be an exit from G2/M); but in any case, the wild-type cells enter the next cell cycle most rapidly. No direct measurement of RPA residence is presented. 

      We thank the reviewer for the helpful comments. Previous studies have shown that CPT does not induce the DNA replication checkpoint, and thus it does not slow down or arrest S phase progression; however, CPT does induce the DNA damage checkpoint, which delays the re-entry of G2/M cells into the second cell cycle (Menin et al., 2018, Redon et al., 2003). Our result is consistent with previous findings, showing that CPT induces a G2/M delay but not arrest. We will adjust the text to make this point clearer.

      We have previously reported chromatin-bound RPA levels in rfa1-zm2, srs2, and their double mutants, as well as in vitro ssDNA binding by wild-type and mutant RPA complexes (Dhingra et al., 2021). We found that Srs2 loss or its ATPase-dead mutant led to a 4-6-fold increase of RPA levels on chromatin, which was rescued by rfa1-zm2 (Dhingra et al., 2021). On its own, rfa1-zm2 did not cause defective chromatin association in our assays, despite modestly reducing ssDNA binding in vitro (Dhingra et al., 2021). This discrepancy could be due to a lack of sensitivity of the chromatin fractionation assay in revealing moderate changes of RPA residence on DNA. Considering this, we decided to employ functional assays (Figure 2-3) that are more effective in identifying the Srs2 features pertaining to RPA regulation.

      Strengths:

      Data concern viability assays in the presence of camptothecin and in the post-translational modifications of Srs2 and other proteins.

      Weaknesses:

      There are a couple of overriding questions about the results, which appear technically excellent. Clearly, there is an Srs2-dependent repair process here, in the presence of camptothecin, but is it a consequence of replication fork stalling or chromosome breakage? Is repair Rad51-dependent, and if so, is Srs2 displacing RPA or removing Rad51 or both? If RPA is removed quickly what takes its place, and will the removal of RPA result in lower DDC1-MEC1 signaling? 

      While Srs2 can affect both the checkpoint response and DNA repair in CPT conditions, the rfa1-zm2 allele, which affects the former but not the latter role of Srs2, allows us to gain a deeper understanding of the former role (Dhingra et al., 2021). This role also appears to be critical for cell survival in CPT, since srs2∆ growth on CPT-containing media was greatly improved by rfa1-zm mutants (Dhingra et al., 2021). Building on this understanding, our current study identified two Srs2 features that could afford spatial and temporal regulation of RPA removal from DNA, thus providing a rationale for how cells can properly utilize this beneficial yet also dangerous activity. Study of Srs2-mediated repair in CPT conditions, in either a Rad51-dependent or -independent manner, before and after replication forks stall or DNA breaks, will require substantial effort and can be pursued in the future. We will add this point to the revised manuscript.

      Moreover, it is worth noting that in single-strand annealing, which is ostensibly Rad51 independent, a defect in completing repair and assuring viability is Srs2-dependent, but this defect is suppressed by deleting Rad51. Does deleting Rad51 have an effect here? 

      We have shown in our previous paper that rad51∆ did not rescue the hyper-checkpoint phenotype of srs2∆ cells in CPT, while rfa1-zm1 and -zm2 did (Dhingra et al., 2021). Such differential effects were also seen for the srs2 ATPase-dead allele (Dhingra et al., 2021). These and other data described in the Dhingra et al. paper suggest that Srs2’s effects on checkpoint vs. recombination are separable, at least in CPT, and that the Srs2-RPA antagonism in checkpoint regulation is not affected by Rad51 removal (unlike in the SSA situation).

      Neither this paper nor the preceding one makes clear what really is the consequence of having a weaker-binding Rfa1 mutant. Is DSB repair altered? Neither CPT nor MMS are necessarily good substitutes for some true DSB assay.

      In our previous report (Dhingra et al., 2021), we showed that the rfa1-zm mutants did not affect the frequencies of rDNA recombination, gene conversion, or direct repeat repair. Further, rfa1-zm mutants did not suppress the hyper-recombination phenotype of srs2∆, while rad51∆ did (Dhingra et al., 2021). In a DSB system, wherein the direct repeats flanking the break were placed 30 kb away from each other, srs2∆ led to hyper-checkpoint and lethality, both of which were rescued by rfa1-zm mutants (Dhingra et al., 2021). In this assay, rfa1-zm mutants themselves did not show sensitivity, suggesting that repair is largely proficient. Collectively, these data suggest that weaker DNA binding of Rfa1 does not have a detectable effect in the recombinational repair assays examined thus far; rather, it has a profound effect on Srs2-mediated checkpoint downregulation. In-depth studies of rfa1-zm mutations in the context of various DSB repair steps will be interesting to pursue in the future.

      With camptothecin, in the absence of site-specific damage, it is difficult to test these questions directly. (Perhaps there is a way to assess the total amount of RPA bound, but ongoing replication may obscure such a measurement). It should be possible to assess how CPT treatment in various genetic backgrounds affects the duration of Mec1/Rad53-dependent checkpoint arrest, but more than a FACS profile would be required. 

      Quantitative measurement of RPA residence time on DNA in cells and the duration of Mec1/Rad53-dependent checkpoint arrest will be very informative but requires further technology development. Our current work provides a foundation for such quantitative assessment.

      It is also notable that MMS treatment does not seem to yield similar results (Fig. S1). 

      Figure S1 showed that srs2-∆PIM and srs2-3KR suppressed rfa1-zm2 growth defects more weakly on MMS plates than on CPT plates. The reasons for the less potent suppression in MMS than in CPT are unclear, but multiple possibilities should be considered, given that MMS and CPT affect checkpoint responses differently and that RPA and Srs2 affect growth in multiple ways. For example, while CPT activates only the DNA damage checkpoint, MMS additionally induces the DNA replication checkpoint (Menin et al., 2018, Redon et al., 2003). It is thus possible that the Srs2-RPA antagonism is more important for the DNA damage checkpoint than for the DNA replication checkpoint. Further investigation of this and other possibilities will provide clues to the differential suppressive effects seen in this work. We will include this discussion in the revised text.

      Reviewer #3:

      The superfamily I 3'-5' DNA helicase Srs2 is well known for its role as an anti-recombinase, stripping Rad51 from ssDNA, as well as an anti-crossover factor, dissociating extended D-loops and favoring non-crossover outcome during recombination. In addition, Srs2 plays a key role in ribonucleotide excision repair. Besides DNA repair defects, srs2 mutants also show a reduced recovery after DNA damage that is related to its role in downregulating the DNA damage signaling or checkpoint response. Recent work from the Zhao laboratory (PMID: 33602817) identified a role of Srs2 in downregulating the DNA damage signaling response by removing RPA from ssDNA. This manuscript reports further mechanistic insights into the signaling downregulation function of Srs2. 

      Using the genetic interaction with mutations in RPA1, mainly rfa1-zm2, the authors test a panel of mutations in Srs2 that affect CDK sites (srs2-7AV), potential Mec1 sites (srs2-2SA), known sumoylation sites (srs2-3KR), Rad51 binding (delta 875-902), PCNA interaction (delta 1159-1163), and SUMO interaction (srs2SIMmut). All mutants were generated by genomic replacement and the expression level of the mutant proteins was found to be unchanged. This alleviates some concern about the use of deletion mutants compared to point mutations. The double mutant analysis identified that PCNA interaction and SUMO sites were required for the Srs2 checkpoint dampening function, at least in the context of the rfa1-zm2 mutant. There was no effect of these mutants in a RFA1 wild-type background. This latter result is likely explained by the activity of the parallel pathway of checkpoint dampening mediated by Slx4, and genetic data with an Slx4 point mutation affecting Rtt107 interaction and checkpoint downregulation support this notion. Further analysis of Srs2 sumoylation showed that Srs2 sumoylation depended on PCNA interaction, suggesting sequential events of Srs2 recruitment by PCNA and subsequent sumoylation. Kinetic analysis showed that sumoylation peaks after maximal Mec1 induction by DNA damage (using the Top1 poison camptothecin (CPT)) and depended on Mec1. These data are consistent with a model that Mec1 hyperactivation is ultimately leading to signaling downregulation by Srs2 through Srs2 sumoylation. Mec1-S1964 phosphorylation, a marker for Mec1 hyperactivation and a site found to be needed for checkpoint downregulation after DSB induction did not appear to be involved in checkpoint downregulation after CPT damage. The data are in support of the model that Mec1 hyperactivation when targeted to RPA-covered ssDNA by its Ddc2 (human ATRIP) targeting factor, favors Srs2 sumoylation after Srs2 recruitment to PCNA to disrupt the RPA-Ddc2-Mec1 signaling complex. 
      Presumably, this allows gap filling and disappearance of long-lived ssDNA as the initiator of checkpoint signaling, although the study does not extend to this step.

      Strengths 

      (1) The manuscript focuses on the novel function of Srs2 to downregulate the DNA damage signaling response and provide new mechanistic insights. 

      (2) The conclusions that PCNA interaction and ensuing Srs2-sumoylation are involved in checkpoint downregulation are well supported by the data. 

      We thank the reviewer for carefully reading our work and for his/her positive comments. 

      Weaknesses 

      (1) Additional mutants of interest could have been tested, such as the recently reported Pin mutant, srs2-Y775A (PMID: 38065943), and the Rad51 interaction point mutant, srs2-F891A (PMID: 31142613).

      srs2-Y775A was shown to be proficient for stripping RPA from ssDNA, behaved like wild-type Srs2 in assays such as gene conversion and crossover control, and exhibited a genetic interaction profile similar to that of the wild-type allele. The authors suggest that the Y775 pin can contribute to unwinding secondary DNA structures. Collectively, these findings do not provide a strong rationale for srs2-Y775A being relevant for RPA removal from ssDNA.

      We have already included data showing that a srs2 mutant lacking the Rad51 binding domain (srs2-∆Rad51BD, ∆875-902) neither affected rfa1-zm2 growth in CPT nor caused other defects in CPT on its own (Figure 2D). These data suggest that Rad51 binding is not relevant to the Srs2-RPA antagonism in CPT, a conclusion fully supported by data in our previous study (Dhingra et al., 2021). Collectively, these findings do not provide a strong rationale to test a point mutation within the Rad51BD region.

      (2) The use of deletion mutants for PCNA and RAD51 interaction is inferior to using specific point mutants, as done for the SUMO interaction and the sites for post-translational modifications. 

      We agree with this view generally. However, this is less of a concern for the Rad51 binding site mutant (srs2-∆Rad51BD), as it behaved like the wild-type allele in our assays. The srs2-∆PIM mutant (lacking 4 amino acids) has been examined for PCNA binding in vitro and in vivo in several studies (e.g. Kolesar et al., 2016, Kolesar et al., 2012); to our knowledge, no unintended defect was reported. We thus believe that this allele is suitable for testing whether Srs2’s ability to bind PCNA is relevant to RPA regulation.

      (3) Figure 4D and Figure 5A report data with standard deviations, which is unusual for n=2. Maybe the individual data points could be plotted with a color for each independent experiment to allow the reader to evaluate the reproducibility of the results. 

      We will include individual data points as suggested and correct the figure legends to indicate that three independent biological samples per genotype were examined in both panels.

      References:

      Dhingra N, Kuppa S, Wei L, Pokhrel N, Baburyan S, Meng X, Antony E and Zhao X (2021) The Srs2 helicase dampens DNA damage checkpoint by recycling RPA from chromatin Proc Natl Acad Sci U S A 118

      Hammet A, Magill C, Heierhorst J and Jackson SP (2007) Rad9 BRCT domain interaction with phosphorylated H2AX regulates the G1 checkpoint in budding yeast EMBO Rep 8: 851-857

      Kolesar P, Altmannova V, Silva S, Lisby M and Krejci L (2016) Pro-recombination Role of Srs2 Protein Requires SUMO (Small Ubiquitin-like Modifier) but Is Independent of PCNA (Proliferating Cell Nuclear Antigen) Interaction J Biol Chem 291: 7594-7607

      Kolesar P, Sarangi P, Altmannova V, Zhao X and Krejci L (2012) Dual roles of the SUMO-interacting motif in the regulation of Srs2 sumoylation Nucleic Acids Res 40: 7831-7843

      Koster DA, Palle K, Bot ES, Bjornsti MA and Dekker NH (2007) Antitumour drugs impede DNA uncoiling by topoisomerase I Nature 448: 213-217

      Menin L, Ursich S, Trovesi C, Zellweger R, Lopes M, Longhese MP and Clerici M (2018) Tel1/ATM prevents degradation of replication forks that reverse after topoisomerase poisoning EMBO Rep 19

      Ohouo PY, Bastos De Oliveira FM, Liu Y, Ma CJ and Smolka MB (2013) DNA-repair scaffolds dampen checkpoint signalling by counteracting the adaptor Rad9 Nature 493: 120-124

      Petermann E, Lan L and Zou L (2022) Sources, resolution and physiological relevance of R-loops and RNA-DNA hybrids Nat Rev Mol Cell Biol 23: 521-540

      Redon C, Pilch DR, Rogakou EP, Orr AH, Lowndes NF and Bonner WM (2003) Yeast histone 2A serine 129 is essential for the efficient repair of checkpoint-blind DNA damage EMBO Rep 4: 678-684

      Sun Y, Saha S, Wang W, Saha LK, Huang SN and Pommier Y (2020) Excision repair of topoisomerase DNA-protein crosslinks (TOP-DPC) DNA Repair 89: 102837

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors use point light displays to measure biological motion (BM) perception in children (mean = 9 years) with and without ADHD, and relate it to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three BM tasks, but that those tasks loading more heavily on local processing relate to social interaction skills and those loading on global processing relate to age. There are still some elements of the results that are unclear, but nevertheless, the important and solid findings extend our limited knowledge of BM perception in ADHD, as well as biological motion processing mechanisms in general.

      We thank the editors and reviewers for their valuable feedback and constructive comments. In the revised manuscript, we have incorporated all statistics for the models and also provided detailed analytical evidence about the distinct contributions of local and global BM processing. We hope these clarifications could enhance the robustness of our conclusions.

      Public Reviews:

      Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      We appreciate your positive feedback very much.

      Weaknesses:

      The manuscript has improved in clarity and conceptual and methodological considerations in response to the last review. However, the reported results still provide incomplete support for the claims the authors make in the paper.

      In relation to other reviewers' earlier comments, the model notation used is still not consistent and model results are reported incompletely, which make it difficult to gain a full picture of the data and how they support the authors' secondary claims. For instance, across the models in the supplementary materials, ß coefficients are only reported selectively which makes it difficult to assess the model as a whole. Furthermore, different terms (task 1, task 2 vs. BM-Local, BM-global) are used to refer to the same levels of a variable, and it is unclear which levels of a dummy variable correspond to which task, making it overall very difficult to comprehend the modelling procedure.

      Thanks for pointing out these issues. In the revised version, we have unified the terminology by consistently referring to task types as BM-Local, BM-Global, BM-General. Additionally, we have provided clarification on the interpretation of dummy variables in relation to model construction. Furthermore, we corrected the model results and included all statistics in Table S1, S2, and S3. For more detailed information, please refer to the response to your Recommendations for the authors.

      Reviewer #3 (Public Review):

      The authors presented point light displays of human walkers to children (mean = 9 years) with and without ADHD to compare their biological motion perception abilities, and relate them to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three biological motion tasks, but that those loading more heavily on local processing related to social interaction skills and global processing to age. The valuable and solid findings are informative for understanding this complex condition, as well as biological motion processing mechanisms in general. However, the correlations present a pattern that needs further examination in future studies because many of the differences between correlations are not significant.

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      Thanks for this positive assessment of our work.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing with respect to social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test this further. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      We apologize for not clarifying these points earlier. We did identify correlations between performance on all BM tasks and SRS scores. However, this finding is not unexpected, given the significant differences in SRS scores between TD and ADHD children, alongside their marked differences in all BM tasks. Correlation analyses that pool data from both groups may therefore reflect group differences. To elucidate the relationship between social ability impairment and diminished BM processing in children with ADHD, we conducted additional subgroup analyses and found correlations only in the BM-Local task. To further support the specificity of this correlation, we compared the differences in coefficients. We revised our modelling procedure for testing differences between correlations in the supplementary materials and presented all model statistics in Tables S2 and S3. Discrepancies in these coefficients, which exclude the influence of differences between groups, suggest that social factors specifically influence the performance of the BM-Local task in children with ADHD. We acknowledge that the analysis of differences between correlations is based on a relatively small sample size, and we interpret it modestly in the discussion. Future studies will aim to increase the sample size to validate our findings.
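      The point that a pooled correlation may simply reflect group differences can be illustrated with a minimal sketch. All numbers below are invented for illustration and are not study data; the variable names (td_acc, adhd_srs, and so on) are hypothetical. Within each made-up group, accuracy and SRS score are uncorrelated by construction, yet pooling the groups yields a strong negative correlation because the groups differ in both means.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented values: task accuracy (ACC) and SRS total score per child.
td_acc, td_srs = [0.8, 0.9, 0.8, 0.9], [30, 30, 40, 40]
adhd_acc, adhd_srs = [0.6, 0.7, 0.6, 0.7], [60, 60, 70, 70]

r_td = pearson(td_acc, td_srs)        # approximately 0 within the TD group
r_adhd = pearson(adhd_acc, adhd_srs)  # approximately 0 within the ADHD group
r_pooled = pearson(td_acc + adhd_acc, td_srs + adhd_srs)  # about -0.85

print(round(r_td, 3), round(r_adhd, 3), round(r_pooled, 3))
```

      This is why within-group analyses, or models that include group as a predictor, are needed before a pooled correlation is read as a relationship at the level of individual children.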

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that may have been made in previous studies, but would appear controversial and not definitive. E.g., that local BM processing does not improve with age and is uninfluenced by attention.

      Thanks for your comment. To the best of our knowledge, fewer developmental studies have been conducted on local BM processing than on global BM processing. Our study is the first to directly explore the relationship between local BM processing and age. Additionally, we used QbInattention to evaluate sustained attention function (considered as “top-down” attention) and examined its correlation with local BM processing. Some indirect evidence supports the view that the ability to process local BM cues remains stable and is unaffected by top-down attention. For example, local BM processing did not show a learning trend (Chang 2009) and was linked to the activation of subcortical regions (Hirai 2020). Research has demonstrated that local BM cues can convey information about walking direction without participants’ explicit attention or recognition (Chang 2009, Hirai 2011, Thompson 2007, Wang 2010), indicating the involvement of “bottom-up” processing (Hirai 2020, Troje 2023). Consistent with previous findings, we did not find a significant correlation between local BM processing and age or QbInattention. We acknowledge that statements such as “local BM processing does not improve with age and is uninfluenced by attention” should be approached with caution. Therefore, we interpreted our results carefully:

      “Once a living creature is detected, an agent (i.e., is it a human?) can be recognised by a coherent, articulated body structure that is perceptually organised based on its motions (i.e., local BM cues)71. This involves top-down processing and probably requires attention25,72, particularly in the presence of competing information26. Our findings are consistent with those of previous studies on the cortical processing of BM73, as we found that the severity of inattention in children with ADHD was negatively correlated with their performance in global BM processing, whereas this significant correlation was not found in local BM processing, which may involve bottom-up processing61,65 and might not need participants’ explicit attention21,23,74,75. However, further studies are needed to verify this hypothesis.” (lines 461-470)

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Supplementary materials: For all reported results, I suggest the authors use consistent model notation with complete reporting of all statistics in line with common conventions (ideally tables reporting beta values, error terms and confidence intervals for all model predictors, as well as R squared values). In particular the beta values for the reference category are needed to be able to fully interpret the beta values for the reported contrasts.

      We appreciate your suggestion. In the newly revised manuscript, we report all statistics, including beta values, error terms and confidence intervals for all model predictors, as well as R squared values. These detailed statistics can be found in Tables S1, S2 and S3. We hope this additional information will offer readers a more comprehensive understanding of our study.

      Please also address the following inconsistencies:

      - At least when reporting the model results, the same term should be used when referring to task type (either task 1/2/3 or local/global/general BM).

      Thank you for this feedback. We now use the same terms (BM-Local/Global/General) to refer to task type throughout the text.

      - Second linear model in the Supplementary Materials: The authors state that the results suggest that the correlation between SRS and task 1 is greater than that between task 2 and SRS scores. First of all, to be able to support this claim the authors need to provide the coefficient for task 1 (which, if task 1 is the reference variable should be ß1). Second, as I currently understand the reported model results, the fact that ß4 (representing the difference in relationship to SRS scores between task 2 and task 1; the authors refer to ß3 here although I assume they mean ß4) is negative and shows a trend towards significance would actually mean the relationship between BM processing accuracy and SRS scores is more negative for task 2 relative to task 1 and not, as the authors state, that the correlation with SRS scores is greater for task 1. I realise this contradicts the individual r values and scatter plots and hope the authors can clarify the model results.

      We thank you for pointing out these issues. For the second linear model (Model 4 in the revised manuscript), we now report the coefficients for all predictors and the model summaries, including the coefficient for task 1 (ß1). In addition, we have corrected the model results. The values of ß4 (representing the difference in relationship to SRS scores between BM-Global and BM-Local) and ß5 (representing the difference in relationship to SRS scores between BM-General and BM-Local) were positive and showed a trend towards significance, indicating that the correlations with SRS total score were more negative for BM-Local relative to BM-Global and BM-General:

      “A general linear model was constructed (Table S2, Model 4): SRS = β0 + β1 * ACC + β2 * D1 + β3 * D2 + β4 * (ACC * D1) + β5 * (ACC * D2). If the effect of the interaction term (i.e., β4 or β5) is statistically significant, it indicates a difference in correlations with SRS total score between BM-Local and BM-Global (or BM-General). The results suggested trends where the correlations with SRS total score were more negative for BM-Local relative to BM-Global (standardized β4 = 0.580, p = 0.074) and BM-General (standardized β5 = 0.550, p = 0.073).” (lines SI 36-42)
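      How the interaction coefficients in such a model encode differences between per-task slopes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the data are invented and noise-free, the coefficients are unstandardized, the fit is plain ordinary least squares that ignores repeated measures within a child, and the helper names (solve, ols, design_row) are hypothetical. D1 is assumed to be 1 for BM-Global and D2 to be 1 for BM-General, with BM-Local as the reference level.

```python
def solve(a, b):
    """Solve the square linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def design_row(acc, task):
    """Columns: 1, ACC, D1, D2, ACC*D1, ACC*D2 (BM-Local is the reference)."""
    d1 = 1.0 if task == "BM-Global" else 0.0
    d2 = 1.0 if task == "BM-General" else 0.0
    return [1.0, acc, d1, d2, acc * d1, acc * d2]

# Invented per-task lines SRS = intercept + slope * ACC (no noise added).
true_lines = {
    "BM-Local": (100.0, -60.0),
    "BM-Global": (80.0, -20.0),
    "BM-General": (85.0, -25.0),
}
rows, srs = [], []
for task, (intercept, slope) in true_lines.items():
    for acc in (0.5, 0.6, 0.7, 0.8, 0.9):
        rows.append(design_row(acc, task))
        srs.append(intercept + slope * acc)

b0, b1, b2, b3, b4, b5 = ols(rows, srs)
# b1 is the ACC slope for the reference task; b4 and b5 are slope differences.
slopes = {"BM-Local": b1, "BM-Global": b1 + b4, "BM-General": b1 + b5}
print({task: round(s, 3) for task, s in slopes.items()})
```

      In this invented example β1 is negative and β4 and β5 are positive, so the BM-Local slope is the most negative of the three, which is the sign pattern described in the quoted passage.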

      - Third linear model in the Supplementary Materials: In the dummy variable representing task, when local BM is the reference level, which task is represented by d1 and d2, respectively? If I understand the authors' procedure correctly, d1 should represent the difference between local and global BM and d2 the difference between local and general BM. If this is true, ß4 should code for the difference between local and global BM and not, as stated by the authors, for the difference between local and general BM. Also, what is d3?

      Thank you for pointing out this issue. We corrected and clarified the results of third model (Model 5 in revised manuscript) in the revised version and pointed out what is represented by d1 (D1) and d2 (D2), respectively:

      “We recoded task types into two dummy variables, D1 and D2, using BM-Local as a reference. The coefficient of D1 represents the difference in relationship to age between BM-Local and BM-Global, and the coefficient of D2 represents the difference in relationship to age between BM-Local and BM-General. The following model was created for each group (Table S3, Model 5-6): ACC = β0 + β1 * age + β2 * D1 + β3 * D2 + β4 * (age * D1) + β5 * (age * D2). If the effect of the interaction term (i.e., β4 or β5) is statistically significant, it indicates a difference in the effect of age on ACC between BM-Local and BM-Global (or BM-General). In the ADHD group, we observed a significant difference in the effect of age on ACC between BM-Local and BM-General (standardized β5 = 0.462, p < 0.001) and marginally significant differences in the effect of age on ACC between BM-Local and BM-Global (standardized β4 = 0.228, p = 0.073).” (lines SI 47-57)
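      The role of the interaction terms can be made explicit by writing out the age slope for each task. The short derivation below assumes standard treatment coding, which the quoted text implies but does not spell out: D1 = 1 for BM-Global and 0 otherwise, D2 = 1 for BM-General and 0 otherwise, so that BM-Local has D1 = D2 = 0.

```latex
\begin{align*}
\mathrm{ACC} &= \beta_0 + \beta_1\,\mathrm{age} + \beta_2 D_1 + \beta_3 D_2
              + \beta_4\,(\mathrm{age}\cdot D_1) + \beta_5\,(\mathrm{age}\cdot D_2) \\[4pt]
\text{BM-Local } (D_1 = 0,\ D_2 = 0):\quad
  \mathrm{ACC} &= \beta_0 + \beta_1\,\mathrm{age} \\
\text{BM-Global } (D_1 = 1,\ D_2 = 0):\quad
  \mathrm{ACC} &= (\beta_0 + \beta_2) + (\beta_1 + \beta_4)\,\mathrm{age} \\
\text{BM-General } (D_1 = 0,\ D_2 = 1):\quad
  \mathrm{ACC} &= (\beta_0 + \beta_3) + (\beta_1 + \beta_5)\,\mathrm{age}
\end{align*}
```

      Under this coding, β1 is the age slope for BM-Local, β4 is the difference in age slope between BM-Global and BM-Local, and β5 is the difference between BM-General and BM-Local. A positive β4 or β5 therefore means that accuracy increases with age more steeply in that task than in BM-Local.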

    2. eLife assessment

      The authors use point light displays to measure biological motion (BM) perception in children (mean = 9 years) with and without ADHD, and relate it to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three BM tasks, but that those tasks loading more heavily on local processing relate to social interaction skills and those loading on global processing relate to age. There are still some elements of the results that need clarification with future work, but nevertheless, the important and solid findings extend our limited knowledge of BM perception in ADHD, as well as biological motion processing mechanisms in general.

    3. Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      Weaknesses:

      The manuscript has improved in clarity and conceptual and methodological considerations in response to the last review. However, the reported results still provide incomplete support for the claims the authors make in the paper, due to differences between correlations not passing significance thresholds.

    4. Reviewer #3 (Public Review):

      The authors presented point light displays of human walkers to children (mean = 9 years) with and without ADHD to compare their biological motion perception abilities, and relate them to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three biological motion tasks, but that those loading more heavily on local processing related to social interaction skills and global processing to age. The valuable and solid findings are informative for understanding this complex condition, as well as biological motion processing mechanisms in general. However, the correlations present a pattern that needs further examination in future studies because many of the differences between correlations are not significant.

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing with respect to social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test this further. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that may have been made in previous studies, but would appear controversial and not definitive. E.g., that local BM processing does not improve with age.

    1. eLife assessment

      This valuable paper explores the role of translational regulation in the establishment of differential gene expression between neurons and glia in Drosophila. The paper uses Ribo-seq to show extensive variation in the translation efficiency of specific transcripts between neurons and glia. The evidence supporting the model is solid, although only one example (that exhibits very strong differential transcriptional expression between one class of neurons and glia) is studied in detail for translation efficiency.

    2. Reviewer #1 (Public Review):

      This study seeks to understand how selective mRNA translation informs cellular identity using the Drosophila brain as a model. Using drivers specific for either neurons or glia, the authors express a tagged large ribosomal subunit protein, which they then use as a handle for isolating total mRNA and ribosome footprints. Throughout the study, they compare these data sets to transcriptional and ribosome profiles from the whole fly head, which contains multiple cell types including fat tissue, pigment cells and others, in addition to neurons and glia. Using GO term analyses, they demonstrate the specificity of their cell-type-based ribosome profiling: known glial mRNAs are efficiently translated in glia, and known neuronal mRNAs in neurons. In further examining their RNA-seq data set, they find that "neuronal" mRNAs, such as ion channels, are expressed in both neurons and glia, but are translated at higher rates in neurons. Based on this, they hypothesize that neuronal mRNAs are actively suppressed in glia, and next seek to determine the underlying mechanism. By meta-analysis of all mapped ribosome footprints, they find that glia have higher ribosome occupancies in the 5' leader of neuronal mRNAs. This is corroborated by individual ribosome occupancy profiles for several neuronal mRNAs. In 5' leaders containing upstream AUG codons, they find that the glial data sets show an enrichment of ribosomes at these upstream start sites. They thus conclude that 5' leaders containing upstream AUGs confer translational suppression in glia.

      Overall, the sequencing data sets generated in this study and their subsequent bioinformatic analyses seem robust and reliable. Their data echo the trends of cell-type specific translational profiles seen in previous studies (e.g. 27380875, 30650354), and making their data sets and analyses accessible to the broader scientific community would be quite helpful. The findings are presented in a logical and methodical manner, and the data are depicted clearly. The authors' finding that 5' leaders facilitate translational suppression is well supported in the literature. However, they overinterpret their data by claiming that such suppression is key for maintaining glial/neuronal identity (it is even featured in their title), but do not present any evidence that loss of such regulation has any impact on cellular identity. In many places, the authors do not acknowledge possible biases in their analytical methods, or consider alternate explanations for their data. These weaken the manuscript in its current form, but many of these issues, which I describe below, are rectifiable with modest effort.

      (1) The authors' data in Fig. 2-S1A-B show substantial cell-to-cell variation in RpL3::FLAG expression. The authors do not consider that this variation may cause certain neuronal/glial types to be overrepresented in their datasets. Relatedly, the authors do not discuss whether RpL3::FLAG is only present in the cell body or if it is also trafficked to the neuronal/glial processes where localized translation is known to occur (reviewed in 31270476).

      (2) The RNA-seq data set that they use to calculate translation efficiency (TE) only represents mRNAs associated with RpL3::FLAG, which is part of the large ribosome subunit. As the authors are likely aware, there are mRNAs on which the full ribosome moiety does not assemble and these are effectively excluded from this data set. Ideally, a more complete picture of the mRNA landscape can be obtained by 40S subunit profiling but I appreciate that this is technically very challenging. At minimum, this caveat needs to be acknowledged.

      How does the TPM of differentially regulated transcripts (such as those in Fig. 2H) compare between whole heads, neurons and glia? Since the whole head RNA-seq data was not from an enriched sample, this might serve as a decent proxy for showing that the neuron/glia RNA-seq data sets are representative of RNA abundance.

      (3) The analysis in Fig. 2F shows that low abundance mRNAs in glia are further translationally suppressed, which the authors point out in lines 151-152. However, these data also show that mRNAs with a 1:1 ratio in neuron:glia (which fall in the 0.5-1 and 1-2 bins) have a TE of ~1; this suggests that on average, mRNAs that are equally abundant are translated equally efficiently. This is the opposite of the thesis presented in Fig. 2G-H where many mRNAs of equal abundance in neurons and glia are actually poorly translated in glia. How do the authors reconcile these observations?

      It is also unclear from the manuscript whether all mRNAs were considered for the analysis in Fig. 2F or if some cutoff was employed.
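
      To make the comparison in point (3) concrete, the following is a minimal sketch of how a binned neuron:glia TE comparison could be computed. It assumes that TE is footprint TPM divided by mRNA TPM, which is the usual ribosome profiling definition but may not match the authors' exact pipeline. The function names, the dictionary keys, the bin edges and the `min_tpm` cutoff are all hypothetical and serve only as illustration.

```python
from statistics import median

# Hypothetical bin edges for the neuron:glia mRNA abundance ratio.
BIN_EDGES = [0.25, 0.5, 1.0, 2.0, 4.0]


def translation_efficiency(footprint_tpm, mrna_tpm):
    """TE = ribosome footprint TPM / mRNA TPM (assumed definition)."""
    if mrna_tpm <= 0:
        raise ValueError("mRNA TPM must be positive")
    return footprint_tpm / mrna_tpm


def abundance_bin(ratio, edges=BIN_EDGES):
    """Label the half-open bin [lo, hi) that contains the abundance ratio."""
    if ratio < edges[0]:
        return f"<{edges[0]:g}"
    for lo, hi in zip(edges, edges[1:]):
        if lo <= ratio < hi:
            return f"{lo:g}-{hi:g}"
    return f">={edges[-1]:g}"


def median_te_ratio_by_bin(genes, min_tpm=0.0):
    """Median neuron/glia TE ratio per abundance bin.

    genes: iterable of dicts with keys neuron_mrna, glia_mrna,
    neuron_fp, glia_fp (all in TPM). Genes with any value at or below
    min_tpm are skipped; this is the kind of cutoff that should be reported.
    """
    bins = {}
    for g in genes:
        values = (g["neuron_mrna"], g["glia_mrna"], g["neuron_fp"], g["glia_fp"])
        if any(v <= min_tpm for v in values):
            continue
        abundance_ratio = g["neuron_mrna"] / g["glia_mrna"]
        te_ratio = (
            translation_efficiency(g["neuron_fp"], g["neuron_mrna"])
            / translation_efficiency(g["glia_fp"], g["glia_mrna"])
        )
        bins.setdefault(abundance_bin(abundance_ratio), []).append(te_ratio)
    return {label: median(vals) for label, vals in bins.items()}
```

      In such a scheme the median hides the spread within each bin, so a bin-level TE ratio near 1 can coexist with individual equally abundant mRNAs that are poorly translated in glia; reporting the per-bin distribution alongside the cutoff would help reconcile Fig. 2F with Fig. 2G-H.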

      (4) Throughout the manuscript the authors favor a "translation suppression" model wherein glia (for example) actively suppress neuronal mRNAs, and this is substantiated in Fig. 3C showing higher ribosome occupancy on 5' leaders than in coding regions. However, they show no evidence that glial mRNAs (such as those indicated in Fig. 2B and 2-S2B) present a different pattern, say that of higher ribosome occupancy in CDS vs. 5' leaders. This type of a positive control is a glaring omission from many of their analyses, including ribosome occupancy at upstream AUG codons (Fig. 4).

      Relatedly, to make a broad case (as they do in the title) that differential translation regulation specifies multiple cell types, it is necessary to show the corollary: that glial mRNAs (repo, bnb, pnt, etc) are suppressed in neurons. There is an inkling of this evidence in Fig. 3-S1 where fat body mRNAs in neurons are shown to have low ribosome occupancy in the CDS regions and enhanced occupancy in the 5' leader region. These data are not quantified, nor is a control neuron mRNA shown as a reference for what the ribosome occupancy profile of an actively translated mRNA looks like in a neuron.
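
      One way to quantify the requested control is a per-transcript ratio of footprint density in the 5' leader to footprint density in the CDS, computed for both neuronal and glial mRNAs in each cell type. The sketch below is an illustration under stated assumptions: footprints are represented by a single transcript coordinate each, the transcript starts at position 0, and the function names are hypothetical rather than taken from the manuscript.

```python
def region_density(footprint_positions, start, end):
    """Footprints per nucleotide in the half-open interval [start, end)."""
    if end <= start:
        raise ValueError("region must have positive length")
    count = sum(1 for p in footprint_positions if start <= p < end)
    return count / (end - start)


def leader_to_cds_ratio(footprint_positions, cds_start, cds_end):
    """Ratio of 5' leader density to CDS density for one transcript.

    Assumes the 5' leader spans [0, cds_start) and the CDS spans
    [cds_start, cds_end) in transcript coordinates.
    """
    leader = region_density(footprint_positions, 0, cds_start)
    cds = region_density(footprint_positions, cds_start, cds_end)
    if cds == 0:
        # No CDS footprints: the ratio is undefined or unbounded.
        return float("inf") if leader > 0 else float("nan")
    return leader / cds
```

      Under the suppression model, this ratio should be high for neuronal mRNAs in glia and low for glial mRNAs in glia, with the reverse pattern in neurons if the regulation is reciprocal.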

      (5) The cell-type specific ribosome profiling data sets in the manuscript are from mRNAs associated with 80S ribosomes that have been treated with cycloheximide during sample preparation. Cycloheximide, and many other translation inhibitors, are known to non-uniformly bias reads towards start codons (PMID: 22056041, 22927429). This important caveat and its implications for the start-codon occupancy analysis in Fig. 4 are not acknowledged in the manuscript.

      Again, the ideal resolution would be a ribosome profiling data set from 40S footprinting or harringtonine-treated samples (PMIDs: 32589966, 27487212, 32589964) to show true accumulation of ribosomes at AUG codons. In the absence of such a data set, a comparative meta-analysis of the ribosome distribution around upstream and initiation AUG codons of differentially translated transcripts from neurons would be a useful control.

      (6) The authors chose Rhodopsin 1 (Rh1) as a model mRNA which is translated efficiently in neurons but suppressed in glia. Though the data in Fig. 2-S3B show higher TE for Rh1 in neurons, the data in Fig. 5A show lower ribosome occupancy in the Rh1 CDS in neuron samples (at least in the fragment of the CDS visible). These data are somewhat contradictory.

      Further, given that the neuron data are from all nsyb-positive cells but that Rh1 is expressed only in R1-R6 photoreceptors, it is unclear what motivated them to choose Rh1 as opposed to an mRNA that is more broadly expressed in neurons.

      (7) Similar to the heterogeneity in nsyb- and repo-GAL4 expression in Fig. 2-S1A-B, Fig. 5C shows substantial variation in the expression of the UAS-GFP reporter driven by tub-GAL4. This variable GAL4 activity makes the mRNA abundance data difficult to interpret. Also, since the authors presume that Rh1 mRNA is expressed in glia (it is not annotated in the RNA-seq analysis in Fig. 2-S2B), would Rh1-GAL4 not be a more apt driver?

      These issues are further compounded by the lack of a cellular compartment marker (repo marks glial nuclei), which makes it impossible to determine which cell the mRNA signal is in. No negative controls are presented for the mRNA probes either.

      Most confoundingly though, the control reporter itself seems to show variable translation efficiencies from one cell to another, with high-GFP protein cells showing lower GFP mRNA and vice versa.

      The mRNA:protein ratio may be easier to examine by using repo-GAL4 to specifically drive the Rh1-reporter expression in glia (such as in Fig. 5-S1A) rather than simultaneous expression in both neurons and glia using tub-GAL4.

      Comments post revision: The authors have satisfactorily addressed most of my concerns with the study. I appreciate their patient clarification of many of my points, and the revision to text+figures appending more controls. My only minor gripe remains that while their data beautifully show that there is differential regulation of transcripts across neurons and glia, they do not provide evidence that such regulation is required for cell identity. However, I appreciate this is a large experimental ask worthy of another study in and of itself. Overall, I peg this an excellent study that adds substantially to the field of cell-type specific mRNA translation regulation.

    3. Reviewer #3 (Public Review):

      It is well established that there is extensive post-transcriptional gene regulation in nervous systems, including the fly brain. For example, dynamic regulation of hundreds of genes during photoreceptor development could only be observed at the level of translated mRNAs, but not the entire transcriptomes. The present study instead addresses the role of differential translational regulation between cell types (or rather classes: neurons and glia, as both are still highly heterogenous groups) in the adult fly brain. By performing bulk RNA-seq and Ribo-seq on the same lysates, the authors are able to compare translation efficiency (TE) of all transcripts between neurons and glia. Many genes display differential TE, but interestingly, they tend to be the genes that already show strong differences at their mRNA level. The most striking observation is the finding that neuronal transcripts in glia display increased ribosome stalling at their 5' UTR, and in particular at the start codons of short "upstream ORFs". This could suggest that glia specifically employ a mechanism to upregulate upstream ORF translation, enabling them to better suppress the expression of the genes that have them. And neuronal genes tend to have longer 5' UTRs, perhaps to facilitate this type of regulation.

      However, it is difficult to evaluate the functional significance of these differences because the authors provide only one follow-up experiment to their RNA-seq analysis. Venus expressed with the Rh1 UTR sequences may be displaying differential levels between glia and neurons, but I find this image (Fig. 5C) rather unconvincing to support that conclusion. There are no quantifications of colocalization, or even sample size information provided for this experiment. And if there is indeed a difference, it would still be difficult to argue this is because of the 5' stalling phenomenon the authors observe with Rh1, because they switched both the 5' and 3' UTRs.

      I also find it puzzling that the TE differences between the groups are mostly among the transcripts that are already strongly differentially expressed at the transcriptional level. The authors would like to frame this as a mechanism of 'contrast sharpening'; but it is unclear why that would be needed. Rh1, for instance, is not just differentially expressed between neurons and glia, but it is actually only expressed by a very specific neuronal type (photoreceptors). Thus it's not clear to me why the glia would need this 5' stalling mechanism to fully suppress Rh1 expression, while all the other neurons can apparently do so without it.

      Response to authors' revisions:

      The authors have addressed most of the technical points in their revised manuscript. However, it is still rather unclear whether this mechanism would have any significant impact on differential gene expression between cell types in vivo. Considering that it's mostly occurring on genes that are already strongly differentially transcribed, that doesn't appear very likely.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 3:

      Response to authors' revisions:

      This reviewer is not convinced that the authors have done enough to satisfactorily address either of the major issues described in the original public review, above.

      They're still not providing a quantification of Fig. 5D (originally 5C).

      Their response regarding the expression pattern of Rh1 is particularly concerning, as it represents a misinterpretation of previously published data.

      The gene encoding Rh1, ninaE, is expressed at such high levels in R1-6 PRs that any RNA-seq data (bulk or single-cell) generated from the optic lobes, no matter what cell-type, will display some ninaE transcripts that are present in the background, as they leak from R1-6 during dissociation steps. This phenomenon has been well described, for instance in Davis et al., 2020, eLife, and in fact led to the development of computational tools to abate such artifacts. In other words: no, rh1 is not expressed in glia, or any other neuron besides PRs for that matter. Therefore, I remain deeply suspicious about the functional relevance of the regulatory mechanisms described in this paper.

      We thank the reviewer for her or his critical comments.

      We quantified the cell-type differences in translation of the reporter with Tub-GAL4 and now show the results in Figure 5F. Consistent with other results, this analysis revealed that the glia-to-neuron ratio of the reporter protein expression is significantly lower when it contains the UTR sequences of rh1.  

      We removed the mRNA counts (former Figure 5A and Figure 5 - figure supplement 1A), as we agree that these may well be contaminated by the very high rh1 expression in R1-6. We also amended the graph showing the ribosome distribution on the rh1 mRNA (Figure 5B) to better compare the translational efficiency (footprints normalized with mRNA, in a similar manner to Figure 3C). Now it clearly highlights the cell-type differences of footprint distributions; ribosomes are much more enriched on the CDS (being translated) in neurons, while the fraction of ribosomes on the 5ʹ leader (being stalled) is much higher in glia. We summarized this differential ribosome distribution in a new graph (now Figure 5C).  

      We apologize for the misleading description of the reporter experiments. Despite the high level of mRNA expression in R1-6, we chose the 5ʹ leader of rh1 for the translation reporter, as it contains clear uORFs and differential ribosome accumulation thereon (Figure 5B). This biased ribosome distribution and differential translation are consistent features of many neuronal genes (Figure 3). We revised the text to clarify this point (Lines 195-203).

      In summary, we provide a more rigorous analysis and an extensive revision, which we hope address the concern.

    1. Reviewer #3 (Public Review):

      Summary:

      The authors elucidated the role of USP8 in the endocytic pathway. Using C. elegans epithelial cells as a model, they observed that when USP8 function is lost, the cells have fewer and smaller lysosomes. Since USP8 was already known to be a protein linked to ESCRT components, they looked into what role USP8 might play in connecting lysosomes and multivesicular bodies (MVB). They observed fewer ESCRT-associated vesicles but an increased number of "abnormal" enlarged vesicles when USP8 function was lost. Then they observed that the abnormally enlarged vesicles, marked by the PI3P biosensor YFP-2xFYVE, are bigger but present in the same number in USP8 (-) animals compared to wild-type animals, suggesting homotypic fusion. They confirmed this result by knocking down USP8 in a human cell line, in which they observed enlarged vesicles marked by YFP-2xFYVE as well. Finally, they propose that USP8 dissociates Rabx-5 from early endosomes, facilitating endosome maturation.

      Strengths:

      The authors have created significant, multifaceted tools for investigating systems involved in endosome dynamics control in both worm and human cells, which will help many members of the cell biology community. The study discovered an intriguing relationship between USP8 and the Rab5 guanine nucleotide exchange factor Rabx5, expanding USP8's targets and modes of action. The results provide significant contributions to our knowledge of how endosomal maturation works.

      Weaknesses:

      The rationales could have been stated more clearly to help the readers.