  1. Feb 2021
    1. Reviewer #1 (Public Review):

      In the era of COVID-19 vaccines, the mechanisms of vaccine administration are important and of broad interest. Vaccines are most often given into the skin. Antigen-presenting cells of the skin are responsible for eliciting the immune response in draining lymph nodes. Langerhans cells, the dendritic cell variant of the epidermis, are one of the cutaneous antigen-presenting cell types believed to do this job. They migrate from the skin, the site of antigen/vaccine uptake, to the draining lymph nodes, where lymphocytes are located and where the immune reaction is initiated. With their sophisticated experiments, the authors challenge this view. They use leading-edge methodology (mouse models) that strongly suggests that there may be yet another subset of skin antigen-presenting cells that is responsible for carrying antigen from skin to lymph nodes, at least in steady-state skin. This population resides in the dermis (the connective tissue part of skin), as opposed to the classical Langerhans cells, which sit in the epidermis. This may be relevant to the maintenance of immunologic tolerance to innocuous substances in the absence of overt inflammation. The data suggest that Langerhans cells may not play the crucial role they were thought to play. This is certainly a conceptual advance that, as always in science and especially when experimental systems are as complex as they are here, needs to be underpinned by future studies. In the long run, it will be very interesting (but much more difficult to study) to see whether this also holds true for human skin.

    2. Evaluation Summary:

      The present study uses innovative approaches to further our knowledge of skin immunobiology. The corresponding results explain the expression of langerin on two fractions of dermal DC (CD103+ and CD103-) observed several years ago. The demonstration that epidermal LC do not contribute to LN populations in the steady state is completely unexpected and raises important questions about the in vivo function of the LC-like subset unveiled in the present study.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

    1. This manuscript is in revision at eLife

      The decision letter after re-review, sent to the authors on February 2 2021, follows.

      Summary

      The reviewers concur that this article offers an interesting conclusion regarding optimal foraging and chemosensory valence. However, they also agree that it would benefit from a second round of revision, aiming at an improved precision of language and a better discussion of the assumptions of the model and experimental conclusions.

      Public Review 1:

      The authors present experiments that demonstrate how C. elegans worms bias their foraging decisions depending on feeding history and sensory cues (here, called pheromones) that reflect the density of worms. Navigational preference for these sensory cues is found to change from attractive to repulsive depending on the time at which worms leave a food patch, and additional experiments that condition worms under different combinations of conditions (with/without the sensory cues, with/without food, with/without repellent) indicate that associative learning is involved in this inversion of preference. A mathematical model is provided to argue that this inversion represents an optimal foraging strategy that is also evolutionarily stable.

      Public Review 2:

      The authors use the nematode C. elegans to reveal how animals associate social signals with specific contexts and modify their behaviors. Specifically, they show that C. elegans leaving a food patch early are attracted to pheromonal cues, while those leaving later are repelled by pheromones. The authors use a behavioral model to suggest that the switch from attraction to repulsion is likely due to a change in learning. This study links learning with social signals, providing a framework for further analysis of the underlying neuronal pathways.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 29 2021, follows.

      Summary

      This paper builds on recent studies that have made the connection between chronic endothelial damage and cellular senescence among endothelial cells in PH. Here, using a transgenic mouse that expresses in endothelial cells a dominant negative form of the TRF2 protein needed for telomere maintenance, the authors induce cellular senescence in the endothelium and show that these mice demonstrate worse PH characteristics following exposure to chronic hypoxia. They go on to test the effect of this dominant negative protein on human pulmonary artery endothelial cells in vitro and show that transfected ECs increase expression of secreted and surface-bound signaling molecules, and that when co-cultured in direct contact with pulmonary artery smooth muscle cells the SMCs increase their proliferation, an effect blocked by pharmacologic inhibition of Notch signaling. Notch blockade in vivo attenuates pulmonary hypertension in both transgenics and wild-type controls. These data provide an intriguing framework for understanding how endothelial damage alters signaling to neighboring cells in the vascular wall and provide further evidence that Notch signaling plays a key role in the development of PH vasculopathies.

      Essential Revisions

      1) It's unclear where within the pulmonary vasculature the TRF2DN transgene is expressed, and therefore which vessels are affected by senescence. That the transgene is driven by a well-established endothelial promoter (VEcad) is not sufficient to demonstrate universal expression. Especially in the case of a transgene whose expression is intended to result in chromosomal abnormalities, DNA damage and a halt to cell proliferation, significant mosaicism is to be expected. In situ hybridization with probes specific to TRF2DN or an antibody stain that specifically recognizes the transgenic protein on lung tissue sections would address this problem. Both representative images and a careful characterization of the classes of arteries (subdivided by diameter, for example), capillaries, veins, and lymphatics that express the transgene, and with what levels of mosaicism, would be ideal.

      2) The EC expression changes, specifically of the Notch ligands identified as upregulated in vitro, need to be validated in situ to ensure they are upregulated in the endothelium of arteries where the PH phenotype (increased muscularization, increased SMC proliferation) is observed in this model. Whole lung dissociation followed by enrichment for CD146+ ECs will result in an overwhelming number of capillary ECs and a tiny number of artery ECs (Figure 3E). The same applies to the in vivo validation of Notch reception in SMCs through qPCR for indicators of Notch reception (Figure 3F, 4I): this experiment was done on whole lung lysate and does not demonstrate increased expression of these genes in the artery wall. In situ hybridization with probes specific to TRF2DN or an antibody stain that specifically recognizes the transgenic protein on lung tissue sections would address this problem. Both representative images and a careful characterization of the classes of arteries (subdivided by diameter, for example), capillaries, veins, and lymphatics that express the transgene, and with what levels of mosaicism, would be ideal.

      3) The method by which PAs are identified (Figure 1D, 4F) and the metrics by which artery muscularization is quantitated from images of tissue sections (Figure 1F, 1H, 4H) are somewhat unclear, and the measurements appear to be derived from very few fields from an unspecified number of animals. There appears to be significant variance in artery response to hypoxia (comparing Figure 1E with vehicle in 4G), which is not a problem and very much to be expected, but means there must be absolute clarity in how the data for graphs summarizing imaging data were obtained. A supplementary figure with representative images demonstrating how arteries were scored would be very helpful. The number of independent mice for each analysis must appear either in the figure legends or in the relevant sections of the methods. A reader's understanding of how the PAs were identified in Figure 1D and 4F would be helped by using a vascular-specific antibody stain. A supplementary figure with a large panel of artery images from Tg and Wt animals before and after hypoxia exposure, with and without DAPT, would also be immensely helpful, so that the reader can grasp the range of effects on vessels in each case.

      4) Please describe the in vivo relevance of endothelial progeria induced by decreased TERF2 function in patients with PAH.

      5) While endothelial senescence leads to decreased proliferation and apoptosis of EC, which have been shown to occur in PAH, clonal proliferation of EC is also a hallmark of advanced disease in PAH. The study does not comment on this varied phenotype of EC in the pulmonary circulation in PAH patients and the relationship of senescence of EC to SMC migration.

      6) Increased levels of Jag1 have been linked to excess proliferation in several cancer cell lines. In the context of senescence with decreased EC proliferation, increase in Jag1/Jag2 levels is surprising and the paper does not comment on this phenotype as being distinct from cancer cells.

      7) The mechanism for increased Notch ligand expression in response to progeria was indirectly addressed with 5-Aza studies, which presumably lead to inhibition of DNA methylation. However, it is unclear how this inhibits the increase in Notch ligand expression. In the discussion, the authors mention (Line 17, page 10) that DNA hypomethylation promotes specific transcriptional programs as a result of senescence. However, 5-Aza prevented the induction of Notch ligand expression in senescent EC (Suppl Fig 2). The discussion of these results needs further clarification. It is unclear what specific epigenetic modifications occur to increase the expression of Jag1/2 and Dll4 in senescence-associated changes.

      8) The study did not report whether aorta and other systemic vessels demonstrate senescence changes in endothelial cells; endothelial progeria in the TG mice would involve all vasculature. Presumably, vascular remodeling was limited to the lung, given the unique response of the lung to hypoxia. However, examination of a systemic vascular bed would have strengthened the conclusions of the study. Do the EC and SMC derived from aorta or coronary vessels show similar responses in vitro compared to human PAEC with DN-TERF2 transfection?

  2. Jan 2021
    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 14 2020, follows.

      Summary

      This study examined osmolarity-dependent dendritic signaling in oxytocin magnocellular neurosecretory cells (OT-MNCs). The authors show that repetitive depolarizations evoke larger calcium responses in proximal dendrites relative to distal dendrites. When these neurons were exposed to hyperosmotic stimuli, the distal calcium responses were found to be inhibited to a greater extent than the proximal dendritic calcium responses. Propagation of glutamate-evoked depolarizations from the dendrite towards the soma was also found to be reduced following increases in osmolarity. These effects of hyperosmotic stimuli are likely mediated by changes in membrane resistance of dendrites. A non-selective blocker of the channels, ruthenium red, blocked these effects of hyperosmolarity, indicating that non-selective cation channels (e.g. TRPV types) may be responsible.

      All three reviewers agreed that the finding is potentially important and could address fundamental questions about MNC dendritic physiology. However, the reviewers identified a number of technical concerns, as summarized below. These concerns need to be addressed for further consideration.

      Essential Revisions

      1) The title and abstract do not accurately reflect what this study is about. The title of the paper is "Dendritic membrane resistance modulates activity-induced Ca2+ influx in oxytocinergic magnocellular neurons of mouse PVN". However, dendritic membrane resistance is never actually measured. As such, a title that does not mention membrane resistance may be more appropriate. Also, the purpose and rationale of this study are not clearly communicated in the abstract and introduction. The implication for the regulation of soma-dendritic release of oxytocin, but not hyperosmotic responses, was mentioned in the Introduction, while the entire Results and Discussion sections are about hyperosmotic stress.

      2) Figure 3: The reviewers believe that the stimulation paradigm is not physiological (neurons voltage-clamped at -70 mV with repetitive voltage steps to +50 mV for 5 ms). It is important to show that action potentials in current clamp, instead of the +50 mV voltage step in voltage clamp, can produce similar signals.

      3) A major focus of the manuscript is on Ca2+ elevations in MNC dendrites. However, the authors have not performed the essential experiments to identify what the Ca2+ entry/release pathways are. It is important for their main conclusions to show that Ca2+ entry is through voltage-gated Ca2+ channels. In addition, it should also be established whether dendritic propagation is active or passive.

      4) It is essential to report the effect of the osmotic stimulus alone on dendritic resting Ca2+, as this would affect the interpretation of the Ca2+ data.

      5) Figure 8: What is the effect of RR on proximal EPSCs? This information is needed to interpret the effect of RR on distal EPSCs. It would also be necessary to test the effect of RR on the modulation of Ca2+ responses in distal dendrites to see its effect on the dendritic conductance.

      Statistical handling:

      Please provide the statistical methods (t-test, 2-way ANOVA with Holm-Sidak corrections, 2-way repeated-measures ANOVA, etc.) used for each measurement in the text or figure legend (not just in the Methods section). For repeated-measures ANOVA, please indicate how measurements were repeated.

      For the statistics of sex differences (Fig. 2-1, 4-1 etc), it is required to use 3-way ANOVA to assess variability by cells, animals, and sex. The number of males and females used is not clear in some cases, but it appears that only 2 females and 2 males are used (Line 203-204). If this is the case, the statistical comparisons between males and females are not meaningful and should be removed.

    1. Reviewer #3:

      The glypicans Dally and Dlp have important roles in morphogen signaling, and this work is of particular interest for me because it significantly advances our understanding of the multiple roles they appear to have in signal processing, signal presentation and signal reception. It is unfortunate that most of the literature has presented results and phenotypes in simplistic or simple-minded ways that do not recognize the different roles of the glypicans, or do not take experimental approaches that might distinguish them. This work of the Guerrero lab is an exception, as it is an important contribution to understanding these different roles, especially given the additional complexity introduced by the role of cytonemes. If its thoroughness and in-depth analysis are typical of work from this lab, so is the challenging presentation that makes understanding it so difficult. My recommendation to the authors is to clearly describe the different roles that have been attributed to the glypicans and, for every experiment they present, clearly articulate how the results might implicate or distinguish any or several of them.

      Although the figures are excellent, the manuscript is not well-written and would benefit from a rewrite.

    2. Reviewer #2:

      This manuscript interrogates the function of the Ihog and Boi adhesion molecules in cytoneme-based transport of the Hedgehog morphogen in Drosophila. The cell biology of how cytonemes are regulated to deliver morphogen signals is not yet well understood, so the work addresses an important topic that will be of interest to a broad audience. However, much of the study refines previous work from the same group to provide only a modest advance in understanding of how Ihog impacts cytoneme behavior.

      The authors use genetic strategies in Drosophila to investigate how Ihog and Boi influence cytoneme dynamics. They find that the two proteins act differently with regard to cytoneme function. Boi effects are not exhaustively analyzed, but a number of genetic experiments are performed to interrogate Ihog. The authors reveal that the extracellular domains of Ihog interact with the glypicans Dally and Dlp to stabilize cytonemes that originate from Ihog over-expressing cells. Knockdown of Ihog does not alter cytoneme dynamics.

      The most novel aspect of the study - that Boi functions differently than Ihog in cytonemes - is, unfortunately, not expanded upon. Some experiments lack controls or are presented in a manner that prevents clear interpretation of results.

      Key points to be addressed:

      Figure 1: Null alleles and RNAi silencing are used interchangeably to reduce Ihog, Boi, Dally and Dlp function in vivo. Results between methods are directly compared. Oftentimes, controls are not included to confirm the level of knockdown following RNAi. If possible, use null alleles for consistency. If this is not possible for experimental reasons, give an explanation and state the impact in the discussion.

      Ihog levels decrease following loss of Dally or Dlp and Boi levels appear to increase following knockdown of Ihog, Dally, or Dlp. These stability changes have previously been reported. The mechanism is not clear, so should have been investigated here - especially the increased Boi protein level. How does this occur? Is stabilization occurring at the protein level or is gene expression changing? Is this a compensatory upregulation?

      Based upon the supplement for Figure 2, it looks like the Ihog truncation mutants show variable stability. Might this be affecting the extent to which they alter Dally or Dlp stability? The western blot data are presented as crops of single bands adjacent to crops of a molecular weight ladder. Blots should be shown as intact images, preferably with all variants compared across a single gel with a loading control. As presented, relative stability/expression levels are impossible to assess.

      Figures 3-4: Ihog mutant transgenes are tagged with either HA or RFP. Best to be consistent with tags when mutant function is being directly compared. Given that the HA tag is a small epitope and the RFP is a protein tag, they may differentially alter protein functionality. To be consistent it would be preferable to use the same tags. However, if this is not possible due to experimental reasons, the technical implication can also be mentioned in the discussion.

      Figure 5: Investigation of histoblast cytonemes reaching into ttv, botv mutant clones: The ability of cytonemes to invade double mutant clones is altered only under the engineered situation of glypican dysfunction combined with Ihog over-expression. From this, it is concluded that Ihog is acting with glypicans to stabilize cytonemes. This may be the case, but the ability to see it only under an engineered situation of compound mutation plus Ihog over-expression leads this reviewer to question the physiological relevance of the observation. Of similar concern is that the authors state that the ability of Ihog over-expressing cell cytonemes to cross small vs. large ttv, botv clones differs. The difference is very difficult to appreciate from the results presented.

      Figure 6: The apparent functional difference between Ihog and Boi in the ability to stabilize cytonemes is potentially very interesting, but is not investigated, which limits the advance of the current study.

    3. Reviewer #1:

      In the article "Glypicans specifically regulate Hedgehog signalling through their interaction with Ihog in cytonemes" Simon et al. elucidate the function of Glypicans in Hh transport via cytonemes. The manuscript describes convincingly that the fly glypicans Dally and Dally-like are required to maintain the expression of the Hh co-receptor Ihog. Ihog, in turn, stabilises Hh cytonemes through its interaction with Glypicans to establish the Hh gradient in the wing imaginal disc. The authors further carried out an extensive molecular analysis of Ihog and identified the relevant domains within the protein required for interactions with Glypicans, Patched, and Hh. In general, this is a very thorough, detailed analysis of Ihog function. The images and videos are excellent. However, prior to publication, there are two major criticisms which need to be addressed, in my opinion.

      Firstly, the first part of the manuscript, the molecular analysis of Ihog (Fig. 1-4), seems to be detached from the second, cytoneme-focussed part (Fig. 5, 6). Independent evidence is needed to support the idea that the Ihog-Gly mediated stabilisation of cytonemes is responsible for the expansion of the signalling gradient. Are the static cytonemes involved in a flattened gradient or are the receiving cells just sensitised for Hh? Can cytonemes be (de-)stabilised without interfering with Hh components to untangle these observations? The authors write "Intriguingly, the same Ihog domains that regulate cytoneme dynamics are those also involved in the recruitment of Hh ligand, glypicans and the reception complex."

      My concern is that cytoneme dynamics and Hh gradient formation could be two parallel, independent events; one needs to show this interdependency in a clear way. I could imagine an analysis of the consequences when Ihog is overexpressed and cytoneme formation is inhibited (by other means). Conversely, could one stabilise cytonemes in an Ihog-reduced background and analyse gradient formation?

      Secondly, the authors demonstrate an effect of Ihog alterations on the formation of the gradient. However, what is the physiological relevance? What are the consequences of Ihog/Gly-mediated cytoneme stabilisation and gradient formation on tissue patterning and wing formation? If this is not possible to show experimentally, this needs to be discussed.

    4. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 12 2021, follows.

      Summary

      In summary, this manuscript elucidates the function of Glypicans in Hh transport via cytonemes. The reviewers felt that the manuscript describes convincingly that the fly glypicans Dally and Dally-like are required to maintain the expression of the Hh co-receptor Ihog, which stabilises cytonemes to establish the Hh gradient in the wing imaginal disc. A molecular analysis of Ihog domains was well executed.

      Although the manuscript provides an in-depth analysis, the reviewers believe that the presentation of the data is rather challenging for the readers. The authors need to clearly describe the different roles that have been attributed to the glypicans, and for every experiment presented a clear explanation of the impact of the results is needed (e.g. Figure 5). In addition, the effect of altered Glypican levels on the stability of Ihog and Boi, and the ability of these proteins to stabilize cytonemes, need to be investigated. Finally, linking the Ihog analysis to cytoneme stability analysis needs improvement.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 17 2020, follows.

      Summary

      We feel that the major conclusions are right, but the manuscript and story are not quite clear enough at present, and there is a lack of deeper cellular and molecular mechanistic understanding of these phenomena to distinguish this work from previously published studies. That ASC senescence impairs adipogenic differentiation capacity has been previously reported in eLife in 2015 (doi: 10.7554/eLife.12997). For example, you concluded that adipose-derived progenitor cells from older adults have a higher potential to become senescent, which impaired adipogenesis. The percentage of senescent cells in adipose tissues is low, but the mechanism by which they could significantly affect adipose tissue functions is unknown. Is this through paracrine effects, cross talk with other immune cells, or some other route?

      Essential Revisions

      1) Although the authors found that ASC senescence is associated with mitochondrial dysfunction and oxidative stress, the nature of the links between these cellular events is unclear. It is well-known that mitochondrial dysfunction can directly lead to senescence. If the authors meant to prove that ASC senescence causes early adipocyte mitochondrial dysfunction, more evidence is required.

      2) It has already been reported that ASC senescence impairs adipogenic differentiation capacity, in eLife in 2015 (doi: 10.7554/eLife.12997). Furthermore, although the authors found that metformin prevents the onset of senescence and associated dysfunctions in ASCs, it has been shown in many publications that metformin is a senomorphic drug that can reduce the senescence-associated secretory phenotype. So it is not surprising that metformin can block the effects of senescent ASCs. Also, regarding the increased adipogenesis by metformin, it has been reported that metformin can directly regulate adipogenic transcription factors, such as peroxisome proliferator-activated receptor γ (PPARγ) and CCAAT/enhancer binding protein α (C/EBPα). As such, sufficient novelty is lacking at this point, and the work would require demonstration of causal links among these cellular events.

      3) Several conclusions need to be smoothed out and discussed in more detail. Methods must be described in more detail, especially with regard to fat depot digestion (type of collagenase, concentration of collagenase, amount of tissue used for the digestion, whether cell yields are similar between young and old adipose tissue, number of plated ASC). The authors must consider that the term ASC nowadays refers to Adipose stromal cells and not Stem cells. As described in the introduction and method sections, ASC are stromal cells that adhere to plastic, including fibroblasts, smooth muscle cells, pericytes, endothelial cells, resident macrophages, preadipocytes and progenitors. This must be discussed since the distribution and repartition of stromal cells are modulated with aging. The term "adipocyte" must be changed to "differentiated ASC" because adipocytes are characterized by a unique lipid droplet (not the case here). The title must be modified. Senescence is related to ASC and not to adipocytes.

      4) Figure 1: It is unclear why the authors conclude that they are recapitulating in vivo aging. If so, one might expect that the senescent "young ASC" phenotype would recapitulate that of "old ASC" with a time lag, which is not the case for all the studied parameters. For example, the % of bgal cells is equivalent between P7 old cells and P11 young cells, which is also true for P16, P21 and prelamin A but not for reactive oxygen species or mitochondrial potential. The authors must discuss this point.

      5) Figure 2: Was cell number at confluency controlled and similar between "young" and "old" ASC? Since post-confluent mitoses are necessary for adipogenesis, one might speculate that the decreased adipogenesis is related to lower cell number and proliferation.

      6) Figure 3 and 4: Cells were treated with metformin from P3. Have the authors considered a potential "resistance" effect? Given the large number of individuals treated with metformin, is there any evidence of an impact of metformin treatment on age-related loss of subcutaneous adipose tissue? Finally, inhibition of senescence may lead to cancer development. The authors must discuss this point.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 18 2021, follows.

      Summary

      This is an exhaustive study of different phenotypes associated with Histone H3-G34 mutations in a fission yeast model. Because mutations at this site occur in certain human cancers, teasing apart their different phenotypes in a model system helps to understand their potential effects in pathology. The phenotypes vary widely, suggesting a key role for this residue in a variety of genome maintenance functions.

      The authors systematically examine histone modifications and transcription, and use genetic and cytological assays to measure genome stability. Direct extrapolation to human cells is limited due to the absence of multiple H3 variants in fission yeast, and the absence of the PRC1/2 pathway. However, this is balanced by the rapid and thorough analysis of numerous variants that is enabled in this model system.

      It is not possible to draw a simple model as there is little consistency in the phenotypes. This suggests that the G34 residue independently affects multiple activities. These will require laborious efforts to tease out.

      Essential Revisions

      This is overall a technically very well done paper with a variety of methods to examine different mutations in H3-G34. The strength is the consistent approach applied to numerous mutations. However, as there is no single response, it's rather descriptive overall. We have no major concerns about the data, but feel that the conclusions need to be tempered in two areas where the assays were not direct.

      1) In the absence of NHEJ repair assays it needs to be noted that conclusions about NHEJ proficiency based on drug sensitivity are indirect.

      2) The authors imply that the H3G34 mutants affect the activity of the Set2 H3K36 methyltransferase. In the absence of an in vitro H3K36 methylation assay on the mutant histones with recombinant or affinity purified Set2 the authors need to note that this conclusion is speculative as they have not measured it directly.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 29 2020, follows.

      Summary

      This paper describes very clearly a set of experiments to assess collateral sensitivity (CS) to certain antibiotics that is created by carriage of beta-lactam (including carbapenem) resistance plasmids in E. coli. This addresses one of the limitations of existing literature on CS, which typically focuses on the effects of resistance point mutations, which are clinically less significant. By documenting multiple ways that this CS is real and selectable and to a degree generalizable across genetic backgrounds, this is an important contribution in showing that CS is a real phenomenon for clinically important resistance mechanisms.

      Essential Revisions

      1) The primary screen of 'antibiotics x plasmids' to identify collateral sensitivity, presented in Figure 1B, lacks an analysis of the statistical significance of results. Supplementary data show that measurements of MIC are a little too noisy to robustly identify 2-fold changes with only 4 or 5 replicates. Defining "significant" as "mean more than 2x" is not adequate. Using a power calculation derived from the data in the manuscript, a sample size should be determined that gives a 90% (or similarly high) chance of detecting a 2x difference given the variability observed between assays, and the assays should then be repeated at that sample size. Ideally this would be done for all organism-plasmid pairs, but at least for the ones for which the preliminary screen found a mean difference of 2x.

      2) Recommendation (not required for acceptance, but please temper claims of clinical relevance if not done): The comparative killing data should be repeated in competition. This is technically more challenging but I believe not more so than the comparative growth curves. This would establish as proof of principle that a mixed population could be purified of plasmid-bearers by CS. Without this, the clinical relevance will still remain speculative. Also, two reviewers initially misread these as competition assays. The text and legend should emphasize that these are separate populations.

      3) (Not required for acceptance but suggestion for future work:) The presented work is solid but, as pointed out by the authors, there is no mechanistic explanations for the observations. It would be highly interesting to know if the collateral effects are due to specific genes (OXA-48 would be a good place to start) and/or if the observed effects are due to the plasmid backbone.

      4) The experiment in Figure 3 demonstrates the exploitation of collateral sensitivity to preferentially inhibit plasmid-bearing bacteria. The terminology in this section refers to 'eradicate', 'mortality' etc, but in practice, the experiment defines survival as OD>0.2 after ~24 hours. It seems likely that in the 'non-surviving' conditions, waiting another day or two would show regrowth of some bacteria. We don't think this requires any change to the experiment, only to how the results are described: they show preferential inhibition of growth, not eradication. A more patient approach to identifying regrowth would be necessary to definitively state that these bacterial populations have been eradicated. Suggest tempering claims.
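      As an illustration of the power calculation requested in point 1, the following is a minimal sketch rather than the reviewers' prescribed method. It assumes that MIC values are compared on a log2 scale (so a 2x change is a difference of 1), that both groups share the same between-assay standard deviation, and that a two-sided two-sample normal approximation is acceptable. The function name and the example standard deviations are hypothetical and are not taken from the manuscript.

```python
import math
from statistics import NormalDist


def replicates_per_group(sd_log2, fold_change=2.0, alpha=0.05, power=0.90):
    """Approximate replicates needed per group to detect `fold_change` in MIC.

    Assumes log2(MIC) is roughly normal with standard deviation `sd_log2`
    in both groups, and uses the two-sided normal approximation
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2.
    """
    delta = math.log2(fold_change)               # effect size on the log2 scale
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd_log2 / delta) ** 2
    return math.ceil(n)


# Hypothetical between-assay standard deviations (in log2 MIC units).
for sd in (0.5, 0.75, 1.0):
    print(f"sd={sd}: {replicates_per_group(sd)} replicates per group")
```

      In practice the standard deviation would be estimated from the replicate MIC measurements in the supplementary data, and a t-based or simulation-based calculation may be preferable at small sample sizes.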

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 11 2021, follows.

      Summary

      Adiponectin is a key adipokine, and much of our knowledge about this molecule has come from the Scherer lab. It is well known that adiponectin promotes improved insulin sensitivity and glucose tolerance, along with anti-inflammatory effects, which can be followed by decreased fibrosis. In this paper the Authors use loss and gain of function mouse models to explore whether the beneficial effects of adiponectin on metabolism can be translated into greater healthspan or lifespan. They show that lifespan decreases in adiponectin KOs and increases in the transgenic (ΔGly) mice. The expected effects on glucose metabolism, insulin sensitivity, inflammation, and fibrosis are also demonstrated.

      Essential Revisions

      1) Given the known significant effects of adiponectin on metabolic fitness, the effects on healthspan that the Authors observed in their 2 models were expected. However, while median survival time is definitely less in the APN-KOs and greater in the ΔGly mice, the effects are relatively modest compared to other longevity studies. Any increase in lifespan is a good thing, particularly when accompanied by a corresponding increase in healthspan. We would've hoped for greater effects on lifespan than those observed but even modest effects are worthwhile. The Authors should comment in their discussion on this point. In other words, it would be good to know the Authors' thinking as to why these impressive effects on glucose, insulin, inflammation, fibrosis, etc. do not lead to even greater effects on lifespan. Also, is there any information on the causes of mortality in the WT vs. KOs that might point to why lifespan is decreased?

      2) APN-KO clearly leads to impaired glucose tolerance, but it is a bit surprising why insulin levels aren't increased, which is the typical metabolic response to insulin resistance.

      3) Can the Authors please comment on adipose tissue mass in the KOs, particularly if they have any information on subq fat?

      4) In Figure 3, they show increased staining for ATMs with Mac2 in the KOs. What about the expression of other inflammatory gene markers, such as those shown in Figure 3G for the liver?

      5) With respect to hepatic effects, this paper shows increased inflammation in the liver in APN-KOs. However these gene expression patterns are in total liver tissue, and it would be helpful to understand the origin of these inflammatory markers. Are they from Kupffer cells, monocyte-derived macrophages, etc.? In a similar vein, various fibrosis marker genes are increased in total liver from the APN-KOs. Most likely these expression differences reflect stellate cell effects. Do the Authors have any information on the effect of adiponectin on stellate cell function? Although fibrosis-related genes are elevated in the APN-KO, is there histologic evidence of increased fibrosis in the liver sections?

      6) The Authors suggest that the increased inflammation in the liver is the cause of the increased fibrosis. Presumably they think that the immune cells in the liver are signaling to stellate cells to produce this effect. Is this the scenario the Authors propose? If so, it should be made more explicit and corroborated by histologic staining of hepatic fibrosis.

      7) It would be of interest to know the extent of inflammation in the kidneys with APN-KO, beyond Mac2 staining (Figure 3D).

      8) In the results in the ΔGly mice, is the enhanced lifespan statistically significant? Unless we are misreading it, the p value suggests it is not. Also, why were only chow-fed mice studied in the transgenics, and not HFD mice, as in the KOs?

      9) ITTs are shown in Figure 4G, but the basal glucose values are different between the 2 groups. Can the Authors also present the data normalized to the basal value to determine whether the kinetics of the curve are different?

      10) The resulting changes in tissue fibrosis are clearly important when thinking about healthy tissue function. It would help if the authors could show histologic staining for collagen deposition in the various tissues, particularly liver and kidney. Although it might be asking for too much if they don't already have this information, it would also be useful to know which cell types within the various tissues are responsible for the changes in inflammatory markers and collagen-related genes. This could also be discussed.

      11) From an aesthetic point of view there is a certain lack of symmetry in this paper, since some of the measurements made in the KOs are not performed in the transgenics and HFD was not utilized in the transgenics either.

      12) Much of the data could be predicted from studies by them or the other investigators in the field (Nature Med. 8, 731 (2002), J. Biol. Chem. 277, 25863 (2002), J. Biol. Chem. 277, 34658 (2002), J. Biol. Chem. 278, 2461 (2003), Endocrinology 145, 367 (2004), J. Biol. Chem. 281, 2654 (2006), Am. J. Physiol. Endocrinol. Metab. 293, 210 (2007), J. Clin. Invest. 118, 1645 (2008)). It would be helpful if the authors could provide insights into the life-promoting mechanism of adiponectin, which has not been clarified so far.
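      The normalization requested in point 9 can be sketched as follows. This is an illustrative example only: it assumes each ITT curve is a list of glucose readings whose first element is the basal (time 0) value, and the numbers shown are hypothetical, not data from the manuscript.

```python
def normalize_to_basal(glucose):
    """Express each glucose reading as a percentage of the time-0 (basal) value."""
    basal = glucose[0]
    if basal <= 0:
        raise ValueError("basal glucose must be positive")
    return [100.0 * g / basal for g in glucose]


# Hypothetical readings (mg/dL) at 0, 15, 30, 60 and 90 min after insulin.
control = [150, 110, 90, 100, 120]
transgenic = [120, 80, 60, 70, 90]

print(normalize_to_basal(control))
print(normalize_to_basal(transgenic))
```

      Plotting the normalized curves side by side shows whether the kinetics differ once the offset in basal glucose is removed.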

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 5 2021, follows.

      Summary

      In this manuscript, Olive and colleagues used a genetic screen to identify Complex I (CI) of the electron transport chain (ETC) as a regulator of IFNg-mediated gene expression in macrophages. They attribute this role of CI to effects on the activity of the JAK-STAT pathway downstream of the IFNg receptor.

      While a potential link between CI activity and the activity of the JAK-STAT pathway would be interesting, the reviewers think that additional analyses are needed to substantiate this claim and rule out alternative interpretations.

      Essential Revisions

      1) Lines 204-205: The authors find that sgRNAs targeting other complexes of the ETC, including CIII and CIV, had no effect on the ability of IFNg to stimulate expression of cell surface markers. How do the authors interpret these findings, since CI does not work in isolation in the ETC and is rather dependent on CIII and CIV activity?

      2) How does IFNg stimulation affect oxidative metabolism as assessed by Seahorse? In order to corroborate the authors' conclusions regarding activity of individual ETC complexes (point 1 above), Seahorse analysis of individual complexes is also advised.

      3) The authors do some limited analyses in human MDMs to suggest that their findings in the mouse macrophage cell line can be generalized to other macrophage populations. It would be great if the analyses in the human MDMs could be extended to further strengthen the generality of their central findings.

      4) Fig 6D: Not clear whether similar exposures were used in different panels. Would be better to load samples in the same gel so that the same exposure can be used and a direct comparison between conditions can be made.

      5) Fig 6D: Does acute treatment with rotenone (but not inhibitors of other ETC complexes) have similar effects in reducing JAK-STAT signaling as knockdown of CI subunits? If not, then stable, long-term knockdown of CI subunits may have some effect independent of respiration in influencing JAK-STAT signaling (for example, on expression of some component of the JAK-STAT pathway). This interpretation could also explain why knockdown of other components of the ETC do not have similar effects to CI. Rotenone treatment could be tried (and compared with inhibitors of other ETC complexes), and if the data are different from knockdown of CI subunits, then related data in the study could be re-interpreted and conclusions modified.

      6) In Fig. 3H a key control is missing. What about survival of the cells when the import of the only energy substrate is blocked?

      7) The authors could consider placing their findings in the context of the broader literature. (As just one example, Ivashkiv Nat Imm 2015 described a role for mTORC1 and metabolism in IFNg-mediated transcriptional and translational regulation in macrophages.) This would increase the impact of their findings.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 7 2021, follows.

      Summary

      In this study, Olive and colleagues used a genetic screen to identify new regulators underpinning the ability of the cytokine IFNg to upregulate MHC class II molecules, of relevance to our understanding of how macrophages are activated by IFNg to confer host defense during microbial infection. They identified the signaling protein GSK3b, and MED16, a subunit of the Mediator complex previously implicated in gene induction.

      Essential Revisions

      1) Experimental treatment with IFNg may not be physiological. In key experiments, authors should try co-culture with activated NK cells +/- IFNg neutralization. A dose and time response curve of IFNg treatment may be valuable in key experiments.

      2) Comparison to cells not stimulated with IFNg is needed in key experiments. Comparison to WT cells is needed in Fig 5A,B.

      3) Stimulation with Type I IFN and other PAMPs in key experiments, as comparison to the effects of IFNg and to broaden the relevance of their findings.

      4) More insight into how IFNg signaling interfaces with GSK3 and MED16 is needed (e.g. role of mTORC1 pathway in regulating GSK3).

      5) Can the authors extend their data to an in vivo setting?

      6) Can the authors clarify the relative roles of GSK3a and GSK3b? For example, how do the authors explain the lack of a robust phenotype in Fig 3B-F?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 17 2020, follows.

      Summary

      In the paper, the authors used metabolomics to identify Valine and TDCA as metabolites depleted in diet-induced obesity (DIO) and replenished after sleeve gastrectomies (SGx) in mice. Intraperitoneal injection of these two metabolites mimics many of the benefits of SGx, including weight loss, reduced adipose stores and insulin sensitivity. These benefits are related to Val/TDCA's ability to reduce food intake without altering locomotor activity, leading to a negative energy balance. Val/TDCA injection eliminated the fasting-associated rise in hypothalamic MCH expression in obese mice, and central injections of recombinant MCH blunted weight loss induced by Val/TDCA. Overall, this paper reports interesting and surprising observations related to the impact of metabolomic disturbances in obesity, and suggests a role for Val and/or TDCA in regulating food intake through MCH.

      Essential Revisions

      1) It is unclear from the data whether the effects are derived from valine, TDCA, or both. Both reviewers felt that any reader would want to see experiments where either of these metabolites is injected alone.

      2) No quantitative metabolite concentration values are provided anywhere, making it difficult to evaluate the robustness of the data. How much do the levels of TDCA and valine change with SGx in mice and humans, and what levels are achieved with the injections of these metabolites in the mice?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 10 2021, follows.

      Summary

      The reviewers agree that this is an interesting and useful contribution for understanding LQ extinctions, and that it is generally well-presented. It shows that the factors that increase extinction risk are de-coupled from the factors that eventually lead to extinction and thus determine its timing. However, the reviewers also note that although the modelling approach is novel, it is reliant on datasets that are biased and at times these biases are not well-accounted for. Because many of the conclusions drawn from the modelling could already be drawn from existing records and from literature that is glossed over here, attention to that literature should be improved and the contributions beyond the megafauna debate should be emphasized. Furthermore, the authors should take care to improve clarity in the framing of the models, the presentation and interpretation of results, figures, and discussion.

      Essential Revisions

      1) ADDITIONAL ANALYSES (no additional data collection). The reviewers had specific concerns about the effects of sampling on the extinction chronology and the influence of body mass on a number of things (recovery potential, life history/demographic correlates, etc). Specifically, the analytical issues that present the biggest problems revolve around sampling uncertainty and body mass correlation. The former could be addressed by introducing some sensitivity tests. These could be directed towards chronological biases (how does removing one date affect the confidence intervals?), as well as geographical sampling biases (how does removing a region affect the trends?). The latter in particular would be important in the claims of a continental trend. It is also possible that biases are a function of taxon sampling. There are an increasing number of small mammal Pleistocene extinctions being recognized in Australia, and it is unclear if these follow the same trends as the megafauna. If so, that would indeed remove the body size issues.

      2) BETTER FRAMING OF THE FIVE PUTATIVE DRIVERS OF EXTINCTIONS:

      (i) appears to assume that only human hunting will differentially affect demographically sensitive species. However, novel or extreme climate change can also affect such species (e.g. Selwood, K.E., McGeoch, M.A. and Mac Nally, R., 2015. The effects of climate change and land‐use change on demographic rates and population viability. Biological Reviews, 90(3), pp.837-853.)

      (ii) this mechanism is predicated on using a modelling result [ref. 25] as data. It also makes the bold claim that species inhabiting certain habitats are less accessible to human hunters without any consideration of the archaeological or modern record on this point (e.g. Roberts, P., Hunt, C., Arroyo-Kalin, M., Evans, D. and Boivin, N., 2017. The deep human prehistory of global tropical forests and its relevance for modern conservation. Nature Plants, 3(8), pp.1-9; Fa, J.E. and Brown, D., 2009. Impacts of hunting on mammals in African tropical moist forests: a review and synthesis. Mammal Review, 39(4), pp.231-264).

      (iv) many of the supporting references here do not seem like logical choices for this argument. E.g. [28] refers to coral-reef fishes. Moreover, this hypothesis conflicts with much modern data showing that extinction risk and body size are correlated under climate and environmental change (e.g. Cardillo, M., Mace, G.M., Jones, K.E., Bielby, J., Bininda-Emonds, O.R., Sechrest, W., Orme, C.D.L. and Purvis, A., 2005. Multiple causes of high extinction risk in large mammal species. Science, 309(5738), pp.1239-1241. Liow, L.H., Fortelius, M., Bingham, E., Lintulaakso, K., Mannila, H., Flynn, L. and Stenseth, N.C., 2008. Higher origination and extinction rates in larger mammals. Proceedings of the National Academy of Sciences, 105(16), pp.6097-6102. Tomiya, S., 2013. Body size and extinction risk in terrestrial mammals above the species level. The American Naturalist, 182(6), pp.E196-E214.)

      3) MORE NUANCED INTERPRETATION OF MODEL OUTPUT.

      The major weakness in this manuscript is in the discussion. The authors should be very clear in their discussion that their model does not indicate that demographic factors had no part in extinction events per se, but rather that they don't explain extinction chronology. Extinction chronologies reflect a number of different factors and processes, but they don't take away from the fact that certain life history traits can make a species more likely to go extinct from those factors.

      The authors seem to argue that demographics don't explain the megafaunal extinction in the Sahul, but in fact, their results suggest that they do; the only thing demographics by themselves don't explain is the chronology. Extinction risk as determined by demographic susceptibility is highly related to body mass and generation time (which in turn is also affected by body mass) but differential survival (timing of extinction) is determined by factors such as geographic range size, dispersal ability, access to refugia, and behavioral and morphological adaptations against hunting, and the ability to survive catastrophic events. A reiteration of this point would be beneficial to the clarity of this otherwise well written manuscript.

      The authors clearly (and elegantly) show that extinct species, which were all large, and had long generation times, had demographic traits that made them more susceptible to extinction. This is evident in figures 3 and 4. However, in the discussion, in lines 301-303, they state that no demographic trends explain the extinction. This is not supported by the results. While the timing of when species go extinct doesn't correlate with demographic susceptibility, the peculiar nature of the extinction (a large, size-biased extinction) is explained by demographic factors, and is a phenomenon that has been explored in a global analysis by Lyons et al. 2016 Biol. Lett. Therefore, demographic trends DO explain why certain species go extinct, while others survive. The authors should be careful when they say that "no obvious demographic trends can explain the great Sahul mass extinction event"; instead, they should re-iterate that no obvious demographic trend explains the extinction chronology.

      4) MORE CAREFUL DISCUSSION OF RESULTS RELATIVE TO LITERATURE. The authors further go on to suggest that their results suggest that the extinctions were random, but the size-selectivity clearly shows that the extinctions were in fact not random with respect to body size. Their analyses do show that the rate of extinction doesn't exceed background to the same degree that it's been suggested in prior studies, and this is something that researchers need to explore further. Also, the authors raise an important point in lines 309-311 that human hunting could have interacted with demographic susceptibility, something that Lyons et al. 2016 Biol. Lett. show, and the results of the present study should be discussed in light of the 2016 paper.

      They also raise an important point in lines 312-320 that behavioral or morphological adaptations may have allowed some seemingly "high risk" species to persist despite anthropogenic pressure. These model "mis-matches" have been reported by Alroy 2001 Science as well in a multispecies overkill simulation. It would be beneficial to discuss the present results within the context of other examples of model mismatches, such as those from Alroy 2001.

      In lines 353-358, the authors once again state that their results show no clear relationship between body-mass and demographic disadvantage, despite clearly showing these relationships in Figures 3 and 4, and even stating as much in the beginning of the discussion. The plots clearly show that large bodied taxa were at a demographic disadvantage. There is a difference between explaining why certain taxa go extinct vs. why they go extinct at a certain point in time, and this should be made clear. The authors are correct in stating that demographic factors don't explain the relative extinction chronology, i.e. when species go extinct relative to each other, but they do explain why large species go extinct, and why these extinctions take place after human arrival. Moreover, generation length, which is also correlated with demographic susceptibility, is highly correlated with body mass (Brook and Bowman 2005 Pop. Ecol), once again showing that body mass-related effects do help explain the extinctions.

      The authors rightfully point out earlier in the discussion that spatial variation, local climates, ecological interactions, etc. all influence how and why a particular population disappears. Extinction chronologies reflect a number of different factors and processes, but they don't take away from the fact that certain life history traits can make a species more likely to go extinct from those factors. Large proboscideans like mammoths had a high risk of extinction based on life history traits, but managed to survive on island refugia into the mid-Holocene. Similar other examples exist, and show that extinction chronologies can vary vastly.

      Therefore, the lack of correlation can be explained by these factors, and the authors need to expand on these in their discussion, perhaps if possible, by giving specific examples. They should be more careful in their discussion by clearly distinguishing drivers of extinction risk, and how these drivers can be de-coupled from timing, but at the same time providing a good explanation for the biological factors leading to the extinction. Here again the authors should consider the work of Brook and Bowman and Lyons et al.
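      The chronological sensitivity test suggested in point 1 (how does removing one date affect the estimate?) can be sketched as a leave-one-out loop. This is a minimal sketch under stated assumptions: the ages are hypothetical, and `youngest_date` is only a stand-in estimator. A real analysis would pass in the authors' own procedure for estimating extinction timing and its confidence interval.

```python
def leave_one_out(dates, estimator):
    """Recompute `estimator` with each date removed in turn."""
    return [estimator(dates[:i] + dates[i + 1:]) for i in range(len(dates))]


def youngest_date(dates):
    """Stand-in estimator: the most recent date (smallest age in years BP)."""
    return min(dates)


# Hypothetical dated records for one taxon, in years before present.
ages = [46000, 44500, 43000, 51000, 48000]

full_estimate = youngest_date(ages)
loo_estimates = leave_one_out(ages, youngest_date)

# How far the estimate moves when each single date is dropped.
shifts = [estimate - full_estimate for estimate in loo_estimates]
print(shifts)
```

      A large shift produced by a single record indicates that the inferred chronology hinges on that date. The same loop can be run over regions instead of individual dates to probe geographical sampling bias.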

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 7 2021, follows.

      Summary

      This manuscript describes a detailed investigation of the sigma-1 receptor, with an emphasis on the effects of membrane cholesterol content. The authors report that sigma-1 receptor clusters in cholesterol-rich microdomains in the endoplasmic reticulum (ER), contributing to its previously-described localization at mitochondria-associated ER membranes. A series of reconstitution experiments show cholesterol-dependent clustering of the sigma-1 receptor, an effect which is modulated by membrane thickness and drug-like ligands of the receptor. These findings are supplemented by an investigation of the effects of sigma-1 receptor on IRE1a signaling, leading to the finding that sigma-1 knockout attenuates IRE1a function.

      Essential Revisions

      The reviewers agreed that the manuscript was likely to be of broad interest and addresses important biological questions surrounding the poorly understood sigma-1 receptor. However, concerns were raised regarding a number of points that need to be addressed in order for the manuscript to be suitable for publication. Specifically:

      Most of the imaging experiments throughout the manuscript are interpreted only qualitatively, and many of these show relatively minor differences. See "MINOR POINTS" below for a list of specific examples. Objective quantitative analysis should be provided wherever possible. Any subjective assessments should be conducted using blinding to avoid introduction of bias.

      The connection between the biological effects on IRE1a activation and cholesterol-dependent clustering is relatively indirect. The reviewers agree that additional experimental data should be provided to further assess the validity of the authors' proposed model. For example, inclusion of rescue experiments in sigma-1 knockout cells using the cholesterol-binding mutants would help to strengthen the connection between IRE1a function and membrane cholesterol content. Similarly, disruption of cholesterol-rich domains by addition of beta-cyclodextrin could provide additional evidence to support the model. In addition, testing the effects of ligands in the cellular imaging experiments would strengthen the link between in vitro biophysical experiments and cellular physiology.

      A related issue is that cholesterol binding is not tested explicitly for certain sigma-1 receptor mutants, potentially confounding interpretation of experimental data. These include experiments where alterations were made to the S1R sequence, with results interpreted in light of S1R no longer being able to bind cholesterol. Two specific places where this issue arises are:

      1) Studies described on pages 6-7 and shown in Figure 3B where wild-type sigma-1 receptor is compared to S1R-Y201S/Y206S, S1R-Y173S, S1R-4G, and S1R-W9L/W11L. These mutations had differential effects on receptor distribution that were attributed to alterations in cholesterol binding without confirming the changes in cholesterol binding. This is particularly relevant for the explanation given for why S1R-W9L/W11L fails to cluster in both cells and the cholesterol supplemented GUV system, while the S1R-4G mutant exhibited cholesterol-induced clustering in the GUV system but not in cells (page 7, lines 27-31).

      2) Another example is the membrane thickness experiment described at the top of page 8 and shown in Figure 4A. Shortening the S1R by deletion of 4 aa in the TM region produced a sigma-1 receptor that exhibited a more diffuse distribution when expressed in HEK293 cells. The authors appear to be attributing this only to the decreased length of the sigma-1 receptor transmembrane domain. However, it seems feasible (based on their other data) that if this construct fails to bind cholesterol, the same result would be observed. Confirming that the truncated sigma-1 receptor does in fact bind cholesterol would strengthen the argument being made here.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 5 2021, follows.

      Summary

      Your work analyzes the impact of the INPP5E inositol lipid-5 phosphatase on immune synapse formation and function. INPP5E is a cilium enriched protein. Although T cells do not display primary cilia, previous work by several laboratories showed that several ciliary proteins are involved in immunological synapse formation and in T cell activation and your work intends to further this view. Although the work has potential for publication in eLife, it requires essential additional data to support the central claims of the paper. Each reviewer raised substantive concerns (see below) that need to be resolved experimentally. For instance, experiments involving knockout in primary T cells will need to be performed. A better time series will also help decide in which process the INPP5E protein is involved. Moreover, imaging data should be quantified more precisely to assess spatial and dynamic differences.

      Reviewer #1:

      An important aspect of mature synapse formation is signal termination and ... effector responses, such as secretion of cytokines, exosomes and CD40L on synaptic ectosomes (Huse et al, 2006; Mittelbrunn et al, 2011). The demonstration of ESCRT function in both TCR signal termination and CD40L release to B cells on synaptic ectosomes likely involves inositol lipids that lack phosphorylation on the 5' position.

      It might make sense for the author to investigate a synapse effector function like degranulation of CD8 or CD40L transfer of synaptic ectosome in CD4 T cells as these effector functions actually link into synapse formation more directly than bulk IL-2 secretion.

      The ESCRT machinery is also highly entwined with ciliary biology and several ESCRT components important for signal termination and effector function will also require PIP metabolism.

      Reviewer #2:

      Interesting similarities between the primary cilium and the immunological synapse have been noted and investigated extensively over the last few years. In this context and beyond, the role of phosphatidylinositol lipids in the organisation of the immunological synapse and T cell function has been extensively investigated. Here Chiu et al. add to these topics by investigating INPP5E, a primary cilium-associated 5' phosphatidylinositol lipid phosphatase that can use PIP3, PI(4,5)P2 and PI(3,5)P2 as substrates, in T cell activation. The authors show that INPP5E is recruited to the interface of a T cell with an activating antigen presenting cell. INPP5E binds to TCRzeta, ZAP-70 and Lck. INPP5E knockdown reduces TCR recruitment to the T cell/APC interface, clearance of PI(4,5)P2 from the centre of the interface, and TCR and ZAP-70 phosphorylation. These findings are consistent with the large body of existing work on the role of phosphatidylinositol lipids in the organisation of the immunological synapse and T cell function and, therefore, don't constitute a conceptual advance. Nor do they provide new mechanistic insight into phosphatidylinositol lipids in T cell activation. The data add another molecule to the existing body of work.

      In the first two figures Chiu et al. show that a number of cilium-associated proteins, including INPP5E, are recruited to the interface of a Jurkat cell with a Raji B cell presenting superantigen. Such recruitment is not surprising. On the contrary, because of the reorientation of the MTOC to the centre of the cellular interface and the accompanying shift of the nucleus to the back of the T cell to create more cytoplasmic space at the interface, most proteins associated with vesicular trafficking shift their subcellular distribution towards the interface. Only data showing spatial or temporal distinctions in such recruitment within the small cytoplasmic space underlying the T cell/APC interface could provide interesting new insight. Reduced detection of INPP5E interface recruitment after INPP5E knockdown could be trivially caused by the worse signal to staining background noise ratio (Fig. 2A-E). The STORM data showing that INPP5E interface recruitment occurs in the T cell not the APC are welcome. However, spatial and temporal features provided by the higher resolution of these experiments are not explored.

      In the investigation of the contribution of different INPP5E domains to its interface recruitment the representative imaging data in Fig. 3A suggest that substantial quantitative differences exist. The '% conjugate with recruitment' metric doesn't capture such differences. Some form of a recruitment index as used in other parts of the manuscript would be more powerful. A more complex picture of INPP5E domain contributions to INPP5E interface recruitment is likely to emerge.
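      To illustrate what a quantitative recruitment index could look like, here is a minimal sketch. The definition used (mean background-subtracted fluorescence at the interface divided by the mean over the whole cell) is an assumption chosen for illustration and may differ from the index used elsewhere in the manuscript. The function name and pixel values are hypothetical.

```python
def recruitment_index(interface_pixels, cell_pixels, background=0.0):
    """Ratio of mean background-subtracted intensity at the interface to the whole cell.

    Values above 1 indicate enrichment at the interface; values near 1
    indicate no preferential recruitment.
    """
    mean_interface = sum(interface_pixels) / len(interface_pixels) - background
    mean_cell = sum(cell_pixels) / len(cell_pixels) - background
    if mean_cell <= 0:
        raise ValueError("whole-cell signal must exceed background")
    return mean_interface / mean_cell


# Hypothetical pixel intensities from one conjugate.
interface = [420, 450, 390, 440]
whole_cell = [420, 450, 390, 440, 210, 190, 230, 180, 200, 220]

print(recruitment_index(interface, whole_cell, background=100))
```

      Reporting the distribution of such an index per construct, rather than a binary '% conjugate with recruitment', would capture graded differences between INPP5E domains.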

      The immunological synapse is a highly dynamic structure. TCR interface recruitment and PI(4,5)P2 clearance in response to various manipulations of PI turnover are only analysed at a single time point. A dynamic picture should provide more insight. For example, interface recruitment of the TCR may be consistently impaired, delayed or shifted in time. Reduced interface recruitment of the TCR upon overexpression of PIP5Kgamma (Fig. 5D, E) has already been described in the cited Sun et al. reference. This should be acknowledged.

      In Fig. 6E, the authors show a small reduction in IL-2 secretion in Jurkat cells stimulated with anti-CD3/CD28 upon knockdown of INPP5E. As INPP5E is expected to exert its functional effects through the control of the spatiotemporal organisation of the immunological synapse, activation of Jurkat cells with APCs would be more appropriate.

      The knockdown efficiency of INPP5E should be quantified.
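
      The recruitment index suggested above for the Fig. 3A data could take the form of a simple enrichment ratio. The following is a minimal sketch, not the authors' method: the function name `recruitment_index`, the mask inputs and the optional background value are all illustrative assumptions, and real data would need properly segmented interface and cell regions.

```python
import numpy as np

def recruitment_index(image, interface_mask, cell_mask, background=0.0):
    """Mean intensity at the interface divided by mean intensity in the rest of the cell.

    All names here are hypothetical. A value near 1 means no enrichment;
    values above 1 mean the protein is enriched at the interface.
    """
    image = np.asarray(image, dtype=float) - background
    cell = np.asarray(cell_mask, dtype=bool)
    # Only count interface pixels that also belong to the cell of interest.
    interface = np.asarray(interface_mask, dtype=bool) & cell
    rest = cell & ~interface
    if not interface.any() or not rest.any():
        raise ValueError("interface and rest-of-cell regions must both be non-empty")
    return float(image[interface].mean() / image[rest].mean())

# Toy 4x4 "cell": the top row is the interface and is three times brighter.
toy_image = np.ones((4, 4))
toy_image[0, :] = 3.0
toy_cell = np.ones((4, 4), dtype=bool)
toy_interface = np.zeros((4, 4), dtype=bool)
toy_interface[0, :] = True
toy_index = recruitment_index(toy_image, toy_interface, toy_cell)
```

      Unlike a '% conjugates with recruitment' count, such a continuous index would preserve quantitative differences between constructs.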

      Reviewer #3:

      The work is fully performed in Jurkat cells, which are a very good and widely used model to investigate T cell activation, yet not a perfect one. In fact, for events related to phosphoinositide function, Jurkat cells carry a strong caveat: they lack the phosphoinositide phosphatase PTEN and therefore have altered phosphoinositide turnover.

      Therefore, as a first critical point, the authors should confirm most of the central data of this work in primary T cells. They should also discuss this point, since it might bias some of their data.

      Additional points needing attention are detailed below.

      1) Regarding the data in Fig 1D, the authors say that they find INPP5E localized with the centriole in the absence of SEB stimulation. The pattern shown in the picture is very diffuse and blurry, not showing a centriole pattern at all.

      It seems to be more visible in Fig S1. The authors should replace Fig1D panel by a better "quality" picture if they wish to convey that message.

      2) What do the authors mean by "number of events" in the figures? Please explain or replace by another term or means of quantification. If it means counting conjugates with INPP5E recruited "by visual observation", it would be much more appropriate to quantify fluorescence enrichment at the synapse as a ratio.

      It is also bizarre to plot "pairs" which are all at 100%. What does that mean?

      3) In Fig 2 D, E the authors observe by TIRF the presence of INPP5E at the planar pseudosynapse. They image TCRz in parallel. It would be interesting to take better advantage of that type of microscopy image to also quantify the impact of INPP5E on TCRz recruitment and to assess co-localization between INPP5E and TCRz using Pearson correlation on images with a very good resolution. From that image they look like they do not co-localize at all.

      4) The reasoning of the authors in Fig 2 H is somewhat strange: "Since the distribution of INPP5E signals mostly appear at the T cell-APC contact site, it was necessary to examine whether INPP5E belonged to T or B cells". Although they use dSTORM, the resolution of the image is not single molecule as they claim, but relatively large clusters. Moreover, they say that INPP5E is inside the T cell while TCRz is at the plasma membrane. In that image there are spots labelled far out on the B cell. Moreover, it has been shown by several authors that TCRz largely occupies intracellular vesicular compartments. So the conclusion is not accurate. Finally, they claim that the overlap in some regions is suggestive of possible interactions. The overlap is really minimal and in zones of clustering. So the comment is far from accurate. A proper colocalization analysis in TIRF-dSTORM images of INPP5E and TCRz quantified by Pearson correlation would be much more appropriate and accurate.

      By the way, the authors could use panel F of T cells transfected with Flag-INPP5E that relocalizes to the synapse to say that INPP5E in T cells relocalizes to the synapse.

      5) Fig 4A: The strongest interactor with INPP5E seems to be Lck, rather than TCRz. It would be interesting to also assess the effect of INPP5E silencing on Lck recruitment at the synapse.

      Is there a mistake in labeling IP in horizontal and IP in vertical? I guess one of them should be IB (immunoblotted / Western blot). Please clarify and correct if necessary. Same in B: IP-Flag is labelled everywhere; is one of them input? Please clarify/correct if mistaken.

      The statement in the text (lines 168-169) that INPP5E "interacted" with TCRz, ZAP and Lck is not fully correct, since these molecules form complexes during TCR activation. The term "co-immunoprecipitated" would be more accurate here.

      Fig 4D: It is not clear why the authors use cells transfected with TCRz-GFP and conclude that INPP5E is required for exogenous CD3z clustering, when they could just stain for endogenous TCR.

      Fig 6B: If the authors normalized the pProtein band density with respect to the total same protein, the Y axis should be expressed as band density ratio rather than "optical intensity (a.u.)"
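
      The Pearson-based colocalization analysis requested in points 3 and 4 above can be sketched as follows. This is an illustrative sketch only: the function name `pearson_colocalization` and the optional mask argument are assumptions, and the choice of region of interest and background handling would need to be defined for the actual TIRF or dSTORM images.

```python
import numpy as np

def pearson_colocalization(channel_a, channel_b, mask=None):
    """Pearson correlation coefficient between two image channels.

    Hypothetical helper: `mask` (optional) restricts the analysis to a
    region of interest such as the synapse footprint. Returns a value in
    [-1, 1]; values near 0 indicate no linear co-variation of intensities.
    """
    a = np.asarray(channel_a, dtype=float)
    b = np.asarray(channel_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("channels must have the same shape")
    if mask is None:
        mask = np.ones(a.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    a = a[mask] - a[mask].mean()
    b = b[mask] - b[mask].mean()
    denominator = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denominator == 0:
        raise ValueError("a channel is constant within the mask")
    return float((a * b).sum() / denominator)

# Toy check: a channel correlates perfectly with a scaled copy of itself
# and anti-correlates perfectly with its negative.
toy = np.array([[1.0, 2.0], [3.0, 4.0]])
r_same = pearson_colocalization(toy, 2.0 * toy + 1.0)
r_opposite = pearson_colocalization(toy, -toy)
```

      Reporting this coefficient per cell, over many cells, would replace the visual impression of overlap with a number that can be compared between conditions.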

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on January 4 2021, follows.

      Summary

      In this manuscript, the authors have generated a new mouse model for the severe disease Ataxia Telangiectasia (A-T). They introduce null mutations in Atm onto the background of mice that are somewhat sensitized since they also harbor mutations in the Aptx gene. The outcome is that the mice show a set of phenotypes that are strikingly similar to symptoms seen in human patients. These include cerebellar degeneration, cancer, and immune system abnormalities. They also deliver small molecule readthrough (SMRT) compounds into tissue explants and show that such a manipulation can restore the production of ATM protein. The success in producing an Atm model with cerebellar degeneration is a compelling advance as this particular phenotype has been incredibly difficult to reproduce in animal models. The authors perform an interesting set of analyses to confirm that the other important features of the disease are also present in their mice. This paper has broad interest to multiple fields including neuroscience, cancer, and immunology.

      Essential Revisions

      1) It is not clear how progressive the cerebellar degeneration is. What is the spatiotemporal pattern of degeneration? Please consider the lobule by lobule effects over time.

      2) For the electrophysiology, what stage cells have you recorded from? That is, what was the structure of the Purkinje cells that you recorded? If the cells look really "normal" but fire abnormally, then please comment on how they are being affected. If the morphology is abnormal, then please explain what defects you see and how they might impact function. Essentially, the authors need to disentangle cell autonomous effects and non-cell autonomous effects with more clarity. That is, are you studying the "dying" cells or the cells that escaped the genetic defect?

      3) Are both the Atm and the Aptx genes expressed in all (or the same) Purkinje cells? What is the experimental evidence?

      4) Please provide more context and rationale for Aptx in the abstract. As it stands, its mention comes out of nowhere.

      5) In the Introduction, please provide more information as to why previous studies/models might have failed to produce severe Atm-related cerebellar phenotypes.

      6) In the Introduction, the rationale for the choice of pairing the Atm mutations with defects in the Aptx gene is unclear. Are they in the same pathway? Are the genes located in close proximity to one another? There are many issues that need to be discussed.

      Related to above, ATM and APTX, while both involved in DDR, act in parallel pathways: ATM in DNA double stranded break repair, and APTX in single stranded break repair. Homozygous mutations in APTX cause human ataxia (AOA1), but there is nothing to indicate an intersection mechanistically between AT and AOA1. One could just as well call the AT-APTX double mutation a model of AOA1. As indicated above, please expand on the rationale of the experimental design.

      Also, are there more single stranded DNA breaks? Double stranded DNA breaks? Is there a sequestration of SS DNA break repair components including PARP1? How are the changes in PC firing related to DDR? It would be worthwhile for the authors to examine the following papers (Hoch et al. Nature. 2017 Jan 5;541(7635):87-91; Stoyas et al. Neuron 2020 Feb 19;105(4):630-644) to give insight into studies that can explore mechanism for DDR and changes in cerebellar morphology/function.

      Therefore, the authors need to address whether single vs double stranded break repair is present and the authors could do a better job of linking the change in PC firing to DNA damage.

      7) Figure 2B: Apologies if I am missing something, but I do not understand the reason or explanation for what determines the probability of survival for the green, gold, and orange traces (the three severe cases in the graph). That is, why is the gold so strong?

      8) How come rotarod was not used as a test? This is a standard motor behavior test that is useful for comparing across animal models and studies.

      9) Related to above, why not use in vivo recordings? I can understand using slice recordings to tackle the biophysical and intrinsic mechanisms, although the authors did not do that. It seems to me that extracellular recordings would have been more informative in the in vivo, awake context.

      10) The authors picked specific regions of the cerebellum to target their slice recordings, which is perfectly reasonable. But why did you pick these regions? Please provide a full justification and discussion for the importance of these particular lobules in relation to what you are trying to solve.

      11) Given the use of slice recordings and that Purkinje cell degeneration is a key aspect of the phenotype, it would be very compelling if the authors showed some filled cells. As it stands, it is very hard to appreciate what the severity of neuropathology actually looks like, especially in relation to what the functional defects are teaching us.

      12) The authors state that "The largest differences were detected in the anterior [38.6±3.4 Hz (n=187) vs. 88.1±1.8 Hz (n=222)] and posterior [46.9±1.9 Hz (n=175) vs. 84.1±2.4 Hz (n=219)] medial cerebellum [1-way ANOVA, p<0.0001; Fig. 4B]." Okay, but what does this mean? What is your interpretation for why these regions were more heavily impacted (cell sensitivity based on circuit architecture, gene expression and protein make-up, neuronal lineage?) and how might it impact the phenotype?

      13) The authors state and reference "Previous studies in mouse models of heritable ataxia indicate that physiological disruption in PN firing not only includes changes in frequency but also affects its regularity (Cook, Fields, and Watt 2020)." I agree with having this reference, but what about other models of ataxia? There are a number of other excellent models that should be discussed.

      14) Purkinje cell firing data (figure 4B) should not be averaged across all of the ages, as this is not standard practice, and would be akin to averaging all behavior across ages. I think the data in fig. 4C suffices. If you want to compare across lobules on one graph, simply choose a particular age (perhaps when behavioral changes are first observed?) or at the oldest age.

      15) Why examine Purkinje cell firing deficits in different lobules but not make that distinction for Purkinje cell loss? The Purkinje cell loss analysis focussed on the areas with most pronounced firing deficits but this means that we don't know whether the cells that fire abnormally are the only ones that die. Also see point #2 above.

      16) Figure 4E and related text: Please provide a much more extensive set of images to show the cerebellar pathology. 1) Please show views of the different lobules to demonstrate the pattern of degeneration. 2) Please show different ages to show the progression of degeneration. 3) Please show higher power images of the Purkinje cells to clearly demonstrate their morphology.

      17) The authors need to provide more data for what is actually happening in relation to cell death. Why not perform TUNEL or caspase staining etc.? The authors must show that there are actually acellular gaps where cells have died, or some other indication that cell death has occurred or is occurring.

      18) Also in relation to the Purkinje cell degeneration, what do the dendrites look like? What about the axons? Do you see any torpedoes or axonal regression?

      19) In regards to the cerebellar degeneration, what happens to the other cell types in the cerebellar cortex? Are they intact? What about the cerebellar nuclei?

      20) The authors state "Of interest, APTX deficiency by itself had the greatest effect on the loss of DN4 cells...". Okay, but it is hard to see what this means for A-T as a disease. Interesting as it is, what is the relevance of this gene and these findings to the actual disease?

      21) Please provide a more extensive description and rationale for why this explant system was chosen.

  3. Dec 2020
    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 14 2020, follows.

      Summary

      This study addresses an important topic of broad ecological interest and provides important insights into the role of local-scale processes in shaping patterns of species diversity, aiming to (i) assess if there is a global latitudinal diversity gradient (using alpha diversity) of rocky shore organisms and its functional groups and, (ii) whether there are any large scale or local environmental predictors of richness patterns. The strength of this paper is the global coverage of studies analyzed, showing for the first time that rocky shore richness does not appear to peak in the tropics - in contrast to many other studies of marine and terrestrial systems. These outcomes are not specific for rocky intertidal systems, with an increasing number of studies showing that the search for global ecological patterns may be elusive. While sampling in the tropics and the polar regions is poor (acknowledged by the authors), this should be viewed as a call for further research in these regions - not as a weakness of the paper per se. There are also some reservations on how the analysis has been conducted, including the lack of standardization of sampling effort and other details (e.g., size of sampling units) to derive a comparable measure of diversity across sites.

      Public Review

      The latitudinal gradient of diversity has been studied and confirmed in many aquatic and terrestrial habitats and species across the globe. In the vast majority of cases, richness increases towards the tropics. Using an impressive global dataset of latitudinal diversity gradients in 433 rocky intertidal assemblages of algae and invertebrates from the Arctic to the Antarctic, Thyrring and Peck show that rocky shore ecosystems may not follow this general pattern. The authors show that there is no clear latitudinal gradient for rocky shore organisms using alpha diversity - as posited by prevailing theories - although some functional groups exhibit contrasting patterns. Diversity within functional groups of predators, grazers and filter-feeders decreased towards the poles, whereas the opposite was observed for macroalgae. Correlation with environmental drivers highlighted the importance of local-scale processes in driving spatial patterns of diversity in rocky intertidal assemblages. The paper is well written and many of the analyses are well done, but there is the concern, which the authors acknowledge, that sampling within tropical latitudes is sparse and needs to be carefully considered when interpreting the results of this paper.

      The work can be improved in the following manner:

      1) The relevant data to standardize species richness may not be available from the primary literature. However, it should be possible to employ relevant standardization methods within the 5° latitudinal bands in which the data have been aggregated. An analysis based on standardized data, at least for the more data-rich latitudinal bands, must be added.

      2) Employ models that allow assessing unimodality, which is stated but untested. At the bare minimum, a quadratic relationship with latitude should be included in the GLMM. As implemented here, the GLMM employed to relate diversity to latitude can only detect linear trends, but not unimodal patterns and the mid-latitude peak suggested by LOESS for the northern hemisphere. To provide a formal test for unimodality, models with or without a quadratic term could be contrasted using standard model comparison procedures. Alternatively, GAM could be used to evaluate nonlinear effects.

      3) Clarify whether p-values are relevant or not. As is, it is confusing. For example, the legend of Table 1 mentions p-values, but these are not reported. Materials and Methods indicate that 95% confidence intervals are used to take decisions on null hypotheses, suggesting that p-values are not used in the analysis (lines 436-439). Nevertheless, p-values are reported in Table 2.

      4) Provide a rationale for distinguishing between canopy and other algal forms (the distinction is compelling, but it is not explained).

      5) We like the conclusion on the importance of local-scale processes. This should be placed in the context of previous studies that have quantified patterns and processes at multiple scales reaching the same conclusion.
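
      The model comparison requested in point 2 above can be illustrated with a deliberately simplified sketch. It is an assumption-laden stand-in, not the GLMM itself: it uses ordinary least squares with a Gaussian AIC rather than a mixed model with a count distribution and random effects, the function names are hypothetical, and the data are synthetic. It only shows the logic of contrasting a model with and without a quadratic latitude term.

```python
import numpy as np

def gaussian_aic(observed, fitted, n_coefficients):
    """AIC for a least-squares fit with Gaussian errors (up to an additive constant)."""
    n = observed.size
    rss = float(np.sum((observed - fitted) ** 2))
    # +1 parameter for the residual variance.
    return n * np.log(rss / n) + 2 * (n_coefficients + 1)

def compare_linear_quadratic(latitude, richness):
    """Return the AIC of a linear and of a quadratic fit of richness on latitude."""
    x = np.asarray(latitude, dtype=float)
    y = np.asarray(richness, dtype=float)
    results = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coefficients = np.polyfit(x, y, degree)
        results[name] = gaussian_aic(y, np.polyval(coefficients, x), degree + 1)
    return results

# Synthetic example (not real data): richness peaks at mid range and
# declines towards both ends of the latitude axis.
rng = np.random.default_rng(0)
toy_latitude = np.linspace(0.0, 60.0, 61)
toy_richness = 20.0 + 1.2 * toy_latitude - 0.02 * toy_latitude ** 2 + rng.normal(0.0, 1.0, 61)
toy_aic = compare_linear_quadratic(toy_latitude, toy_richness)
```

      A lower AIC for the quadratic model would support a unimodal pattern; in the actual analysis the same contrast would be made between the two GLMMs, or a GAM could be fitted instead.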

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 4 2020, follows.

      Summary

      This study combines high resolution imaging experiments with mechanical modeling to elucidate the energetics of flagellar propulsion and understand the role of internal dissipation in this system. The experiments use mouse sperm cells that are chemically tethered to a glass slip. For each cell, the flagellum shape is imaged over time and segmented into a mathematical curve. This data is analyzed based on a planar Kirchhoff rod model that includes hydrodynamic drag forces (based on resistive force theory), bending elasticity, and an unknown active moment density. An energy balance is written that also includes internal viscous dissipation generated inside the flagellum, with an ad hoc internal friction coefficient. By calculating the various terms in the energy balance based on the reconstructed filament shapes, the authors are able to estimate the active power density along the flagellum. This calculation leads to two unexpected findings: (1) the authors find that the active power density can be negative along some portions of the flagellum, meaning that along these portions the dynein motors act against the local deformation of the structure, and (2) the main origin of dissipation in the system comes from internal dissipation, which exceeds viscous dissipation in the fluid in magnitude.

      Essential Revisions

      1) It is not completely clear from the manuscript what the configuration of the sperm is with respect to the glass slide where the head is tethered. What is the orientation of the cells with respect to the slide, and in which plane are the deformations measured? (from above or from the side?) We would expect that different configurations may lead to slightly different waveforms. In particular, we are surprised that the mean shapes shown in figure 2(a) have a net asymmetry which is observed in nearly all the cells: could this have to do with the relative configuration of the flagellum with respect to the surface?

      2) The experiments are done with flagella very near a no-slip surface, since the cells are chemically adhered to the chamber boundary. Yet, the authors use resistive force theory for filaments in free space, without any reference to the nearby no-slip surface. As the rate of energy dissipation near the surface will be considerably larger than estimated by RFT, it is possible that some (or much, or perhaps all) of the additional dissipation found by the authors is actually within the fluid and simply not accounted for by RFT. Thus, all of the calculations must be redone with the appropriate Blake tensor for stokeslets near a no-slip wall before the results can be considered definitive. The paper must also more carefully illustrate and quantify the proximity of the flagella to the surface in order to make these calculations precise. Absent this analysis, the claims of the paper do not stand up to scrutiny.

      A related point is the need to understand the effect of tethering the cell on its kinematics and energetics. In other words, do the conclusions still hold for freely swimming cells?

      3) Is there any evidence of 3D dynamics? Some recent experiments with human sperm have suggested that sperm beats can take place in 3D (Gadelha et al., Science Advances 2020). As the model in the paper is 2D, this could also affect the energy balance.

      4) The authors should examine the work of K.E. Machin ["The control and synchronization of flagellar movement", Proc. Roy. Soc. B 158, 88 (1963)], which provided the first theoretical formalism to study active moment generation within beating flagella based on examining the difference between known force contributions from viscous dissipation and elastic bending. It seems that this same kind of analysis could be done here to identify directly the non-viscous contribution, rather than having to postulate a particular form.

      Stated another way: Why not try to estimate the active power density directly from the active moment density, which could be calculated from the moment balance of equation (4) where all the other terms are known? This would provide a direct estimate of the active power. The force balance could then be used to estimate the internal friction, which would then no longer rely on an assumed value for the internal friction coefficient. In fact, this could be used to obtain an estimate for that coefficient.

      5) The paper addresses in detail the use of Chebyshev fitting methods for the filaments, but does not appear to address the physical boundary conditions one would expect on elastic objects (particularly at the free end), involving the vanishing of moments and forces. Unlike, for example, the biharmonic eigenfunctions of simple elastic filament dynamics which are tailored to those boundary conditions [see, e.g. Goldstein, Powers, Wiggins, PRL 80, 5232 (1998)], it is not clear how the Chebyshev functions satisfy those conditions. Some explanation is needed.

      6) If indeed internal dissipation dominates, that would suggest that essentially all prior theoretical approaches to calculating sperm waveforms must be quantitatively in error by very large factors. It would be very appropriate for the authors to examine some of those theoretical works to determine if this is the case.

      7) The authors note in the Discussion that the beating waveform changes dramatically in fluid with higher viscosity. Yet, if external dissipation plays such a small role how can this be rationalized?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 8 2020, follows.

      Summary

      The manuscript from Perez-Garcia et al. follows up on a prior study by the same authors in which they identified the tumor suppressor BAP1 as a regulator of mouse placentation and trophoblast stem cells (TSCs) (Perez-Garcia et al., Nature, 2018). In their preceding work the authors showed that CRISPR-mediated knockout of BAP1 in TSCs results in upregulation of key stem cell markers Cdx2 and Esrrb and biased differentiation towards trophoblast giant cells at the expense of the syncytiotrophoblast lineage. Here the authors have expanded on these observations by demonstrating that BAP1 modulates the epithelial-to-mesenchymal transition (EMT) in TSCs and that a similar phenotype can be obtained by genetic deletion of Asxl1/2. Declining protein levels of BAP1 during differentiation of human TSCs into extravillous trophoblast suggest that the role of BAP1 may be conserved in humans. As the molecular mechanisms of trophoblast development, including EMT and invasive behaviors of trophoblast giant cells in the mouse and extravillous trophoblast cells in human are only beginning to be understood, this study provides an important advance.

      This is a well-written and technically sound study that clarifies the role of BAP1 in trophoblast development. Overall, the work presented is very important to the fields of EMT and trophoblast stem cell biology, and it warrants publication in eLife in principle. However, the claims in the abstract, the model in Figure 5, and the conclusions in the discussion are not well-supported. Therefore, additional experimental work will be essential for the manuscript to become suitable for publication in eLife.

      Essential Revisions

      1) There was consensus that the current manuscript lacks functional data to demonstrate conservation of BAP1/ASXL1/2 function in human TSCs. These are crucial claims in the abstract that are not supported, and some elements of these claims are necessary for the manuscript to have impact beyond the previous Nature 2018 publication. Currently, the studies in human TSCs are purely observational (Fig. 6D-E). The authors should employ genetic approaches to interrogate whether the functions of BAP1 in TSC self-renewal and differentiation are truly conserved between mouse and human.

      2) The main takeaway from Figure 2 is that BAP1 is dispensable for mouse TSC maintenance and that BAP1 knockout results in increased expression of stem cell markers Cdx2 and Esrrb. Both of these findings were previously reported in the authors' 2018 paper (see Fig. 4b in the Nature paper). Therefore, the statement that "BAP1 deletion does not impair the stem cell gene regulatory network" is not surprising and the authors should state clearly that these experiments confirm their prior observations.

      3) The overexpression data in Figure 4 is difficult to interpret. Vector transduced TSCs show a tight, epithelial morphology (Figure 3A), whereas the NT-sgRNA control cells look like they are undergoing EMT (Figure 4C). Why does the introduction of the NT-sgRNA induce EMT characteristics? Bap1 sgRNA1 cells seem less epithelial than the Vector transduced cells. Do NT-sgRNA TSCs have less BAP1 than Vector transduced TSCs?

      4) Moreover, all the data in Figure 4 are based on a single sgRNA that could activate BAP1 expression. To exclude off target effects, the authors should confirm the effect of BAP1 overexpression using another sgRNA or cDNA overexpression system.

      5) The authors need to examine the gene expression data more closely as well as the functional consequences of BAP1 overexpression on TSC proliferation and differentiation. In particular it would be important to compare the list of DEG in BAP1 KO and overexpression condition. Are they mirror-image or are there differences? For example, Zeb2 expression is strongly upregulated in BAP1 mutant line but not significantly altered in cells overexpressing BAP1. This should be discussed.

      6) In the abstract, the authors state that BAP1 function during trophoblast development is dependent on its binding to Asxl1/2/3. However, the data presented in this manuscript do not address whether BAP1 and Asxl1/2/3 are indeed part of the same complex in TSCs. Furthermore, the fact that Asxl1/2 KO increases expression of syncytial genes (Fig. 5) does not provide direct evidence of functional synergy between these proteins and BAP1. This conclusion could be strengthened by demonstrating that Asxl1 and BAP1 indeed have a protein-protein interaction in TSCs and/or by deleting the BAP1 binding domain in Asxl1/2. It would also be instructive to examine whether the phenotype of BAP1 overexpression in TSCs (e.g. gain of epithelial features and reduced invasiveness) is dependent on Asxl1. This could be examined by overexpressing BAP1 in Asxl1-deficient TSCs.

      7) In some cases, experiments are carried out to "confirm" and "corroborate" hypotheses rather than test them. For example, the similarity between the gene expression signature of Bap1 mutant murine TSCs and that of Bap1 mutant melanocytes and mesothelial cells is shown and emphasized. One wonders how unique this similarity is. Is Bap1 expression modulation observed in other EMT processes during development or in cancer? This should be explored and discussed.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on December 1 2020, follows.

      Summary

      The study isolates bacteria from diverse Antarctic samples which utilise DMSP as the sole carbon source. It initially focuses on a Gammaproteobacterium, Psychrobacter sp. D2, which the authors establish lacks a known DMSP lyase enzyme despite having DMSP lyase activity (this needs to be quantified). Through RNA-seq and bioinformatics, they identify the gene cluster responsible for this activity and identify a novel DMSP lyase somewhat related to DddD in that it involves CoA, but critically also ATP, which distinguishes it from the pack of other known Ddd enzymes. This enzyme is an ATP-dependent DMSP CoA synthase required for growth on DMSP and its transcription is upregulated by DMSP availability. The novel mechanism of this enzyme is proposed from a strong structural component to the study. The authors propose the downstream pathway for DMSP catabolism, which we find to be oversold and requiring gene mutagenesis to confirm, and to be preliminary in comparison with the authors' other findings. Finally, the study attempts to show how widespread the enzyme is in sequenced bacteria, confidently showing it to be functional in other related Gammaproteobacteria and some Firmicutes.

      Essential Revisions

      1) Title: "a missing route", was it really missing? We would suggest a more precise title. Would be better to say "that releases DMS" or an alternative.

      2) This is a Ddd enzyme by definition and should be named as such.

      Line 27- We disagree with the use of a new gene prefix when there is a strong precedent for the use of Ddd for "DMSP-dependent DMS". If this enzyme is a DMSP lyase and is in bacteria then its naming should follow protocol and be called Ddd"X"-X-being a letter not currently utilised in known systems. Deviating from this convention causes confusion and is not appropriate. Furthermore, AcoD is already assigned in some bacteria to acetaldehyde dehydrogenase II.

      3) As presented, the bioinformatics-based evidence regarding the broad distribution of this enzyme (as claimed e.g. in the Abstract, line 33) does not stand up. Currently as presented in the manuscript, especially Fig 6, we are led to believe the enzyme is more widespread than can be demonstrated based on the authors' evidence (i.e., the authors allow a very low threshold of sequence identity and claim function outside of the groups they have tested). Either more work is needed to show that claims of such a wide distribution are merited, or the authors should limit their claims to what can be substantiated by their work. Specifically, the authors cannot comment on the "functional" enzyme being widespread outside of the Gammas and Firmicutes that were tested, let alone the importance of the role in DMSP cycling. Only three "AcoD" enzymes were ratified in this study, which are relatively closely related to each other (Psychrobacter sp. D2, Sporosarcina sp. P33 and Psychrobacter sp. P11G5, which are > 77% identical to each other). As can be seen in Fig 6, these three proteins cluster together and are far removed from all the other sequences on the figure, for which we have no evidence of their function (i.e., nothing can realistically be said on Deltas, Actinos or Alphas or the MAGs). Just to be clear, these other proteins shown in clades above and below the functional "AcoDs" in Fig 6 are only ~30% identical to ratified "AcoD". Furthermore, only strain D2 was shown to make DMS; none of the other strains were tested. Far more testing of the diverse enzymes and strains is needed to make these statements as this study only tests one strain and three of the closely related enzymes (defined on Fig 6). Additional specific comments on this issue:

      Line 280. The sentence on MAGS and the environments containing them does not stand up for reasons summarised above. All MAGS shown on Fig 6 are not similar enough to "AcoD" to be termed as functional Ddd enzymes. More work has to be done on the strains and enzymes that are more divergent to true "AcoDs" before such a statement is supported. Please delete.

      Line 509- We agree with what the authors write about stringency. However, these parameters do not seem to have been utilised as stated here. Their stringency statement holds up for comparison between the D2 "AcoD" and two other tested "AcoD" enzymes and all those in the middle clade on Fig 6. But this is not the case for the proteins shown above and below this "AcoD" clade in Fig 6 which have at best around 30% identity to characterised enzymes. See below for examples. As the authors state in their methods, high-stringency methods are needed to exclude other acetyl-CoA synthetase family proteins. Thus, most of the genes shown on Fig 6 cannot be taken as having this Ddd activity.

      "To further validate that these AcoD homologs" the authors examined the activity of two closely related enzymes from a group of nine homologs with > 65 % sequence identity (starting line 283, Figure 6). It is not surprising that these enzymes have the same activity. Homologs outside this group of nine (Figure 6) are far less related to the characterized AcoD (< 32 % seq. identity). Conservation of the phosphate-transferring His (His292) and an active site Trp (Trp391) does not seem to be strong evidence for functional conservation. The manuscript does not provide any additional evidence that these less related enzymes also degrade DMSP. Either more experimentation is necessary, or the paragraph on the "Distribution of the ATP DMSP lysis pathway in bacteria" must be revised.

      For example: Psychrobacter AcoD (WP_068035783.1) is 31% identical to Bilophila sp. 4_1_30 (WP_009381183.1) in the below group of bacteria on Fig 6. Psychrobacter AcoD (WP_068035783.1) is 29% identical to Thermomicrobium roseum (WP_041435830.1) in the above group of bacteria on Fig 6.

      Line 283. This is not the case! The two sequences that were chosen to "validate" are far too close to the D2 "AcoD" compared with the MAGS and other potential "AcoDs" shown above and below the functional Ddd clade on Fig 6. The design of this section is weak and does not lend weight to the expansiveness of this family. More work on the more diverse enzymes and bacteria is needed to support the authors' claims. Please delete or study the activity of the more diverse strains and their candidate "AcoDs".

      Fig. 6. This is a nicely presented figure that unfortunately slightly deceives the reader. The authors need to clearly show which strains they have shown to have Ddd activity (currently one as I understand it) and which enzymes they have shown to have the appropriate activity (currently three closely related enzymes as I understand it). If I am not wrong these are all confined to the middle clade of Gammas and Firmicutes. These stand clearly apart from the other strains (above and below), which have not been studied and which are only ~30% identical to "AcoD" at the protein level. This is not clear on the figure and definitely misleads in the abstract and throughout the manuscript.

      4) We expect to see kinetics done on the new enzyme in line with what the authors have done in other related studies on Ddd and Dmd enzymes.

      This is important to place the work in context with previously identified Ddd and Dmd enzymes, many of which have been analysed by these authors in previous publications. The characterization of the AcoD activity remains entirely qualitative. The authors only provide relative activities measured at a single substrate concentration. This data does not support the following statement: "Mutations of these two residues significantly decreased the enzymatic activities of AcoD, suggesting that these residues play important roles in stabilizing the DMSP-CoA intermediate" (l.223-225).

      5) The manuscript does provide unambiguous evidence for the activity of AcoD and its function during growth on DMSP. On the other hand, the description of the "ATP DMSP lysis pathway" is less clear.

      Transcriptomics analysis (Figure 2C) suggests that growth on DMSP upregulates the genes 1696 (BCCT), 1697 (AcoD), 1698 and 1699. The functions of the third and fourth proteins remain unclear (line 253). Instead, a reductase (AcuI) encoded elsewhere on the same genome was shown to transform the acryloyl-CoA to propionyl-CoA. What was the transcription profile of acuI and acuH in the RNA-seq? Were they induced by growth on DMSP? Is the 1696-1697-1698-1699 gene cluster conserved? What is the function of 1698 and 1699? These questions are only relevant if the authors plan to maintain the claim of having identified a new pathway. This pathway prediction component is very weak and could be supplemented by KO mutagenesis of dddCB and acuI. Without such work this is speculation and needs to be written as such.

      6) Appropriate controls, units and quantification should be used:

      Line 102- Please give a normalised value for the level of DMS produced from DMSP per time and protein/cells.

      Figure 2. A. One would expect to see a growth curve of D2 on DMSP compared to acrylate, a conventional carbon source (e.g. pyruvate, glycerol or succinate) and a no carbon control. As "AcoD" is predicted to ligate CoA to DMSP it would be good to know if the strain grows on acrylate. It might be predicted to have different properties to e.g. Halomonas which does grow on acrylate. At least a no carbon and conventional carbon source should definitely be included.

      B. The units for this figure are not appropriate. It would be more appropriate to show the actual amount of DMS that is produced by the strain, ideally normalised to protein, cells or absorbance and time. Detail in the figure what the control is.

      C. Would like to see error bars on this figure. Also would have been sensible to colour code these to match panel D.

      Figure 3. B and C. as with Figure 2 we need to see levels of DMS normalised to cells/protein and time.

      Line 374 - No controls. Please include these as detailed above. No carbon, conventional carbon source, acrylate?

      Quantitative data supporting Supplementary Fig. 12 would be helpful. After all this route would have to explain that the bacteria can use acrylate CoA as sole carbon source (or at least alternatives would have to be discussed). Is the identified activity sufficient for this task?

      Line 388 - This method is/should be quantitative. It is standard practice to report DMS production normalised to time and cells/protein. Here we are only given peak area.
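      The normalisation requested in point 6 amounts to simple arithmetic. The following is a minimal sketch, assuming a linear GC calibration curve; the function names, the calibration slope and the example numbers are hypothetical and are not taken from the manuscript.

```python
def peak_area_to_nmol(peak_area, slope, intercept=0.0):
    """Convert a GC peak area to nmol DMS, assuming a linear calibration
    curve of the form: peak_area = slope * nmol + intercept."""
    if slope <= 0:
        raise ValueError("calibration slope must be positive")
    return (peak_area - intercept) / slope


def normalised_dms_rate(dms_nmol, protein_mg, minutes):
    """Return DMS production in nmol per mg protein per minute."""
    if protein_mg <= 0 or minutes <= 0:
        raise ValueError("protein amount and incubation time must be positive")
    return dms_nmol / (protein_mg * minutes)


# Hypothetical example values, for illustration only.
dms = peak_area_to_nmol(peak_area=6000.0, slope=50.0)           # nmol DMS
rate = normalised_dms_rate(dms, protein_mg=2.0, minutes=30.0)   # nmol / mg / min
print(f"{rate:.2f} nmol DMS per mg protein per min")
```

      Reporting values in this form, rather than as raw peak areas, allows direct comparison with previously characterised Ddd and Dmd systems.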

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 30 2020, follows.

      Summary

      The exact relationship between G6PD deficiency and malaria protection remains uncertain. This study provides evidence that the G6PD Med mutation (563 C>T) protects against clinical Plasmodium vivax disease. It uses a Bayesian statistical approach which specifically elucidates the particular protection which female heterozygotes versus male hemizygotes (or female homozygotes) for the Med mutation may experience. This is an important contribution to our understanding of the relationship between G6PD deficiency and P. vivax.

      Overall, the reviewers were positive about the work and its potential, but have some clear concerns that will require additional data, analyses, and interpretation. Below are the main points raised by the reviewers that would need to be addressed for a revised manuscript.

      Essential Revisions

      1) The presence of mixed infections: although the work is focused on P. vivax, the majority (95%) of malaria in Afghanistan is caused by P. falciparum, which means mixed-species infections are likely common and P. falciparum infections may be obscuring P. vivax infections. It is not clear to what extent G6PD deficiency may impact the chance of being coinfected with both falciparum and vivax. Ideally, PCR verification of these samples would be performed to confirm the species for samples included in the analysis. Without these molecular data, the overall assessment of susceptibility to vivax malaria in association with G6PD Med is incomplete.

      2) The analysis relies on a number of assumptions made about Pashtun population genetics (e.g. is it reasonable to assume the same frequency of the relevant mutation throughout all the tribes in the study, and should this be at Hardy Weinberg equilibrium?) and it is not clear to what extent these assumptions are justified since little evidence/support is provided. In particular, the assumptions about Hardy Weinberg equilibrium of G6PD Med within the Pashtun population need to be justified and supported since the analysis is highly reliant on this assumption.

      3) The exclusion criteria do not appear to have been uniformly applied - in particular, anemia was an exclusion criterion for only part of the data. This was not clear and may impact the overall significance of statistical results.

      4) While the manuscript makes a number of conclusions about female homozygotes, these are not strongly supported by the evidence. In particular, the study is likely under-powered with regard to clinical associations among female homozygotes with G6PD Med, but this is not addressed and the stated conclusions are likely stronger than what can be supported by the data/analyses provided.
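      To make the assumption questioned in point 2 concrete, the sketch below computes the genotype frequencies expected under Hardy-Weinberg equilibrium for an X-linked allele such as G6PD Med. It is an illustration only: the function name and the allele frequency used in the example are hypothetical and do not come from the manuscript.

```python
def x_linked_hwe_frequencies(q):
    """Expected genotype frequencies for an X-linked allele of frequency q
    under Hardy-Weinberg equilibrium (frequencies are within each sex)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    p = 1.0 - q
    return {
        "male_hemizygote": q,            # males carry a single X
        "male_wild_type": p,
        "female_homozygote": q * q,
        "female_heterozygote": 2.0 * p * q,
        "female_wild_type": p * p,
    }


# Hypothetical allele frequency, for illustration only.
for genotype, freq in x_linked_hwe_frequencies(0.08).items():
    print(f"{genotype}: {freq:.4f}")
```

      With an allele frequency of this order, female homozygotes are expected to be far rarer than male hemizygotes or female heterozygotes, which is also why the power concern in point 4 arises.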

  4. Nov 2020
    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 14 2020, follows.

      Summary

      Injuries of the meniscus are associated with the future development of articular cartilage damage and ultimately osteoarthritis (OA). Prior work in this field has suggested that there are undifferentiated progenitor cells residing in the meniscus, and it has been hypothesized that these cells could be harnessed to aid in meniscal healing after injury. The authors provide evidence that Hedgehog activation promotes meniscal healing and identify a population of cells that are positive for Gli1, an effector protein in the Hedgehog signaling pathway. Using a combination of approaches, including lineage tracing, in vitro cell culture approaches and cell transplantation experiments, they show that this Gli1 population contains putative meniscal progenitors involved in meniscus development and healing. Overall, the reviewers found this paper to have high potential and were particularly enthusiastic about the therapeutic potential of Purmorphamine to promote meniscal healing. However, all reviewers felt that the conclusions (and title) of this paper overstated the utility of Gli1 as a marker of de facto meniscal progenitor cells. It is specifically requested that the title of this manuscript be reworded in light of the previous work showing that Gli1 can be found in a number of cell types. In several places, additional work was requested to support the conclusions made in this manuscript. Please see the detailed comments below.

      Essential Revisions

      1) The authors describe many of their results as "novel". Gli1 reporter mice have been used extensively in other tissues to non-specifically describe progenitor cells (bone marrow, periosteum, peri-vascular spaces and others). Further, the role of Gli1+ cells in enthesis and periodontal ligament (PDL) formation and healing has been previously explored. Gli proteins, which have a half-life of minutes-to-hours, may be a relatively unstable foundation for defining cellular identity. While the value of Gli1 as a general Hh reporter is clear, its utility as a putative stem cell marker (Title) does not seem adequately substantiated. The authors must temper their statements on novelty, exclusivity and utility of Gli1. The title of this paper also should be reworded.

      2) The Hedgehog (Hh) signaling manipulation conducted is rather straightforward and some overlapping studies have been performed in murine joints. Many of the experimental results could have been predicted. Other elements that contribute to the superficial nature of the studies are that Gli1 reporter activity is the only marker of Hh signaling examined (for example Gli2/Gli3 are not), and that the abundance and cellular source of an Hh ligand during development or repair is never entertained. Of note, these reporters for Ihh and Shh are available.

      3) It is a stretch to say that Gli1;tdTom labels meniscus progenitor cells (Lines 268-271). There is relative enrichment of Sca1/CD90/CD200/PDGFRa in Gli1+ cells (Fig 2B), yet the vast majority of cells positive for those markers are Gli1-negative (Fig S5). Positive outcomes during in vitro differentiation and scratch assays may primarily result from increased Hh-mediated proliferation. This logic extends all the way through the in vivo experiments (which are quite promising, translationally).

      4) The spatial profile of Gli1-expressing cells in the meniscus is beautifully described, however an interpretation for the superficially restricted zonation of Gli1 reporter activity is not given. Do these superficial cells have more or less cartilage antigen expression? Is there something clearly physiologically different in the Gli1-rich superficial layers that could be determined? Line 401 cites an osteoblast paper to set up the relevance of Gli1+ cells in development of musculoskeletal tissues. However, the meniscus is much more similar to the enthesis and the PDL. The authors should therefore lead with that literature. The PDL literature in particular is not cited and should be added. Also missing are recent enthesis development/regeneration papers (PMID: 30504126, 26141957, and 28219952).

      5) The characterization of Gli1+ and Gli1- FAC sorted cells could be expanded on a bit.

      6) CFU-F images should be provided in addition to quantification. The differentiation studies in Fig 2E are non-quantitative and not convincing. Further, it is a little contradictory that under certain contexts Gli1+ cells form more cartilage (2E), but under other culture conditions they have reduced cartilage markers (2F). These points need to be clarified.

      7) In Fig 5, changes in distribution or survival of Gli1+/- cells may underlie the difference, but neither survival nor Gli1- cell distribution was assessed.

      8) Cartilage differentiation within the meniscus appears to be promoted with Gli1+ cell therapy and Purmorphamine. This could be assessed. Similarly, Hh signaling is known to induce osteogenesis. Osteoblastic antigens and/or the presence of osteophytes should be assessed in Purmorphamine-treated joints.

      9) One topic that is not covered in the paper is the role of Hh signaling in chondrocyte mineralization. This has been well studied in the growth plate (esp. related to PTHrP / IHH feedback loop) and may have relevance to the meniscus as well. The healing studies should consider this carefully, as ectopic mineralization is a possible negative side effect of Hh treatment.

      10) There are a number of places in the results where it is unclear if the authors are talking about Gli1+ cells or Gli1-lineage cells. This should be clarified throughout, perhaps with specific nomenclature that defines "Gli1+" as cells that are positive for Gli1 and "Gli1-lineage" as cells that are descendants of Gli1+ cells. Supplemental Figure 1A should be in the main document. Similar schematics in other figures are very useful for understanding the experiment.

      11) What are the temporal expression patterns of Gli1 and other Hh related genes during development and healing? It would be informative to see localized expression (e.g., in situ hybridization) or qPCR expression for healing tissues.

      12) The authors should clarify a number of things with meniscal cell isolation: (a) There are clearly differences in cell phenotype between superficial and deep areas and between attachment and midsection; was this considered for cell isolation? (b) I assume TAM injections were performed and then cells were isolated a few days later via FACS; please clarify details to show that Gli1+ (not Gli1-lineage) cells were characterized. (c) Fig 2: 3-month-old mice were used, but again, whether the cells are Gli1+ or Gli1-lineage is not indicated.

      13) The mechanisms by which Gli1+ cell and Hh treatments work are not explored. Some of the results are counter-intuitive. For example, why would Hh stimulate proliferation of Gli1+ cells if these are thought to be slow-turnover resident stem cells? Furthermore, why would Hh stimulation lead to proliferation rather than differentiation, in contrast to what is known in growth plate biology?

      14) The assessment of healing is qualitative/semi-quantitative (histomorphometry). The authors should perform a more rigorous assessment of healing to demonstrate the effectiveness of the Gli1+ cell and Hh therapies. This should include quantitative outcome(s) such as qPCR, mechanics, etc.

      15) The Gli1+ cell therapy histologic results are impressive. This is surprising because the delivery method was relatively simple. How much cell engraftment was there? Can the authors comment further (or add experiments to elucidate) on how long the cells were present and what their direct involvement was in healing?

      16) The authors show that native Gli1+ cells expand after injury. If this is the case, what is the rationale for adding more Gli1+ cells? Is the idea that the tissue has the capacity to heal but there aren't enough native Gli1+ cells to do the job?

      17) Figures and text jump between methodologies, making interpretation of results difficult. Fig1 shows that superficial cells of the meniscus generally have active Hh signaling 24-hours prior to a variety of postnatal-to-adult timepoints (A, B, E, F), and postnatal Hh signaling drives proliferation of early meniscus cells (C, D). It does not appear to report any long-term pulse/chase lineage tracing experiment as suggested in the text (Lines 223+). If this interpretation is incorrect, perhaps this could be addressed by increased clarity of figures and text (Methods, Results, Figure organization and captions).

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 19 2020, follows.

      Summary

      Kang et al. eloquently describe the active suction organ that the larvae of aquatic insects of the Dipteran family Blephariceridae use to adhere robustly to complex surfaces. While the morphology of the mechanism has been reported previously, its biomechanical adhesion function and performance across different substrates are unknown. The authors present three advances. First, they quantify the adhesion performance on rough, micro-rough, and smooth surfaces using an effective centrifugal setup. The ultimate adhesion tests show the larvae can resist shear forces up to 1100 times their body weight on smooth surfaces. Second, they visualize the suction function in vivo using interference reflection microscopy. This reveals that small hair-like microtrichia can enter gaps in the surface. Because the microtrichia are angled inward, the authors surmise that the microtrichia's angle and small size help increase adhesion contact area on rough surfaces. Finally, they compare the adhesion performance of the Blephariceridae larvae to other species, showing it is 3-10 times greater than found in stick insects. The finding that the larvae have such high attachment forces is impressive and the study offers new biological insights that may inspire engineers to invent new underwater suction mechanisms.

      Essential Revisions

      Although the reviewers were generally appreciative of the well-written manuscript and the remarkable performance reported for the active suction mechanism, the consensus is that the mechanism itself is not described in sufficient detail for the reader to fully appreciate the advance. Hence the main critiques focus on helping the authors to further flesh out the mechanism and report it in more mechanistic detail, in the way other adhesion mechanisms are described functionally across the biomechanical literature. Further, the presentation of the figures does not meet the graphic design clarity standards essential to inform eLife's broad readership. To provide guidance, we list the following essential revisions.

      1) The introduction states that the suction organs have been observed; however, the manuscript does not communicate the observed mechanism as one would expect in the biomechanical adhesion literature. Instead it reports the measurements of the force and a suggestion that the microtrichia may be involved. We were hoping to find a quantitative report of the mechanism integrating the force data and microscopy images into biomechanical diagrams and, to the extent possible, equations that capture and communicate the mechanism as quantitatively as possible. Whereas we are not requesting further measurements, because the performance of the mechanism is well documented, we do ask for a more in-depth biomechanical analysis that spells out the mechanism in a way that allows it to be compared to the other classic mechanisms that the authors compare to. If this requires some additional measurements to inform the model, those efforts would be well worth it. In case the authors can use a mechanistic analysis lead, we recommend reviewing a couple of papers. E.g. Jeffries, Lindsie, and David Lentink. "Design Principles and Function of Mechanical Fasteners in Nature and Technology." Applied Mechanics Reviews 72.5 (2020). Or any other review or research paper that the authors find more useful.

      2) Please clarify if the experiments are done in air or underwater. We consider underwater as most appropriate; at minimum the surface should be wetted. The authors mention that the Stefan adhesion forces underwater would be higher than in air, but it's not clear if that statement pertains to the experiment. Please provide a full clarification, and in case the experiments were performed in air we would prefer to see them performed in water. If this is not possible, the manuscript should be entirely transparent on this matter so the reader can evaluate the precise merit of this study and its limitations fully.

      3) We found the images confusing at times. To resolve this we would like to see clear schematics (avatars) that ground the reader's perspective in all figures.

      4) Considering eLife's broad multidisciplinary readership and the appeal of this study for bioinspired designers and engineers, Fig 1d,e has to provide better anatomical readability. Please assume a Biology and Engineering undergrad level for the first figure, ensuring all definitions and anatomical names can be fully comprehended without reference to other literature. Please provide clear connections to the different views and perspectives presented in the panels leveraging graphic design to the benefit of the interested reader not familiar with insect morphology.

      5) Likewise, Fig 2 is also confusing. A schematic is in order to show the reader what they are looking at, how the images relate, and why they matter (significance) for understanding the main findings reported in this manuscript.

      6) Fig 3 clearly shows that coarse-rough surfaces provide far less adhesion force. We wonder, are there any images similar to Fig 6 showing that the microtrichia cannot enter the gaps? To comprehend what causes the differences, we would like to see a report of the length scale of the microtrichia compared to that of the gap's dimensions, both for the rough and micro-rough surfaces. To clarify this in a universal fashion, please consider reporting gap size non-dimensionally based on the relevant microtrichia length scale. More discussion of the relevant length scales would help bring the force measurements and the observations of the microtrichia together.

      7) Fig 6 is an important figure, so it would help the reader to more easily grasp the viewing perspective using diagrams and avatars. In panel a, a schematic should clearly define the suction disc fringe and the perspective shown. What part is the suction disc and what is the length scale of this image compared to the suction disc? Also, it would be useful if the columns of the microstructure could all be aligned for clarity.

      8) Currently, the authors provide an estimate of the shear stress. It would be helpful to also include the normal stress based on the normal force data on smooth surfaces for lugubris. It would be informative for the reader to know if it exceeds 1 atm. If so, that is a very interesting finding. Please report and discuss what you find in the revised manuscript.

      9) Discussion: Please include a comparison of the magnitude of shear and normal stress that this suction mechanism creates with that of other organisms. Currently the comparison is done with force per body weight, which is biologically relevant. However, reporting stress provides an objective bio-mechanistic perspective on adhesion performance.

      10) Discussion, Ln 300: The suggestion that the inward-facing microtrichia may function to prevent inward slipping of the suction cup is interesting. Please discuss the tradeoff between smooth and micro-rough surfaces: is it possible that on micro-rough surfaces the microtrichia are better able to resist slip, but on smooth surfaces, the seal is better? And if so, this would suggest the effect of a better seal is more important than preventing slip, since performance is better on smooth surfaces? In-vivo visualization during failure would be very informative (in future work).

      11) Please discuss why there may be an intricate branching of the fan-fibres into the microtrichia. E.g. in the gecko, the branched tendons insert into the lamella, supporting the large tensile loads applied to the adhesive. However, here it is less clear if large tensile loads would be applied to the microtrichia. It seems logical that applying large normal loads to the suction cup should be done at its centre, resulting in decreased pressure if no slip occurs (as opposed to applying the normal force to the rim, which would not decrease pressure). So, this would not explain the intricate network of fan-fibres. However, for shear loads, it could make more sense: pulling in shear would engage the microtrichia on the far side of the cup, and the fan-fibres could help transmit this tension. It might be worth thinking this through and discussing the outcome in the paper to strengthen the mechanistic analysis.

      12) We would be excited to learn if the authors have thoughts on the slight curvature of the microtrichia and how it may be involved in the adhesion mechanism. In case this is purely speculative, this could go into the last paragraph of the paper, alternatively it could go into the biomechanical model of the mechanism.
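      The comparison requested in points 8 and 9 reduces to dividing the measured force by the contact area and comparing the result with atmospheric pressure. A minimal sketch follows, assuming a circular suction disc; the function names and the force and diameter values are hypothetical placeholders, not measurements from the manuscript.

```python
import math

ATMOSPHERE_PA = 101_325.0  # one standard atmosphere in pascals


def disc_area_m2(diameter_m):
    """Area of a circular suction disc of the given diameter."""
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    return math.pi * (diameter_m / 2.0) ** 2


def stress_pa(force_n, area_m2):
    """Mean stress (normal or shear) as force divided by contact area."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return force_n / area_m2


def exceeds_one_atm(normal_force_n, area_m2):
    """True if the mean normal stress is greater than one atmosphere."""
    return stress_pa(normal_force_n, area_m2) > ATMOSPHERE_PA


# Hypothetical values, for illustration only.
area = disc_area_m2(0.4e-3)      # a single disc, 0.4 mm in diameter
sigma = stress_pa(5e-3, area)    # 5 mN normal force on that disc
print(f"{sigma / 1000:.1f} kPa, exceeds 1 atm: {exceeds_one_atm(5e-3, area)}")
```

      If the force is measured for a whole larva, the area should be the summed contact area of all attached suction organs, so that the reported stress is a per-area quantity comparable across organisms.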

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on November 19 2020, follows.

      Summary

      The paper reports the involvement of isoleucine 553 in targeting Drp6 to the cardiolipin-containing nuclear membrane. The data are interesting, but there is no mechanistic understanding of how a single amino acid can target this protein so specifically to cardiolipin-enriched membranes.

      Essential Revisions

      The authors are strongly requested to address the issues that were raised in the previous review. The authors state in their rebuttal that they plan to address them in a timely manner. The additional request of one reviewer that should be addressed is to test the involvement of residues 552 and 554 to highlight the significance of isoleucine in position 553 in targeting Drp6 to cardiolipin.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 28 2020, follows.

      Summary

      This manuscript presents a new tool to detect and classify mouse ultrasonic vocalizations (USVs). The tool (VocalMat) applies neural network technology to sort the various USVs into predetermined categories of pup calls. The paper in the form submitted seems to fit more as a methodology paper. Indeed, the authors state that the goal of their work is to: "create a tool with high accuracy for USV detection that allows for the flexible use of any classification method."

      The paper is well written and presents a useful tool to identify and classify USVs of mice. However, the reviewers think that the authors did not provide enough supporting evidence to claim that their method is significantly superior to other tools in the literature that attempted USV classification. For example, Vogel et al. (2019) [https://doi.org/10.1038/s41598-019-44221-3] reported very similar (85%) accuracy using more mainstream ML approaches than attempted in this study with CNNs.

      Moreover, some of the reviewers were not convinced that the comparison to other tools was conducted in an unbiased and completely fair manner and that the approach described in this paper really represents a significant advantage over other tools. For example, two reviewers claim that the authors used DeepSqueak on their dataset without properly training it for this type of data, while their tool is specifically trained for it. Also, the reviewers expect to see a confusion matrix to assess model performance and establish whether the model does indeed accurately reproduce the classes (or how skewed it is towards dominant classes).
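      The kind of confusion matrix the reviewers have in mind can be sketched in a few lines. The code below is a minimal, dependency-free illustration; the function names and the class labels in the example are hypothetical and are not VocalMat's actual categories or API.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def per_class_recall(matrix):
    """Fraction of each true class that was classified correctly."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else float("nan"))
    return recalls


# Hypothetical labels, for illustration only.
classes = ["flat", "chevron", "step"]
truth = ["flat", "flat", "flat", "chevron", "chevron", "step"]
preds = ["flat", "flat", "chevron", "chevron", "flat", "step"]
matrix = confusion_matrix(truth, preds, classes)
for name, row, recall in zip(classes, matrix, per_class_recall(matrix)):
    print(name, row, f"recall={recall:.2f}")
```

      Reporting per-class recall alongside overall accuracy shows whether a high headline accuracy is driven mainly by the most frequent call types.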

      Overall, all the reviewers agree that they would like to see a more rigorous attempt to validate the findings presented (ideally also on an external database) and proper (unbiased) comparison with other similar software, to justify the claim that VocalMat performance in classification of USVs is indeed superior and novel to the methods already in use.

      If the authors wish to have the manuscript considered as a research paper and not in the form of a methods paper they should change the focus of the paper and provide more data showing a novel biological application of their pup calls classification findings. If not, we will be happy to consider a suitably revised version of the manuscript for the Tools and Resources section of eLife.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 28 2020, follows.

      Summary

      This work by Berger et al. examined the process of de novo infection by a model gammaherpesvirus, MHV68, using two complementary single cell approaches - CyTOF and scRNA-seq. Using these approaches, they characterize host and viral expression of protein and RNA during infection. From CyTOF of numerous host proteins and one viral protein, they propose that the DNA damage marker pH2AX along with the viral protein vRCA are more precise indicators of progressive infection than a standard LANA reporter. Using a single viral (ORF18) and host RNA (Actin), they demonstrate that pH2AX+, vRCA+ cells uniformly express ORF18. To more closely examine viral RNAs, they performed scRNA-seq on infected cells and observe a high level of heterogeneity in viral gene expression.

      The manuscript is very well written and could potentially be a very welcome addition to the growing field of single cell virology. However, some concerns were raised regarding some of the conclusions and validation of the results. In particular, the variability in gene expression does not fall into existing models of kinetically regulated waves of viral transcription. This and their previous work convincingly argue that bulk measurements of protein and RNA are insufficient to represent the complexity of de novo MHV68 infection. However, in the absence of functional significance for the many clusters identified, the impact of the conclusions is limited. With regard to validation, the authors must also consider the inherent variability in scRNA-seq technology, which could complicate the accurate measurement of viral RNA. This should be discussed and addressed with additional data and/or experiments (see below).

      Essential Revisions

      1) The reviewers agreed that this article will be a very useful resource for the single cell virology community, but requires further validation to realize that potential. As such, this article should be resubmitted as a "Tools and Resources" article. Furthermore, this revision should pay careful attention to the additional essential revisions that follow this point; in particular, there are areas that require more data for validation. Ideally, existing data or experiments closely related to those conducted can be used.

      2) One of the more dramatic conclusions from the paper is that while the median infected cell expressed 52 viral genes, this ranges from 12 to 66 with only a handful of genes expressed uniformly. However, there are a number of indications that this may be explained instead by the stochastic failure to detect lowly expressed viral genes: 1) Figure 1A shows a tight distribution of the number of viral genes detected, which would be unlikely if there were multiple classes of infected cells expressing different subsets of viral genes. 2) Figure 1B shows a strong relationship between the average expression level and the frequency of detection, most easily explained by poor capture efficiency or another technical artifact resulting in undersampling. 3) These results fail to recapitulate known kinetic classes or uniform LANA expression. 4) Figure S3 indicates that even among host genes, the median cell had only ~1,000 genes detected, likely too small a fraction of the expressed genes to assess viral gene number reliably. These inconsistencies make it difficult to assess whether the observed heterogeneity is a true reflection of the gene expression profiles during infection or a reflection of the inability to detect lowly expressed transcripts by scRNA-seq.

      Given the inherent "noisy" nature of scRNA-seq, it is usually hard to quantify how much of a given mRNA expression variability among individual cells is due to technical limitations, and how much is due to biological differences. The authors could settle this question for at least a small number of genes, by comparing the variability they see in scRNAseq to that they measure in PrimeFlow and CyTOF (although the latter has the added complication of comparing RNA to protein, but would still be valuable to discuss). If they compare the heterogeneity observed for the given proteins in CyTOF with what they observe for the corresponding transcripts in scRNAseq they will both validate their finding and will be able to estimate how much of their variability in scRNAseq translates to the protein level. They can do the same with their PrimeFlow data, which would be even more informative as both measure transcripts. These approaches would be ideal as the data should be readily available. Alternatively, some of the expression should be correlated by RT-qPCR or by Northern blot or, if single cell is necessary, by in situ hybridization.

      The fact that the data do not pick up the established signatures of early vs. late gene expression goes against the bulk of work on viral gene expression control. More discussion about why this may be, including limitations of scRNAseq for less abundant transcripts is warranted.
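
      The undersampling concern in point 2 can be illustrated with a simple capture model. The sketch below assumes that each transcript is captured independently, so that the observed count for a gene with a true mean of `mean_copies` per cell is Poisson with mean `mean_copies * capture_eff`. The capture efficiency and expression levels are hypothetical values chosen for illustration, not estimates from the manuscript.

```python
import math

def detection_fraction(mean_copies, capture_eff):
    """Expected fraction of cells with at least one captured transcript
    when captured counts are Poisson with mean mean_copies * capture_eff."""
    return 1.0 - math.exp(-mean_copies * capture_eff)

capture_eff = 0.10              # hypothetical capture efficiency
true_means = [0.5, 2, 10, 50]   # hypothetical mean transcript copies per cell

fractions = [detection_fraction(m, capture_eff) for m in true_means]
for m, f in zip(true_means, fractions):
    print(f"mean copies = {m:>4}: detected in {f:.1%} of cells")
```

      Under this model, detection frequency rises with mean expression even if every cell expresses every gene, which is the pattern a purely technical explanation of Figure 1B would predict.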

      3) In Figure 3A, the authors observe and note both pH2AX+, vRCA- and pH2AX-, vRCA+ cell populations; based on ORF18 or Actin expression, a significant fraction of these cells are infected. The proportion of cells in each gate is not quantified, but it appears that these single-positive cells represent a significant fraction of the total infected cells. However, in Figure 1C there appear to be no major single-positive populations, and the authors note that vRCA and pH2AX levels are highly correlated. This suggests that the cells are missing from the CyTOF analysis (perhaps lying outside of the two gates presented in Figure S1A). These missing cells undercut the value of the dataset and analysis and may lead to incorrect interpretations of pH2AX's value as a marker. Addressing this discrepancy in the PrimeFlow/CyTOF data and some form of validation of scRNAseq (either by leveraging their protein data or via independent experiments) will be important for establishing the datasets as a reliable resource.

      Two related issues in the text: Line 217. "demonstrate that pH2AX+ and vRCA+ show progressive infection.." Progression implies that the study occurs over different time points, but the time parameter is not measured in these studies. It is not clear to me that these different phenotypes relate to different temporal stages of the infection or if they are different terminal outcomes. The authors should use another term than "progressive" in this context. Line 423 - the use of the word "progression" implies temporal studies which were not performed in this work. The study is a snapshot of a single time point and "progression" is inferred.

      4) Phenotype variation may be due to variation in cell cycle stage, cell viability and age, and asynchronous infection. To what extent are these variables controlled or considered in the analysis?

      5) In Figure 3, the authors show that ~20% of mock-infected cells are negative for beta-actin RNA. This seems quite odd for a housekeeping gene, and the corresponding PrimeFlow data is not shown. I assume that this has to do with the authors' gating strategy, or some technical issue with PrimeFlow that prevents all RNA molecules from being labeled. In either case, it would be helpful if the authors clarify this point and include the data for the mock cells in the figure.

      6) Could the authors explain their rationale for including the CycKO mutant in the analysis and for combining the wt and KO data into one analysis? A priori, if the mutant has no effect on the current question (de novo infection of fibroblasts) I would suggest excluding it from the paper and only showing the wt data, or presenting the data for the mutant in a supplementary file, stating that similar results were obtained with it. Although the authors state that only five genes were differentially expressed between the wt and mutant, it seems wrong to aggregate the data from the different viruses into a single analysis.

      7) In Figure 6 and the accompanying text, the authors make a distinction between "virus-biased" and "host-biased" cells, based on the % of viral genes expressed in each cell. They go on to claim that "no significant difference in host gene expression among expressed genes" was found between these two groups. The statistical analysis for this result seems to be an ANOVA test, which I believe is not appropriate for this analysis. As the authors are comparing two distributions, something like a Kolmogorov-Smirnov test is needed. Additionally, in the text (line 314), the authors claim that no substantial difference is seen for cell-cycle genes between "virus-biased" and "host-biased" cells (Figure S6A). Looking at the data, it seems to me that G2 cells are highly enriched in the "host-biased" group. A formal quantitative analysis is needed to make this point.
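
      The distinction drawn in point 7 between comparing means and comparing distributions can be made concrete with a minimal, dependency-free sketch of the two-sample Kolmogorov-Smirnov statistic. The two samples below are hypothetical values constructed to share the same mean but differ in spread; they are not data from the manuscript. In practice a library routine such as SciPy's `scipy.stats.ks_2samp` would also supply a p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic:
    the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # advance both samples past every value <= x
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# hypothetical expression values: identical means (4.5), different spread
host_biased = [1, 2, 3, 4, 5, 6, 7, 8]
virus_biased = [4, 4, 4, 4, 5, 5, 5, 5]

d_stat = ks_statistic(host_biased, virus_biased)
print(f"difference in means: {sum(host_biased)/8 - sum(virus_biased)/8}")
print(f"KS statistic: {d_stat}")
```

      A test on means sees no difference between these two samples, whereas the KS statistic is non-zero because the shapes of the distributions differ.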

      8) In line 316 the authors state that "host-biased cells expressed a number of interferon-response genes (Figure S6 and Table S3), suggesting a potential role in resistance to infection". I think this claim is not fully supported by the data. Since single-cell RNA-sequencing is a "zero sum" technique, cells with a higher proportion of viral gene expression are bound to show fewer host genes (as the authors have shown in Figure 6), including ISGs. To show that these cells are indeed expressing more ISGs than the "virus-biased" cells would require sorting the different populations, as well as mock-infected cells, and measuring ISGs (by methods such as qPCR, RNAseq, PrimeFlow, WB etc.), or at least some analysis that takes into account the increased drop-off of host genes in cells with high levels of viral genes (something like a permutation test?).
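
      One possible form of the permutation analysis suggested in point 8 is sketched below. It assumes that the "zero sum" effect can be approximated by expressing ISG counts as a fraction of each cell's total host counts before comparing groups; this normalization, the function name and all count values are hypothetical illustrations rather than the authors' method or data.

```python
import random

def permutation_pvalue(group_a, group_b, n_perm=10000, seed=0):
    """One-sided p-value for mean(group_a) > mean(group_b),
    obtained by shuffling group labels."""
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# hypothetical cells as (ISG counts, total host counts)
host_biased = [(30, 1000), (28, 900), (35, 1100), (32, 950), (27, 1000)]
virus_biased = [(12, 400), (9, 300), (14, 450), (10, 350), (11, 380)]

# raw ISG counts versus ISG counts as a fraction of host counts
p_raw = permutation_pvalue([c[0] for c in host_biased],
                           [c[0] for c in virus_biased])
p_norm = permutation_pvalue([c[0] / c[1] for c in host_biased],
                            [c[0] / c[1] for c in virus_biased])
print(f"raw counts p = {p_raw:.4f}, normalized p = {p_norm:.4f}")
```

      In this constructed example the raw counts differ sharply between groups, but most of that difference disappears once the lower host read depth of virus-biased cells is taken into account.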

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 27 2020, follows.

      Summary

      The referees agree that your work on the reconstitution of Drosophila septin hexamers into filaments on supported bilayers, and their characterization and comparison with the yeast counterparts, is interesting and important. However, the referees raise a number of important points, all of which need to be addressed satisfactorily before publication.

      Essential Revisions

      1) In all sets of experiments different amounts of PS, PIP2 and septin concentrations are used. How can the obtained data be discussed in terms of these parameters, if they are not really comparable? What is the rationale for using the different conditions? For example: mEGFP-tagged fly septins are crowded with methylcellulose on neutral PC SLBs. Septin concentrations of 100-500 nM were used (Figure 1). In Figure 2, however, 1000 nM septin is used without methylcellulose. AFM and QCM-D data were obtained with 20 % PS, no PIP2, and 10 nM septin concentrations. These conditions do not resemble the conditions in the TIRF experiments (as written). In the TIRF experiments 1000 nM septin was used. Cryo-EM data were obtained for 6 mol% PIP2, no PS. In the discussion a model (or several models) are proposed, which appear to be highly speculative. The results are compared to those with yeast septins reported in the literature. As this comparison is the major point that is made in the manuscript it would be important to perform at least one of the experiments with yeast septin for direct comparison.

      2) The authors should create lipid bilayers on curved surfaces like glass rods to simulate the ability of the septins to create annulus structures as in vivo. Indeed it seems that on vesicles the structure of septin filaments hardly differs from the monolayer case. Adding topographical and geometrical cues to the septin assembly can potentially bring new insights into how they can assemble in rings or sheets.

      3) Examinations of septin hexamers only containing the short or long coiled coils or mixing two populations of septin hexamers (wt and ΔCC) to see whether the ΔCC are excluded from any filament stacks would be highly recommended to support the final model.

      4) Regarding the results in Fig. 2D: A simple calculation explains why you find dense septin packing on SLBs at 10 nM. Assuming a septin hexamer has an area of 4 × 24 nm² and the flow chamber has an area of 5 × 20 mm² and a height of 2 mm, you would need about 1 × 10^12 septin hexamers to cover the SLB. This number of septins in the volume of the flow chamber would correspond to a concentration of about 8.7 nM. It is not clear why the authors did not check lower septin hexamer concentrations as this would simply require further dilution of the stock solution. These results seem to be also in conflict with the AFM results, where individual septin filaments are observed at 12 nM and 24 nM. The authors should clarify this difference.
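
      The arithmetic in point 4 can be reproduced in a few lines. The sketch below uses only the dimensions stated above and assumes, as the estimate implicitly does, that a single close-packed monolayer covers the bilayer and that every hexamer in the chamber volume binds.

```python
AVOGADRO = 6.02214076e23           # molecules per mole

hexamer_area_nm2 = 4 * 24          # footprint of one hexamer, nm^2
chamber_area_mm2 = 5 * 20          # bilayer area, mm^2
chamber_height_mm = 2              # chamber height, mm

# hexamers needed to tile the bilayer (1 mm^2 = 1e12 nm^2)
chamber_area_nm2 = chamber_area_mm2 * 1e12
n_hexamers = chamber_area_nm2 / hexamer_area_nm2

# concentration if that many hexamers sit in the chamber volume (1 mm^3 = 1e-6 L)
volume_litres = chamber_area_mm2 * chamber_height_mm * 1e-6
conc_nM = n_hexamers / AVOGADRO / volume_litres * 1e9

print(f"hexamers for full coverage: {n_hexamers:.2e}")
print(f"equivalent bulk concentration: {conc_nM:.2f} nM")
```

      The result is roughly 1 × 10^12 hexamers and a concentration between 8 and 9 nM, in line with the figure quoted in the letter; at 10 nM the chamber therefore holds about enough protein for one complete layer.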

      5) The comments about mechanical stability of the lipid bound septins are unsurprising and not very conducive as they describe GTA fixed septins. Studies of lateral stability don't have to relate directly to enhanced cortex stability. It would have been more powerful to compare the stability of septin decorated GUVs. Sorry, but the discussion about septin layer height limitation is very speculative and would be much better founded if the authors would have done some more experiments. The claim that the layer is self-limiting is not in line with the TIRF data that shows a steady increase of fluorescence intensity at 500nM septin. It would have been good to add AFM and QCM data on samples with higher septin concentrations, e.g. 500nM, to prove that the layer remains indeed within 12-21nm. It would also be insightful to either test mixtures of ΔCC septins with full length septins or to generate septins only lacking the long coiled-coils or the short coiled-coils to support the conclusions of the authors.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 30 2020, follows.

      Summary

      Using a mouse model of melanoma, this report demonstrates the relevance of the CD300a immunoreceptor, specifically in dendritic cells (DCs), in tumor growth. It shows that the absence of CD300a is correlated with a higher number of regulatory T cells (Tregs) within the tumor microenvironment and therefore the tumor grows faster and survival decreases. Based on additional experiments, the authors propose a mechanism by which tumor-derived extracellular vesicles (TEVs) interact with CD300a in DCs, decreasing IFNbeta production which subsequently reduces the number of Tregs. In addition, data from melanoma patients show a correlation between overall survival and higher levels of CD300a expression in the tumor.

      Essential Revisions

      1) It is highly recommended to clearly demonstrate the role of IFNbeta in the proposed mechanism. In addition to using an anti-IFNbeta mAb in an in vitro culture (Figure 3D), other experiments must be performed, such as in vivo experiments with the anti-IFNbeta mAb. The authors have used this mAb in their previously published article (Nakahashi-Oda et al., Nature Immunology, 2016). Alternatively, in vivo experiments could also be performed with IFNAR1-like (IFN alpha and beta receptor 1 subunit) KO animals.

      In addition, is the observed increase in Tregs within the tumor in CD300a-/- animals due only to an increase in IFNbeta production by DCs? Are there other cytokines and/or cell-cell contact that may play a role? At least this should be discussed.

      2) Why are not all the experiments performed on CD300afl/fl Itgax-Cre mice instead of CD300a-/- mice? The experiments in Figures 2a, S2C, 3 and 4 should have been performed on CD300afl/fl Itgax-Cre mice. This is very important to state unequivocally that only CD300a in DCs is involved in the induction of an immune response capable of inhibiting tumor development.

      3) The authors found expansion of tumor-infiltrating Tregs in mice deficient in CD300a. However, no increase in Tregs was observed in tumor-draining lymph nodes. Did the authors assess the expression of Treg activation and proliferation molecular markers, such as CD25, CTLA4, GITR, CD39, CD73 or Ki67? If Treg expansion as a result of CD300a deficiency is indeed the cause of enhanced tumor growth, the authors should provide more evidence of a Treg suppressive response. For example, the authors can consider measuring the levels of co-stimulatory molecules (e.g. CD40, CD80 and CD86) on dendritic cells, which generally correlate with Treg activity and/or tumoral IL-2 concentration.

      4) PD-1 is the only marker analyzed to assess the exhausted status of CD8+ T cells infiltrating tumor lesions of CD300a-/- mice. Additional evidence of this functional status could be provided, such as expression of CTLA4, TIM3 or other immune checkpoints, or low Ki67 levels. Indeed, particularly in reference to the human setting, PD-1 is also a sign of T cell activation, usually expressed in T cells infiltrating highly immunogenic and hot tumors. Hence, it would be useful to have a broader characterization of immune effectors associated with the progressing tumor microenvironment when CD300a is lost.

      5) Since authors have Foxp3-reporter mice, they should confirm their data in Fig. 3D with natural / freshly isolated Tregs, unless they are suggesting that CD300a mainly prevents in situ conversion of intra-tumoral CD4+Foxp3- Tconv cells into Tregs.

      6) Given that the interaction between CD300a and phosphatidylserine (PS) is critical to CD300a activation, PS co-localization with CD300a ought to be included in confocal microscopy. In addition, the binding of CD300a to PS and PE, which are both upregulated in dead cells, implies that apoptotic bodies could also be shuttling comparable signaling. Can the authors exclude that these particles are present in the EV preparations? Furthermore, does tumor supernatant lose any effect when depleted of EVs? The latter evidence could significantly strengthen the exclusive involvement of exosomes in the process.

      7) Did authors validate the importance of PS in the context that they propose with an anti-PS blocking antibody? There are not many anti-PS blocking antibodies available and they might not block engagement with CD300a (see Nat Commun. 2016 Mar 14;7:10871). Nonetheless, this would be a good assay to demonstrate PS as the ligand that triggers CD300a to inhibit TLR3 and subsequent IFN-β production.

    1. Author Response

      Summary: This study tackles a difficult problem of understanding the basis for hippocampal theta rhythms through reduction of a highly detailed model, seeking to validate a reduced model that would be more amenable to analysis. The reviewers appreciated the attention to this challenging problem and the substantial work that went into it, but had several fundamental concerns about the methodology, interpretation, and reporting.

      We appreciate the detailed feedback provided to us by the reviewers and editors and we are pleased that there was an appreciation for the attention we have given to this challenging problem and the substantial work that went into it. We would like to thank the reviewers for their efforts.

      This feedback helped us realize that there was possibly too much presented in this single paper and moving forward, we will split the work into two papers. While we agree with some of the feedback, we think that some aspects were misunderstood, which may be partially due to the extensiveness of the submitted paper. Below we provide general responses to the points raised, leaving specifics for elsewhere.

      Reviewer #1:

      This study takes two existing models of hippocampal theta rhythm generation, a reduced one with two populations of Izhikevich neurons, and a detailed one with numerous biophysically detailed neuronal models. The authors do some parameter variation on 3 parameters in the reduced model and ask which are sensitive control parameters. They then examine control of theta frequency through a phase response curve and propose an inhibition-based tuning mechanism. They then map between the reduced and detailed model, and find that connectivity but not synaptic weights are consistent. They take a subset of the detailed model and do a 2 parameter exploration of rhythm generation. They compare phenomenological outcomes of the model with results from an optogenetic experiment to support their interpretation of an inhibition-based tuning mechanism for intrinsic generation of theta rhythm in the hippocampus.

      This statement summarizes our work to a certain extent but it misses a key aspect – the ‘mapping’ between the minimal (that this reviewer refers to as ‘reduced’) and detailed model is what is used to rationalize and motivate the subsequent extensive 2-parametric exploration in a ‘piece’ of the detailed model (which we termed the segment model). We will aim to write this more clearly in an edited version.

      General comments:

      1) The paper shows the existence of potential rhythm mechanisms, but the approach is illustrative rather than definitive. For example, in a very lengthy section on parameter exploration in the reduced model, the authors find some domains which do and don't exhibit rhythms. Lacking further exploration or analytic results, it is hard to see if their interpretations are conclusive.

      We agree that these are interpretations (not meant to be conclusive), but the goal was to use the minimal model to develop further insight as we did with a hypothesis development presented in the middle of the paper.

      2) The authors present too much detail on too few dimensions of parameters. An exhaustive parameter search would normally go systematically through all parameters, and be digested in an automated manner. For reporting this, a condensed summary would be presented. Here the authors look at 3 parameters for the reduced model and 2 parameters in the detailed one - far fewer than the available parameter set. They discuss the properties of these parameter choices at length, but then pick out a couple of illustrative points in the parameter domain for further pursuit. This leaves the reader rather overwhelmed on the one hand, and is not a convincing thorough exploration of all parameters of the system on the other.

      See above.

      3) I wonder if the 'minimal' model is minimal enough. Clearly it is well-supplied with free parameters. Is there a simpler mapping to rate models or even dynamical systems that might provide more complete insights, albeit at the risk of further abstraction?

      We agree that models can be even more minimal, but the goal here was not to further analyse the minimal model through simpler mappings or otherwise. Rather, it was to exploit linkages between the minimal model and detailed models to help understand how theta rhythms could be generated in the biological system (Goutagny et al. 2009 intrinsic theta), using a piece of the detailed model as a ‘biological proxy’.

      4) Around line 560 and Fig 12 the authors conclude that only case a) is consistent with experiment. While it is important to match data to experiment, here the match is phenomenological. It misses the opportunity to do a quantitative match which could be done by taking advantage of the biological detail in the model.

      5) The paper is far too long and is a difficult read. Many items of discussion are interspersed in the results, for example around line 335 among many others.

      We will split the paper into two.

      Reviewer #2:

      In this work Chatzikalymniou et al. use models of hippocampus of different complexities to understand the emergence and robustness of intra-hippocampal theta rhythms. They use a segment of a highly detailed model as a bridge to leverage insights from a minimal model of spiking point neurons to the level of a full hippocampus. This is an interesting approach as the minimal model is more amenable to analysis and probing the parameter space while the detailed model is potentially closer to experiment yet difficult and costly to explore.

      We completely agree.

      The study of network problems is very demanding: there are no good ways to address the robustness of realistic models, and the size of the parameter space makes brute force approaches impractical. The angle of attack proposed here is interesting. While this is surely not the only approach tenable, it is sensible, justified, and actually implemented. The amount of work which entered this project is clear. I essentially accept the proposed reasoning and the hypotheses put forward. The few remarks I have are rather minor, but I think they merit a response.

      1) l. 528-530 "This is particularly noticeable in Figure 9D where theta rhythms are present and can be seen to be due to the PYR cell population firing in bursts of theta frequency. Even more, we notice that the pattern of the input current to the PYR cells isn't theta-paced or periodic (see Figure 10Bi)."

      This is a loose statement. When you look at the raw LFP theta is also not apparent (e.g. Figure 9.Ei or Fi). What happens once you look at the spectrum of the activity shown in 10.Bi? Do you see theta or not?

      We agree – to be done.
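
      The spectral check requested in point 1 could take roughly the following form. The sketch evaluates a direct discrete Fourier sum at integer frequencies on a synthetic trace; the sampling rate, the 3-12 Hz definition of the theta band and the trace itself (a weak 8 Hz component under a larger 60 Hz component) are all assumptions for illustration and do not represent the input current in Figure 10Bi.

```python
import cmath
import math

def power_at(signal, fs, freq_hz):
    """Power of `signal` at a single frequency via a direct Fourier sum."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, x in enumerate(signal))
    return abs(acc) ** 2 / n

fs = 1000.0                                # hypothetical sampling rate, Hz
t = [k / fs for k in range(2000)]          # 2 s of samples

# hypothetical trace: weak 8 Hz rhythm hidden under a larger 60 Hz component
trace = [0.3 * math.sin(2 * math.pi * 8 * s) + 1.0 * math.sin(2 * math.pi * 60 * s)
         for s in t]

theta_band = range(3, 13)                  # assumed theta band, 3-12 Hz
theta_power = {f: power_at(trace, fs, f) for f in theta_band}
peak_freq = max(theta_power, key=theta_power.get)
print(f"largest theta-band power at {peak_freq} Hz")
```

      A rhythm that is hard to see in the raw trace can still produce a clear spectral peak, which is why the spectrum rather than visual inspection is the appropriate test of whether the input is theta-paced.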

      2) l. 562 "This implies that the different E-I balances in the segment model that allow LFP theta rhythms to emerge are not all consistent with the experimental data, and by extension, the biological system."

      This is speculative. We do not know how generic the results of Amilhon et al. are. They showed what you can find experimentally, not what you cannot find experimentally. I agree with the statement from l.581, though : "Thus, from the perspective of the experiments of Amilhon et al. (2015) theta rhythm generation via a case a type pathway seems more biologically realistic ..."

      We agree – to edit accordingly.

      3) There are several problems with access to code and data provided in the manuscript.

      l. 986, 1113 - osf.io does not give access
      l. 1027 - bitbucket of bezaire does not allow access
      l. 1030 - simtracker link is down
      l. 1129, 1141 - the github link does not exist (private repo?)

      Our apologies that all of these were not made public as intended.

      4) l. 1017 - Afferent inputs from CA3 and EC are also included in the form of Poisson-distributed spiking units from artificial CA3 and EC cells.

      Not obvious if Poisson is adequate here - did you check on the statistics of inputs? Any references? Different input statistics may induce specific correlations which might affect the size of fluctuations of the input current. I do not think this would be a significant effect here unless the departure from Poisson is highly significant. Any comments might be useful.

      We were simply using the same input protocol setup done by Bezaire et al. 2016.
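
      One simple way to check input statistics, as point 4 asks, is the coefficient of variation (CV) of the inter-spike intervals, which is close to 1 for a Poisson process and close to 0 for a regular train. The sketch below applies it to synthetic trains; the rate, the train lengths and the helper name `isi_cv` are hypothetical and unrelated to the afferent inputs of the model.

```python
import random
import statistics

def isi_cv(spike_times):
    """Coefficient of variation of the inter-spike intervals."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    return statistics.stdev(isis) / statistics.mean(isis)

rng = random.Random(1)
rate_hz = 5.0                      # hypothetical firing rate

# Poisson train: exponentially distributed intervals
t = 0.0
poisson_spikes = []
for _ in range(5000):
    t += rng.expovariate(rate_hz)
    poisson_spikes.append(t)

# perfectly regular train at the same rate, for contrast
regular_spikes = [k / rate_hz for k in range(5000)]

cv_poisson = isi_cv(poisson_spikes)
cv_regular = isi_cv(regular_spikes)
print(f"CV (Poisson) = {cv_poisson:.2f}, CV (regular) = {cv_regular:.2e}")
```

      Recorded CA3 or EC spike trains whose CV departs strongly from 1 would indicate the kind of non-Poisson statistics the reviewer has in mind.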

      5) l. 909 - "Euler integration method is used to integrate the cell equations with a timestep of 0.1 msec."

      This seems dangerous. Is the computation so costly that more advanced integration is not viable?

      Our apologies as the timestep was erroneously reported. At initial stages of the project, larger stepsizes were attempted to speed up computation. The stepsize/integration used were as in the minimal model of Ferguson et al. (2017). That is, Euler integration with a 0.04 ms stepsize for the cell simulations and Runge-Kutta for network simulations.
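
      For readers unfamiliar with the scheme under discussion, a minimal sketch of forward-Euler integration of a single Izhikevich-type neuron with a 0.04 ms step is given below. It uses the widely used dimensionless form of the model with textbook regular-spiking parameters (`a`, `b`, `c`, `d`) and an arbitrary constant input; these are assumptions for illustration and are not the fitted parameters of the Ferguson et al. (2017) cells.

```python
def simulate_izhikevich(i_ext, t_max_ms=500.0, dt_ms=0.04,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler integration of the dimensionless Izhikevich neuron.
    Returns the list of spike times in ms."""
    v, u = c, b * c                     # start at the reset potential
    spike_times = []
    steps = int(round(t_max_ms / dt_ms))
    for step in range(steps):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_ext
        du = a * (b * v - u)
        v += dt_ms * dv                 # Euler update of the voltage
        u += dt_ms * du                 # Euler update of the recovery variable
        if v >= 30.0:                   # spike cutoff: record, then reset
            spike_times.append((step + 1) * dt_ms)
            v = c
            u += d
    return spike_times

driven = simulate_izhikevich(i_ext=10.0)   # constant suprathreshold input
silent = simulate_izhikevich(i_ext=0.0)    # no input
print(f"spikes with input: {len(driven)}, without input: {len(silent)}")
```

      Re-running the same function with a larger `dt_ms` and comparing spike times is a quick way to judge whether a given step size is adequate for the stiff upstroke of the spike.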

      Reviewer #3:

      [...] I have a number of methodological issues with the paper. First, both models should be validated against experimental evidence given that the experimental results exist. The validation of a "minimal" model with data from another model is circumstantial and useful to link two models, but in no way is a scientific validation, in my opinion. Second, the model reduction arguments are simply taken as a piece of a large model. This is in no way a systematic reduction, which the authors should provide. In the absence of that, the two models are simply two different models. Third, it is not clear what aspects of the mechanisms cannot be investigated using the larger models that require the reduced models, given that the models do not necessarily match. Fourth, the concept of a minimal model should be clearly explained. They used caricature (toy) models (2D quadratic models, aka Izhikevich models) combined with biophysically plausible descriptions of synapses. The model parameters in 2D quadratic models are not biophysical as the authors acknowledge, but they can be related to biophysical parameters through the specific equations provided in Rotstein (JCNS, 2015) and Turquist & Rotstein (Encyclopedia of Computational Neuroscience, 2018). In fact, they can represent either h-currents or M-currents. I suggest the authors determine this from these references. In this framework, the dynamics would result from a combination of these currents and persistent sodium or fast (transient) sodium activation. Fifth, from the original paper (Ferguson et al., 2017) their minimal model has 500 PV and 10000 PYR cells (I couldn't find the number of PV cells in this paper, but I assumed they were as in the original paper). This is not what I would call a minimal model. It is minimal only in comparison with the more detailed model. While this is a matter of semantics, it should be clarified since there are other minimal model approaches in the literature (e.g., Kopell group, Erdi group). Related to these models, it is typically assumed that the relationship between PYR to PV is 5/1. This is certainly not holy, but seems to have been validated. Here it is 20/1. Is there any reason for that? Sixth, the networks are so big that it is very difficult to gain some profound insight. What is it about the large networks and their contribution to the generation of theta activity that cannot be learned from "more minimal" networks?

      The published minimal model (Ferguson et al. 2017) used experimental data constraints on EPSC and IPSC ratios to come up with the prediction of connectivity. As this connectivity was found in the detailed model (with empirically determined connections), this can be considered a form of validation for the minimal model’s predictions if one considers the detailed as a ‘biological proxy’.

      We agree that the segment model is not a systematic reduction of the detailed model. The segment model reasonably represents a ‘piece’ of the CA1 microcircuit that was experimentally shown to be able to generate oscillations on its own (see Goutagny et al. 2009 Supplementary figure 11). This was the assumption in determining the network size of the previously published minimal model. A large network is needed in order to appropriately capture the very large EPSCs relative to IPSCs that are present in the experiment. This is the essence of why smaller network sizes cannot be justified.

      Because of these concerns and the development of the paper, I believe the paper is about the comparison between two existing models that the authors have constructed in the past and the parameter exploration of these models.

      We do not fully agree with this statement. The minimal model was constructed by us (Ferguson et al. 2017), but the detailed model was painstakingly constructed in a state-of-the-art fashion by Bezaire et al. 2016. We used a ‘piece’ of this detailed model (see above) so that we could make ‘links’ with the minimal model in understanding the generation of intrinsic theta rhythms. This ‘piece’ also allowed us to do the extensive exploration for the additional results presented. The paper is about taking advantage of the comparison and linkage of minimal and detailed models to show how theta rhythms are generated and their frequencies controlled.

      I find the paper extremely difficult to read. It is not about the narrative, but about the organization of the results and the lack (or scarcity) of clear statements. I can't seem to be able to easily extract the principles that emerge from the analysis. There are a large number of cases and data, but what do we get out of that? Perhaps creating "telling titles" for each section/subsection would help, where the main result is the title of the section/subsection. I also find an issue with the acronyms. One has to keep track of numbers, cases, acronyms (N, B), etc. All that gets in the way of the understanding. I believe figures would help.

      Another confusing issue in the paper is the use of the concept of "building blocks". I am not opposed to the use of these words, on the contrary. But building blocks are typically associated with the model structure (e.g., currents in a neuron, neurons in a network). PIR, SFA and Rheo are a different type of building blocks, which I would call "functional building blocks". They are building blocks in a functional world of model behavior, but not in the world of modeling components. For example, PIR can be instantiated by different combinations of ionic currents receiving inhibitory inputs. Also, the definitions of the building blocks and how they are quantified should be clearly stated in a separate section or subsection.

      The concept of building blocks was directly taken from Gjorgjieva et al. 2016 as cited in Ferguson et al. 2017 when we first used it, but also cited in the present paper, but for a different point.

      I disagree with the authors' statement in lines 214-216, related to Fig. 4. They claim that "From them, we can say that the PYR cell firing does not specifically occur because of their IPSCs, as spiking can occur before or just after its IPSCs." Figure 4 (top, left panel) suggests the opposite, but instead of being a PIR mechanism, it is a "building-up" of the "adaptation" current in the PYR cell. (By "adaptation" current I mean the current corresponding to the second variable in the model. If this variable were the gating variable of the h-current, it would be the same type of mechanism suggested in Rotstein et al. (2005) and in the models presented in Stark et al. (2013).) The mechanism operates as follows: the first PV-spike (not shown in the figure) causes a rebound, which is not strong enough to produce a PYR spike before a new PV spike occurs (the first in the figure); this second PV-spike causes a stronger rebound (it is super clear in the figure), which is still not strong enough to produce a PYR-spike before the new PV-spike arrives; this third PV spike produces a still stronger rebound, which now causes a PYR spike. The fact that this PYR spike occurs before the PV spike is not indicative of the authors' conclusions, but quite the opposite.

      The authors should check whether the mechanistic hypothesis I just described, which is consistent with Fig. 4 (top, left panel), is also consistent with the rest of the panels and, more generally, with their modeling results and the experimental data. They should also check whether it is general and, if not, under what conditions it holds. If my hypothesis ends up not being supported, then they should come up with an alternative hypothesis. The condition the authors state about the parameter "b" and PIR is not necessarily general. PIR and other phenomena are typically controlled by the combined effect of more than one parameter. As it stands, their basic assumption behind the PRC is not necessarily valid.

      The subsequent hypothesis (about PYR bursting) is called into question in view of the previous comments. The experimental data should be able to provide an answer.

      See above.

      The authors should provide a more detailed explanation and justification for the presence of an inhibitory "bolus". What would the timescale be? Again, the data should provide evidence of that. In their discussion about the PRC, the authors essentially conclude what they hypothesize, but this conclusion is based on the "bolus" idea. The validity of this should be revised.

      The discussion about degeneracy of the theta rhythm generation is interesting. However, because of the size and complexity of the models, this degeneracy is expected. Their minimal modeling approach does not help in shedding any additional light. In addition, the authors do not discuss the intrinsic sources of degeneracy and how they interact with the intrinsic ones.

      The last two sections were difficult to follow and I found them anecdotal. I was expecting a deeper mechanistic analysis. However, I have to acknowledge that because of my difficulty in following the paper, I might have missed important issues.

      These last sections are where the ‘piece’ of the detailed model (that we termed the segment model) - a ‘biological proxy’ - essentially shows that the theta rhythm is initiated from the pyramidal cells and that the frequency is controlled by the net input to the pyramidal cells.

      The discussion is extensive, exhaustive and interesting. But it is not clear how the paper results are integrated in this big picture, except for a number of generic statements.

      The idea that the hippocampus has the circuitry to produce theta oscillations without the need for medial septum input has been proposed before by Gillies et al. (2003) and in the models of Rotstein et al. (2005) and Orban et al. (2005). But the idea from this work is not that the hippocampus (CA1) is a pacemaker, but rather what we now call a "resonator". To claim that the MS is simply an amplifier of an existing oscillator goes against the existing evidence.

      We agree that many models show theta generation without explicit mention of the medial septum. However, what our modelling work shows is how the intrinsic theta rhythm is generated – it is initiated by the pyramidal cells (large enough network size with some recurrent connections) and the control of the theta frequency (LFP) is due to the net input to the pyramidal cells – this is the main claim of the paper. This is explicitly in reference to an intrinsic theta rhythm experimental context. From there, we suggest that MS and other inputs could amplify an already existing intrinsic rhythm in the CA1 microcircuit.

      References:

      Bezaire, M. J., Raikov, I., Burk, K., Vyas, D., & Soltesz, I. (2016). Interneuronal mechanisms of hippocampal theta oscillation in a full-scale model of the rodent CA1 circuit. eLife, 5, e18566. https://doi.org/10.7554/eLife.18566

      Ferguson, K. A., Chatzikalymniou, A. P., & Skinner, F. K. (2017). Combining Theory, Model, and Experiment to Explain How Intrinsic Theta Rhythms Are Generated in an In Vitro Whole Hippocampus Preparation without Oscillatory Inputs. eNeuro, 4(4). https://doi.org/10.1523/ENEURO.0131-17.2017

      Gjorgjieva, J., Drion, G., & Marder, E. (2016). Computational implications of biophysical diversity and multiple timescales in neurons and synapses for circuit performance. Current Opinion in Neurobiology, 37, 44–52. https://doi.org/10.1016/j.conb.2015.12.008

      Goutagny, R., Jackson, J., & Williams, S. (2009). Self-generated theta oscillations in the hippocampus. Nature Neuroscience, 12(12), 1491–1493. https://doi.org/10.1038/nn.2440

    2. Reviewer #3:

      The authors combine minimal and detailed models of hippocampal theta rhythm generation to understand the underlying mechanisms at the cellular-network level. In their three-step approach, they extend previous minimal models, compare these minimal models with more detailed models, and use a piece (segment) of the detailed model to compare it to the minimal models.

      I have a number of methodological issues with the paper. First, both models should be validated against experimental evidence given that the experimental results exist. The validation of a "minimal" model with data from another model is circumstantial and useful to link two models, but in no way is a scientific validation, in my opinion. Second, the model reduction arguments are simply taken as a piece of a large model. This is in no way a systematic reduction, which the authors should provide. In the absence of that, the two models are simply two different models. Third, it is not clear what aspects of the mechanisms cannot be investigated using the larger models that require the reduced models, given that the models do not necessarily match. Fourth, the concept of a minimal model should be clearly explained. They used caricature (toy) models (2D quadratic models, aka Izhikevich models) combined with biophysically plausible descriptions of synapses. The model parameters in 2D quadratic models are not biophysical as the authors acknowledge, but they can be related to biophysical parameters through the specific equations provided in Rotstein (JCNS, 2015) and Turquist & Rotstein (Encyclopedia of Computational Neuroscience, 2018). In fact, they can represent either h-currents or M-currents. I suggest the authors determine this from these references. In this framework, the dynamics would result from a combination of these currents and persistent sodium or fast (transient) sodium activation. Fifth, from the original paper (Ferguson et al., 2017) their minimal model has 500 PV and 10000 PYR cells (I couldn't find the number of PV cells in this paper, but I assumed they were as in the original paper). This is not what I would call a minimal model. It is minimal only in comparison with the more detailed model. While this is a matter of semantics, it should be clarified since there are other minimal model approaches in the literature (e.g., Kopell group, Erdi group). Related to these models, it is typically assumed that the ratio of PYR to PV cells is 5/1. This is certainly not holy, but seems to have been validated. Here it is 20/1. Is there any reason for that? Sixth, the networks are so big that it is very difficult to gain some profound insight. What is it about the large networks and their contribution to the generation of theta activity that cannot be learned from "more minimal" networks?

      Because of these concerns and the development of the paper, I believe the paper is about the comparison between two existing models that the authors have constructed in the past and the parameter exploration of these models.

      I find the paper extremely difficult to read. It is not about the narrative, but about the organization of the results and the lack (or scarcity) of clear statements. I cannot easily extract the principles that emerge from the analysis. There are a large number of cases and data, but what do we get out of that? Perhaps creating "telling titles" for each section/subsection would help, where the main result is the title of the section/subsection. I also find an issue with the acronyms. One has to keep track of numbers, cases, acronyms (N, B), etc. All that gets in the way of the understanding. I believe figures would help.

      Another confusing issue in the paper is the use of the concept of "building blocks". I am not opposed to the use of these words, on the contrary. But building blocks are typically associated with the model structure (e.g., currents in a neuron, neurons in a network). PIR, SFA and Rheo are a different type of building blocks, which I would call "functional building blocks". They are building blocks in a functional world of model behavior, but not in the world of modeling components. For example, PIR can be instantiated by different combinations of ionic currents receiving inhibitory inputs. Also, the definitions of the building blocks and how they are quantified should be clearly stated in a separate section or subsection.

      I disagree with the authors' statement in lines 214-216, related to Fig. 4. They claim that "From them, we can say that the PYR cell firing does not specifically occur because of their IPSCs, as spiking can occur before or just after its IPSCs." Figure 4 (top, left panel) suggests the opposite, but instead of being a PIR mechanism, it is a "building-up" of the "adaptation" current in the PYR cell. (By "adaptation" current I mean the current corresponding to the second variable in the model. If this variable were the gating variable of the h-current, it would be the same type of mechanism suggested in Rotstein et al. (2005) and in the models presented in Stark et al. (2013).) The mechanism operates as follows: the first PV spike (not shown in the figure) causes a rebound, which is not strong enough to produce a PYR spike before a new PV spike occurs (the first in the figure). This second PV spike causes a stronger rebound (it is very clear in the figure), which is still not strong enough to produce a PYR spike before the next PV spike arrives. This third PV spike produces a still stronger rebound, which now causes a PYR spike. The fact that this PYR spike occurs before the PV spike does not support the authors' conclusions, but quite the opposite.

      The authors should check whether the mechanistic hypothesis I just described, which is consistent with Fig. 4 (top, left panel), is also consistent with the rest of the panels and, more generally, with their modeling results and the experimental data. They should also check whether it is general and, if not, under what conditions it holds. If my hypothesis ends up not being supported, then they should come up with an alternative hypothesis. The condition the authors state about the parameter "b" and PIR is not necessarily general. PIR and other phenomena are typically controlled by the combined effect of more than one parameter. As it stands, their basic assumption behind the PRC is not necessarily valid.

      The subsequent hypothesis (about PYR bursting) is called into question in view of the previous comments. The experimental data should be able to provide an answer.

      The authors should provide a more detailed explanation and justification for the presence of an inhibitory "bolus". What would the timescale be? Again, the data should provide evidence of that. In their discussion about the PRC, the authors essentially conclude what they hypothesize, but this conclusion is based on the "bolus" idea. The validity of this should be revised.

      The discussion about degeneracy of the theta rhythm generation is interesting. However, because of the size and complexity of the models, this degeneracy is expected. Their minimal modeling approach does not help in shedding any additional light. In addition, the authors do not discuss the intrinsic sources of degeneracy and how they interact with the intrinsic ones.

      The last two sections were difficult to follow and I found them anecdotal. I was expecting a deeper mechanistic analysis. However, I have to acknowledge that because of my difficulty in following the paper, I might have missed important issues.

      The discussion is extensive, exhaustive and interesting. But it is not clear how the paper results are integrated in this big picture, except for a number of generic statements.

      The idea that the hippocampus has the circuitry to produce theta oscillations without the need for medial septum input has been proposed before by Gillies et al. (2003) and in the models of Rotstein et al. (2005) and Orban et al. (2005). But the idea from this work is not that the hippocampus (CA1) is a pacemaker, but rather what we now call a "resonator". To claim that the MS is simply an amplifier of an existing oscillator goes against the existing evidence.

    3. Reviewer #2:

      In this work Chatzikalymniou et al. use models of the hippocampus of different complexities to understand the emergence and robustness of intra-hippocampal theta rhythms. They use a segment of a highly detailed model as a bridge to carry insights from a minimal model of spiking point neurons to the level of a full hippocampus. This is an interesting approach, as the minimal model is more amenable to analysis and to probing the parameter space, while the detailed model is potentially closer to experiment yet difficult and costly to explore.

      The study of network problems is very demanding; there are no good ways to address the robustness of realistic models, and the size of the parameter space makes brute-force approaches impractical. The angle of attack proposed here is interesting. While this is surely not the only tenable approach, it is sensible, justified, and actually implemented. The amount of work which entered this project is clear. I essentially accept the proposed reasoning and the hypotheses put forward. The few remarks I have are rather minor, but I think they merit a response.

      1) l. 528-530 "This is particularly noticeable in Figure 9D where theta rhythms are present and can be seen to be due to the PYR cell population firing in bursts of theta frequency. Even more, we notice that the pattern of the input current to the PYR cells isn't theta-paced or periodic (see Figure 10Bi)."

      This is a loose statement. When you look at the raw LFP theta is also not apparent (e.g. Figure 9.Ei or Fi). What happens once you look at the spectrum of the activity shown in 10.Bi? Do you see theta or not?
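
      One way to answer this question is to estimate the power of the input-current trace at a set of candidate frequencies and see whether the largest component falls in the theta band. The following is only a minimal sketch of that check, not the authors' analysis: the sampling rate, the 3-12 Hz band limits, and the synthetic noisy trace are assumptions chosen for illustration.

```python
import math
import random

def power_at(signal, fs, freq):
    """Power of `signal` at one frequency, via a direct DFT projection."""
    n = len(signal)
    mean = sum(signal) / n
    re = im = 0.0
    for i, x in enumerate(signal):
        phase = 2.0 * math.pi * freq * i / fs
        re += (x - mean) * math.cos(phase)
        im += (x - mean) * math.sin(phase)
    return (re * re + im * im) / (n * n)

def band_summary(signal, fs, freqs, band=(3.0, 12.0)):
    """Return the peak frequency and the fraction of power inside `band`."""
    powers = {f: power_at(signal, fs, f) for f in freqs}
    peak = max(powers, key=powers.get)
    in_band = sum(p for f, p in powers.items() if band[0] <= f <= band[1])
    return peak, in_band / sum(powers.values())

# Hypothetical stand-in for a current trace: an 8 Hz component that is
# hard to see by eye because it is buried in unit-variance noise.
rng = random.Random(0)
fs = 200.0                      # assumed sampling rate (Hz)
n = int(4 * fs)                 # 4 s of data
trace = [0.5 * math.sin(2.0 * math.pi * 8.0 * i / fs) + rng.gauss(0.0, 1.0)
         for i in range(n)]
freqs = [float(f) for f in range(1, 31)]   # candidate frequencies, 1-30 Hz

peak, theta_fraction = band_summary(trace, fs, freqs)
print(f"peak at {peak:.0f} Hz; fraction of 1-30 Hz power in 3-12 Hz: {theta_fraction:.2f}")
```

      A rhythm that is not visible in the raw trace can still dominate the spectrum, which is why the spectral check is more informative than visual inspection.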

      2) l. 562 "This implies that the different E-I balances in the segment model that allow LFP theta rhythms to emerge are not all consistent with the experimental data, and by extension, the biological system."

      This is speculative. We do not know how generic the results of Amilhon et al. are. They showed what you can find experimentally, not what you cannot find experimentally. I agree with the statement from l.581, though : "Thus, from the perspective of the experiments of Amilhon et al. (2015) theta rhythm generation via a case a type pathway seems more biologically realistic ..."

      3) There are several problems with access to code and data provided in the manuscript.

      l. 986, 1113 - osf.io does not give access
      l. 1027 - bitbucket of bezaire does not allow access
      l. 1030 - simtracker link is down
      l. 1129, 1141 - the github link does not exist (private repo?)

      4) l. 1017 - Afferent inputs from CA3 and EC are also included in the form of Poisson-distributed spiking units from artificial CA3 and EC cells.

      Not obvious if Poisson is adequate here - did you check on the statistics of inputs? Any references? Different input statistics may induce specific correlations which might affect the size of fluctuations of the input current. I do not think this would be a significant effect here unless the departure from Poisson is highly significant. Any comments might be useful.
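
      A simple way to check the input statistics is to compute the coefficient of variation (CV) of the inter-spike intervals and the Fano factor of windowed spike counts, both of which are close to 1 for a homogeneous Poisson process. The sketch below assumes spike times in seconds; the rate, duration, and window size are illustrative values, not taken from the manuscript.

```python
import random
import statistics

def poisson_spike_train(rate_hz, duration_s, rng):
    """Homogeneous Poisson spike times built from exponential intervals."""
    times = []
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_hz)
    return times

def isi_cv(spike_times):
    """Coefficient of variation of the inter-spike intervals."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    return statistics.pstdev(isis) / statistics.fmean(isis)

def fano_factor(spike_times, duration_s, window_s):
    """Variance-to-mean ratio of spike counts in non-overlapping windows."""
    n_windows = int(duration_s / window_s)
    counts = [0] * n_windows
    for t in spike_times:
        idx = int(t / window_s)
        if idx < n_windows:
            counts[idx] += 1
    return statistics.pvariance(counts) / statistics.fmean(counts)

rng = random.Random(1)
spikes = poisson_spike_train(rate_hz=20.0, duration_s=500.0, rng=rng)
cv = isi_cv(spikes)
ff = fano_factor(spikes, duration_s=500.0, window_s=0.5)

# A perfectly regular train at the same mean rate, for contrast.
regular = [0.05 * i for i in range(1, 10000)]
cv_regular = isi_cv(regular)

print(f"Poisson train: CV = {cv:.2f}, Fano factor = {ff:.2f}")
print(f"Regular train: CV = {cv_regular:.2e}")
```

      Values well away from 1 in recorded or simulated afferent activity would indicate a departure from Poisson statistics large enough to matter for the fluctuations of the input current.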

      5) l. 909 - "Euler integration method is used to integrate the cell equations with a timestep of 0.1 msec."

      This seems dangerous. Is the computation so costly that more advanced integration is not viable?
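
      Whether a 0.1 ms forward Euler step is adequate can be checked empirically by repeating a simulation with a smaller step and comparing spike counts and spike times. The sketch below does this for a single Izhikevich-type neuron under constant input. The parameter values are the textbook regular-spiking set and are an assumption; they are not the values used in the manuscript's models.

```python
def simulate_izhikevich(dt_ms, t_stop_ms=500.0, current=10.0,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward Euler integration of a 2D quadratic (Izhikevich) neuron.

    Returns the list of spike times in ms. Parameters are illustrative.
    """
    v = -65.0
    u = b * v
    spike_times = []
    n_steps = int(round(t_stop_ms / dt_ms))
    for step in range(n_steps):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt_ms * dv
        u += dt_ms * du
        if v >= 30.0:               # spike cutoff, then reset
            spike_times.append((step + 1) * dt_ms)
            v = c
            u += d
    return spike_times

coarse = simulate_izhikevich(dt_ms=0.1)    # step size quoted in the manuscript
fine = simulate_izhikevich(dt_ms=0.01)     # ten times smaller, for reference

first_spike_shift = abs(coarse[0] - fine[0])
print(f"spikes: {len(coarse)} (dt = 0.1 ms) vs {len(fine)} (dt = 0.01 ms)")
print(f"first-spike time shift: {first_spike_shift:.3f} ms")
```

      If spike counts and times agree closely across step sizes for the cell types and input ranges actually used, the simple scheme is probably acceptable; if they diverge, a smaller step or a higher-order method would be warranted.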

    4. Reviewer #1:

      This study takes two existing models of hippocampal theta rhythm generation, a reduced one with two populations of Izhikevich neurons, and a detailed one with numerous biophysically detailed neuronal models. The authors do some parameter variation on 3 parameters in the reduced model and ask which are sensitive control parameters. They then examine control of theta frequency through a phase response curve and propose an inhibition-based tuning mechanism. They then map between the reduced and detailed model, and find that connectivity but not synaptic weights are consistent. They take a subset of the detailed model and do a 2 parameter exploration of rhythm generation. They compare phenomenological outcomes of the model with results from an optogenetic experiment to support their interpretation of an inhibition-based tuning mechanism for intrinsic generation of theta rhythm in the hippocampus.

      General comments:

      1) The paper shows the existence of potential rhythm mechanisms, but the approach is illustrative rather than definitive. For example, in a very lengthy section on parameter exploration in the reduced model, the authors find some domains which do and don't exhibit rhythms. Lacking further exploration or analytic results, it is hard to see if their interpretations are conclusive.

      2) The authors present too much detail on too few dimensions of parameters. An exhaustive parameter search would normally go systematically through all parameters, and be digested in an automated manner. For reporting this, a condensed summary would be presented. Here the authors look at 3 parameters for the reduced model and 2 parameters in the detailed one - far fewer than the available parameter set. They discuss the properties of these parameter choices at length, but then pick out a couple of illustrative points in the parameter domain for further pursuit. This leaves the reader rather overwhelmed on the one hand, and is not a convincing thorough exploration of all parameters of the system on the other.

      3) I wonder if the 'minimal' model is minimal enough. Clearly it is well-supplied with free parameters. Is there a simpler mapping to rate models or even dynamical systems that might provide more complete insights, albeit at the risk of further abstraction?

      4) Around line 560 and Fig 12 the authors conclude that only case a) is consistent with experiment. While it is important to match data to experiment, here the match is phenomenological. It misses the opportunity to do a quantitative match which could be done by taking advantage of the biological detail in the model.

      5) The paper is far too long and is a difficult read. Many items of discussion are interspersed in the results, for example around line 335 among many others.

    5. Summary: This study tackles a difficult problem of understanding the basis for hippocampal theta rhythms through reduction of a highly detailed model, seeking to validate a reduced model that would be more amenable to analysis. The reviewers appreciated the attention to this challenging problem and the substantial work that went into it, but had several fundamental concerns about the methodology, interpretation, and reporting.

    1. Reviewer #3:

      The study by Jackson et al. characterizes the progression of the degeneration of axons and dendrites, including metrics on density and dynamics of dendritic spines and terminaux boutons (TBs), in the rTg4510 transgenic mouse model. The authors describe a decrease in the density of both structures, spines and TBs, as well as degeneration of neurites. Repression of the expression of the mutated version of tau was able to partially mitigate some of the negative effects observed in the non-repressed condition. When degeneration of the neuronal process was observed, the loss of a dendritic branch was preceded by a sharp increase in the loss of dendritic spines, while axonal loss was preceded by a long-lasting and progressive loss of TBs. While the findings are interesting, there are several concerns that dampened enthusiasm for the study:

      1) The data obtained with the rTg4510 mouse model must be very carefully interpreted given that the disruption of the endogenous gene Fgf14 that occurs in this mouse model contributes significantly to the neurodegenerative phenotype (Gamache et al., 2019). While the authors acknowledge the possibility that genetic factors other than tau hyperphosphorylation may contribute to the rTg4510 pathology, the results must be put into the perspective of the mouse model rather than into the perspective of the tauopathy exclusively. In this sense, it would be recommended that the caveats of the mouse model be included in the introduction.

      2) The authors mention neither the sex of the animals used in the study nor how many mice of each sex were included in each experimental group. This is an important matter because it has been described that the rTg4510 mouse model presents with sex differences in the degree of accumulation of tau (Yue et al., 2009; Song et al., 2015).

      3) A big concern is the identity of the neurons labeled. The strategy to label cells is very unspecific and no details are given on their identity. Different subtypes of pyramidal neurons with different densities of dendritic spines and axon boutons may be mixed up in different proportions in each group and batch. In fact, the resilience of different neuron subtypes to the pathology may be different too. If the authors cannot pinpoint the identity of the neurons imaged, an elaboration on this issue must be included in the manuscript. In addition, the manuscript must include representative images of the cortex of both genotypes showing the labeling pattern obtained with their approach. The authors should also add more information about the vector.

      4) How did the authors estimate the point of divergence between genotypes? The authors mentioned 30-35 wk and 50 wk as points of divergence - which should be interpreted as the first time points where the differences between groups are significant - in lines 180-183. While the Wald test and the Akaike information criterion indicate that genotype is the factor with the most influence on the model estimates, they do not compute statistical differences between genotypes at a given time point. Regarding the GAMMs, some fits suggest that data at earlier points may be very different between groups (i.e., Fig 2E, 5C, 6C). Is the decrease in density of TBs over time in WT mice significant? How do the authors interpret those fits?

      5) Looking at the data in Figures 1E and 2E, one would expect more negative growth values in figs 5E and 6E, indicating a larger decrease in density. They are flat. Are these analyses well powered? Are the data in Figures 5E and 6E not representative?

    2. Reviewer #2:

      This manuscript asks how axons vs dendrites are lost, using live imaging of the cortex of rTg4510 tau transgenic mice. Overall, this manuscript is well-done and well-written, and confirms previous findings. However, there are a number of key controls missing from the experimental data (please see below). Statistical analyses are satisfactory (with some caveats, please see below).

      Figures 1+2 replicate previous findings also in rTg4510 (Crimins et al., 2012; Jackson et al., 2017; Kopeikina et al., 2013); Figures 3+4 (Ramsden et al., 2005; SantaCruz et al., 2005; Spires et al., 2006; Crimins et al., 2012; Kopeikina et al., 2013; Helboe et al., 2017; Jackson et al., 2017). The novelty here lies in the differing patterns of bouton and spine turnover shortly before axons and dendrites, respectively, are lost, which is a finding uniquely enabled by 2-photon imaging. Thus, findings in Fig. 5/6 should be highlighted and solidified. Further, the manuscript lacks mechanistic insight.

      It is not clear how the authors ensure that the perceived loss of spines/boutons/dendrites/axons is not due to bleaching or loss of the GFP signal. Please validate loss of spines/boutons and actual synapses using fixed tissue imaging or electron microscopy on a separate cohort of mice.

      Did the authors control for gliosis after the repeated imaging (very shortly after viral injection and cranial window implant on the same site)? Could it be that the repeated imaging itself on damaged tissue induces blebbing on the already more vulnerable spines in the tau mice? Iba1 and GFAP staining, with and without doxycycline administration, should be included in the supplemental material along with quantification of the stained area. Transgenic mice without manipulation (viral injection/cranial window/2P imaging) should also act as a control to ensure no gliosis is observed.

      rTg4510 transgene insertion: Gamache et al. recently showed that the integration sites of both the CaMKIIα-tTA and MAPT-P301L transgenes impact the expression of endogenous mouse genes. The disruption of the Fgf14 gene in particular contributes to the pathological phenotype of these mice, making it difficult to directly ascribe the phenotypes seen in the manuscript to MAPT-P301L transgene overexpression. Although this limitation is acknowledged in the discussion, the T2 mice employed in this paper (Gamache et al., 2019) would be suitable controls to better evaluate the contribution of tauP301L alone on the neuropathology and disease progression observed in the authors' experiments, at least in fixed synapse imaging.

    3. Reviewer #1:

      Studies in mouse models and humans show synapse loss and dysfunction that precede neurodegeneration, raising questions about timing and mechanisms. Using longitudinal in vivo 2-photon imaging, Jackson et al., investigate pre- and post-synaptic changes in rTg4510 mice, a widely used mouse model of tauopathy. Consistent with cross sectional studies, the authors observed a reduction in density of presynaptic axons and dendritic spines in layer 1 cortex that relate to degeneration of neurites and dendrites over time. Taking advantage of an inducible model to overexpress tau p301L, they show that reducing expression of tau by DOX early in disease progression, resulted in amelioration of synapse loss, also consistent with other studies. Interestingly, the authors observed a significant reduction of dendritic spines less than a week before dendrite degeneration. In contrast, they observed plasticity and turnover of presynaptic structures weeks before axonal degeneration, suggesting different mechanisms.

      Overall the results are interesting and largely consistent with previous findings. The new findings shown in Figures 5 and 6 address the timing of pre and postsynaptic loss and structural plasticity and reveal interesting differences; however, the data are highly variable and there are several issues that diminish enthusiasm as outlined below. Moreover, this study does not include new biological or mechanistic insight into the differences in pre- and post-synaptic changes from previous work in the field.

      The main weakness relates to the significance and relevance beyond this specific mouse model and brain region. I appreciate the strengths but also technical challenges of in vivo longitudinal imaging, including a small field of view. Thus, the rationale and choice of model and brain region, and validation of key findings is critical to support conclusions. In this case, the tau model, although used by others, has several caveats relevant to the investigation of synapse loss (see point 2 below) that weaken this study and its impact.

      1) Most of the work related to synapse loss and dysfunction in this model, and in other tau and amyloid models, has been carried out in hippocampus and other regions of cortex. Here the authors focused on layer 1 of (somatosensory) cortex and followed neurites of pyramidal cells labeled with AAV:GFP, an approach that does not enable one to image and track axons and dendrites from large numbers of neurons. They observed divergent dynamics in spines and presynaptic TBs of individual dendrites and axons. Given the small number of neurons sampled and the significant noise in their imaging data, these findings need more validation using other approaches. This is particularly important for the data and conclusions drawn from Figures 5 and 6 (see point 3).

      To estimate the overall effect of genotype the authors fitted Generalized Additive Mixed Models (GAMMs) to their data given the variability in the data within animals and genotype. It would be helpful to those less familiar with this approach to provide more comparisons of data using additional statistical tests and analyses, along with power analysis calculations.

      2) Major caveat with the inducible tau model rTg4510. While this inducible model has the advantage of controlling the timing of tau overexpression in neurons, a recent study by Gamache et al. (PMID: 31685653) demonstrated that there are issues with the transgene insertion site and that factors other than tau expression are actually what is driving the phenotype. Thus, differences in synaptic and behavioral phenotypes are based on the mouse line used and this needs to be carefully controlled. This was not addressed or discussed. See https://pubmed.ncbi.nlm.nih.gov/31171783/ and https://pubmed.ncbi.nlm.nih.gov/30659012/

      3) The interesting new findings presented in Figures 5 and 6 that address timing and differences in axonal and dendritic/spine plasticity and loss need to be validated with more neurons and animals. The sample size is small (i.e., n = 18 axons from 7 animals, and it is not clear how many neurons). Given the significant variability of the data even within animals, these experiments and data are considered preliminary.

      4) How does anesthesia influence these changes in structural plasticity observed? This was not addressed or discussed.

    4. Summary: This paper describes studies of a mouse model of tauopathy with relevance to Alzheimer's Disease. A powerful approach of longitudinal imaging of single synaptic structures over time allows insights into the time course of progressive neurodegenerative responses. The strengths of the report are the relevance of the question to human disease, the powerful imaging approach, and the indication that there may be a programmed sequence of structural changes that mediate tauopathy. On the other hand, there were multiple issues with the transgenic mouse model used, which would seriously limit interpretation of results without suitable controls. Further, the data set appeared to be quite noisy, and variable between animals, which may result in part from the nonspecific methods of expressing fluorescent markers, thus leading to uncertainty regarding the specific identity of pre- and post-synaptic elements.

    1. Reviewer #2:

      The paper titled "Brain Network Reconfiguration for Narrative and Argumentative Thought" sought to uncover the common neural processing sequences (time-locked activations and deactivations; inter-subject correlations and inter-subject functional connectivity) underlying narrative and argumentative thought. In particular, the study aimed to provide evidence that would help adjudicate between two current theories: the Content-Dependent Hypothesis (narrative ≠ argumentative) and the Content-Independent Hypothesis (narrative = argumentative). In order to assess these possibilities they tested participants in an fMRI scanner as they listened to validated narrative and argumentative texts. Each text condition was directly compared to resting state and scrambled versions of the texts. Across a range of interesting analyses that focus on how each participant's brain synchronized with other participants' brains throughout the same narrative and argumentative texts, they primarily found support for the content-dependent hypothesis with a few differences and commonalities across text conditions. Relative to the scrambled conditions, listening to narrative texts was more associated with default mode activity across participants and listening to argumentative texts only activated a common network of superior fronto-parietal control regions and language regions. Argumentative texts did not differ much from scrambled versions of the same text. These patterns reveal themselves in both ISC and ISFC data. Overall, I feel like this paper is really well written and is a novel approach to distinguishing the neural processes between similar, but different types of thought. At times the manuscript loses touch with its primary brain coordination metrics (ISC and ISFC), describing the findings more like a GLM or functional connectivity study.

      Comments:

      Introduction:

      1) The introduction is very clearly written and uses a wonderful variety of sentence structure. Well done!

      2) While the writing is beautiful, a few sentences are less easy to comprehend than others. For example, the use of "outstands" in line 36 is a bit difficult to parse on first read. Consider simplifying the language somewhat.

      3) There seems to be an opportunity to discuss this work and its findings in a broad context of narrative or argumentative self-generated internal thought (not based on listening to texts). For instance, I think there could be a few sentences tying this work to studies of autobiographical memory retrieval or mind wandering (for argumentation perhaps studies of the cognitive and neural processes behind complex decision making). This is captured to some extent in the introduction and discussion, but I think it could go further with citations beyond those just associated with listening to various types of text.

      4) Appreciate the thorough discussion of hypotheses and background.

      5) It is not necessary, but it might be interesting to show some basic functional connectivity analyses of the individual participant activations in supplemental analyses (no ISC or ISFC).

      Methods:

      1) Please clarify how the ISFC analysis can be directional in any way. Does unidirectional mean that you're just taking one value for each pairwise connection Cij?

      Results:

      1) To what extent is there a concern that participants would still try to stitch together the scrambled narratives even if they are less coherent? Was this even possible given the nature of the stimuli?

      2) In line 125 and throughout, the authors should consistently remind the reader that 'engagement' in this case means that there were consistent and correlated increases in the BOLD response across participants. This differs in some ways from task engagement in event-related GLM studies.

      3) The language throughout should reflect consistent involvement across participants at particular time points in each of the narratives vs the argumentative.

      4) It seems like argumentative is more similar to the scrambled in many ways. Might it be that argumentative texts are just less coherent and structured than narrative texts?

      5) It seems clear that the neural processing of argumentative texts (64 distinct edges) was very different from that of the narrative texts (2348 distinct edges), but that the current contrasts did not clearly and consistently distinguish argumentative thought from the scrambled argument conditions. A discussion of the analyses that might be necessary to better elucidate the dynamics of processing for argumentative thought would be helpful.

      Discussion:

      1) Were there any neural differences between the narrative vs argument scrambled-texts? This might reveal any differences in the processing of the scrambled texts for each condition and might help shine light on features of the scrambled argument condition that contributed to the overall lack of distinction relative to the narrative vs scrambled narrative conditions.

      2) Throughout, the ISC and ISFC findings are conflated with univariate or GLM findings from prior studies. Please compare and contrast how ISC and ISFC findings might relate to univariate or GLM findings early in the discussion.

      3) Related to point 3 in the introduction, please also cite autobiographical memory retrieval studies that show the frontoparietal control system working as information is iteratively accumulated and updated over long temporal windows (St. Jacques et al., 2011; Inman et al., 2018; Daselaar et al., 2008).

      4) Please reconsider how the ISC findings are discussed as 'activation'. While the BOLD activity of these areas is certainly coordinated across participants at similar points in the text, I feel like the term activation fits best with studies that convolve the brain activity with an HRF. In particular, from what I understand of ISC, a common decrease in BOLD activity across participants at the same time in a read text would also lead to activity or 'activation' of that area in an ISC analysis. This seems counterintuitive. The 2nd paragraph of the discussion describes ISC and ISFC well in terms of what they show across a sample (synchronization of fluctuations in BOLD activity across participants for the same stimuli). "Activity" may capture this, but please consider some more nuanced ways to refer to these ISC and ISFC findings.
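      The point in comment 4 can be illustrated with a minimal simulation. The sketch below is not the authors' pipeline: it assumes a leave-one-out ISC (each subject's time series correlated with the mean of the others), and the function name `leave_one_out_isc`, the signal shape, and the noise level are all hypothetical choices made for illustration. It shows that a dip in signal shared across subjects produces high ISC even though nothing is "activated".

```python
import numpy as np

def leave_one_out_isc(data):
    """Leave-one-out ISC for one region.

    data: array of shape (n_subjects, n_timepoints).
    Returns one correlation per subject: that subject's time series
    against the average time series of all other subjects.
    """
    data = np.asarray(data, dtype=float)
    n_subjects = data.shape[0]
    iscs = np.empty(n_subjects)
    for s in range(n_subjects):
        others_mean = np.delete(data, s, axis=0).mean(axis=0)
        iscs[s] = np.corrcoef(data[s], others_mean)[0, 1]
    return iscs

rng = np.random.default_rng(0)
n_subjects, n_timepoints = 10, 200

# Hypothetical shared response: a stimulus-locked DECREASE in signal.
shared_dip = np.zeros(n_timepoints)
shared_dip[80:120] = -1.0

# Each subject = shared dip + independent noise (assumed noise level).
dip_data = shared_dip + 0.5 * rng.standard_normal((n_subjects, n_timepoints))
# Control: independent noise only, no shared component.
noise_data = 0.5 * rng.standard_normal((n_subjects, n_timepoints))

isc_dip = leave_one_out_isc(dip_data).mean()
isc_noise = leave_one_out_isc(noise_data).mean()
print(f"mean ISC, shared decrease: {isc_dip:.2f}")
print(f"mean ISC, noise only:      {isc_noise:.2f}")
```

      Because correlation is insensitive to the sign of the shared fluctuation, a term such as "synchronization" describes what this measure captures more accurately than "activation".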

      Figures:

      1) Please double check the box plots in figure 1a for Scene Construction. Another method of displaying these Likert rating data might be helpful. While I appreciate the attempt to display the individual data points, the simple main points get somewhat obscured by all of the information in the graph.

      2) Overall, I appreciate the attention to detail in all of the figures and the completeness of the data visualization with several useful supplemental figures.

    2. Reviewer #1:

      Xu and colleagues compared the intersubject correlation (ISC) and intersubject functional connectivity (ISFC) of participants listening to narrative and argumentative texts while undergoing fMRI. Replicating earlier findings, they show that ISC in the DMN was greater when participants listened to an intact narrative than when they listened to a sentence-scrambled version of the same narrative. Listening to a sentence-scrambled argument elicited ISC in language and control regions of the brain, though interestingly, there was no region in the brain where ISC was greater when participants listened to an intact version of the argument. Instead, there was greater ISFC between the IPS and language areas of the brain when participants listened to the intact argument than when they listened to the scrambled argument. The authors interpret their results as suggesting that listening to the intact argument did not recruit additional brain systems, but instead promoted the cooperation between regions that were already involved in processing the argument.

      Most prior work using "naturalistic stimuli" has examined the neural responses to narratives. This manuscript extends this work in an important way by examining how the brain responds to arguments, which comprise a non-trivial proportion of the linguistic content people are exposed to on a daily basis. The ISFC results (Fig. 7) are particularly noteworthy and novel. My main concerns have to do with the possibility that ISC for the scrambled argument seems to be stronger and more extensive than that for the intact argument, and how this might affect the authors' interpretation of their results. Below are some suggestions and comments which I think the paper could benefit from considering further:

      1) I think it would be helpful to run the Scrambled Argument > Intact Argument ISC contrast. Visual inspection of Figure 2 suggests that ISC for the scrambled argument might be stronger than that for the intact argument, especially in control regions. If this is truly the case, I think the authors should discuss what this might imply about what is happening during the scrambled condition and if this affects thinking of the scrambled condition as a control for low-level linguistic features. In particular, the 2.97 out of 5 comprehensibility rating of the scrambled arguments suggests that participants might have understood the scrambled arguments. If participants are actively trying to make sense of the scrambled argument text, it seems like this could then drive observed differences in ISFC between the intact and scrambled arguments as well (e.g., decreased connectivity between control and language regions when trying to make sense of scrambled text, rather than increased connectivity between control and language regions when processing an intact argument).

      2) More broadly, I think the authors need to make sure their effects aren't driven by the scrambled conditions. For example, for Figure 2 - figure supplement 2, the (Intact Narrative - Scrambled Narrative) > (Intact Argument - Scrambled Argument) contrast can be driven by high ISC in the Scrambled Argument condition, which would suggest a different interpretation of the results. My suggestion would be to run the contrast as (Intact Narrative - Scrambled Narrative) > max((Intact Argument - Scrambled Argument),0) to make sure that the contrast isn't driven by a negative value on the right hand side of the inequality.

      3) Point 2 also applies to Figures 6 and 7. Relatedly, the rightmost panel of Figure 6C suggests that the analysis is indeed capturing some edges where the SES of the Scrambled Argument is greater than that of the Intact Argument.

      4) How well do the vertices identified in Figure 7D overlap with the Intact Argument > Resting map? Given the authors' interpretation that the ISFC results suggest cooperation between areas involved in processing the intact stimulus, I think this should be properly assessed.

      5) Both ISC and ISFC capture only signal that is shared across participants. Most narratives are crafted such that all listeners have a similar interpretation. This is unlike arguments, where different listeners might agree with an argument to a different extent. If listeners had differing interpretations of the argument, ISC/ISFC would miss brain activity/connectivity involved in processing an argument. I think this possibility should be considered and discussed, especially given the null DMN finding for the argumentative texts.

      6) For the t-tests on the behavioral ratings, it looks like the authors collapsed over the two texts within a category. This doesn't seem right, given that the ratings for each text are dependent. A mixed model approach would be more appropriate. I doubt this will change the results, but I think it would be good to follow best practices when possible.
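      The clamped contrast suggested in comment 2 can be sketched on toy numbers. The region names and ISC values below are invented for illustration and are not taken from the manuscript; the sketch only shows the arithmetic of the contrast, not the statistical inference that would sit on top of it.

```python
def raw_interaction(intact_narr, scr_narr, intact_arg, scr_arg):
    """(Intact Narrative - Scrambled Narrative) - (Intact Argument - Scrambled Argument)."""
    return (intact_narr - scr_narr) - (intact_arg - scr_arg)

def clamped_interaction(intact_narr, scr_narr, intact_arg, scr_arg):
    """Same contrast, but the argument term is floored at zero so that
    Scrambled Argument > Intact Argument cannot inflate the result."""
    return (intact_narr - scr_narr) - max(intact_arg - scr_arg, 0.0)

# Hypothetical ISC values per region:
# (intact narrative, scrambled narrative, intact argument, scrambled argument)
regions = {
    "region_a": (0.40, 0.10, 0.20, 0.20),  # genuine narrative effect
    "region_b": (0.20, 0.20, 0.10, 0.40),  # no narrative effect, scrambled argument high
}

raw = {name: raw_interaction(*vals) for name, vals in regions.items()}
clamped = {name: clamped_interaction(*vals) for name, vals in regions.items()}

for name in regions:
    print(f"{name}: raw = {raw[name]:.2f}, clamped = {clamped[name]:.2f}")
```

      In this toy case both regions give the same raw contrast, but only the region with a genuine narrative effect survives the clamped version, which is the distinction the comment asks the authors to check.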

    3. Summary: The reviewers thought this was a nicely written paper, and were interested in the idea of extending intersubject correlation (ISC) and intersubject functional connectivity (ISFC) work on narratives to arguments. One major concern was that effects reported here may be driven in part by the scrambled conditions. Specifically, the scrambled argument seems to have resulted in stronger and more widespread ISC than the intact argument: this would call into question assumptions about the scrambled version being a control condition. Another, related concern raised is that perhaps argumentative texts are very different from narrative texts: perhaps argumentative texts are less structured, or less interesting (?) and this is why the intact and scrambled versions are so similar. Together with other issues, relating to the interpretation of the findings, it was felt that while of interest, the study's major conclusions could not be justified without additional experiments.

    1. Reviewer #3:

      This paper describes a novel technique for measuring several distinct subcortical components, using naturalistic speech instead of the more typical clicks and tone-pips. The benefits of using extended speech (e.g., stories) include simultaneous measurement of middle- and late-latency components automatically.

      The technique is of great interest with many potential use cases. The manipulation of the acoustics is reasonable (replacing voiced speech with click trains of the same pitch), does not degrade intelligibility, and reduces sound quality only in minor ways. The manipulation is also described clearly for others to implement.

      The authors also investigate several variations and generalizations of the technique, and their tradeoffs, including responses from specific tonotopic bands and ear-specific responses.

      The reliability of the ABR wave I and V responses is remarkable (especially given the previous results of the senior author using unprocessed speech); wave III is less so. Being able to record P0, Na, Pa, Nb, P1, N1, and P2 simultaneously shows promise for future clinical applications (and basic science). The practical importance of using a lower fundamental frequency (i.e., typical of male speakers) is clearly established.

      The technique has some overlap with the Chirp spEECh of Miller et al., but with enough tangible additional benefits that it should be considered novel.

      The writing is very clear.

      Major Concerns:

      "wave III was clearly identifiable in 16 of the 22 subjects": Figure 1 indicates that the word "clearly" may be somewhat generous. It would be worthwhile to discuss wave III and its identifiability in more detail (perhaps its identifiability/non-universality could be compared with that of another less prominent peak in traditionally obtained ABRs?).

    2. Reviewer #2:

      General assessment:

      This manuscript presents an improved methodology for extracting distinct early auditory evoked potentials from the EEG response to continuous natural speech, including a novel method for obtaining simultaneous responses from different frequency bands. It is a clever approach and the first results are promising, but more rigorous evaluation of the method and critical evaluation of the results is needed. It could provide a valuable tool for investigating the effect of corticofugal modulation of the early auditory pathway during speech processing. However, the claims made of its use investigating speech encoding or clinical diagnosis seem too speculative and unspecific.

      General comments:

      1) Despite repeated claims, I don't think a convincing case is made here that this method can provide insight on how speech is processed in the early auditory pathway. The response is essentially a click-like response elicited by the glottal pulses in the stimulus; it averages out information related to dynamic variations in envelope and pitch that are essential for speech perception; at the same time, it is highly sensitive to sound features that do not affect speech perception. What reason is there to assume that these responses contain information that is specific or informative about speech processing?

      2) Similarly, the claim that the methodology can be used as a clinical application is not convincing. It is not made clear what pathology these responses can detect that current ABR methods cannot, or why. As explained in the Discussion, the response size is inherently smaller than standard ABRs because of the higher repetition rate of the glottal pulses, and the response may depend on more complex neural interactions that would be difficult to quantify. Do these features not make them less suitable for clinical use?

      3) It needs to be rigorously confirmed that the earliest responses are not contaminated or influenced by responses from later sources. There seems to be some coherent activity or offset in the baseline (pre 0 ms), in particular with the lower filter cut off. One way to test this might be to simulate a simple response by filtering and time shifting the stimulus waveforms, adding these up plus realistic noise, and applying the deconvolution to see whether the input is accurately reproduced. It might be useful to see how the response latencies and amplitudes correlate to those of conventional click responses, and how they depend on stimulus level.

      4) The multiband responses show a variation of latency with frequency band that indicates a degree of cochlear frequency specificity. The latency functions reported here look similar to those obtained by Don et al 1993 for derived band click responses, but the actual numbers for the frequency dependent delays (as estimated by eye from figures 4, 6 and 7) seem shorter than those reported for wave V at 65 dB SPL (Don et al 1993 table II). The latency function would be better fitted to an exponential, as in Strelcyk et al 2009 (equation 1), than a quadratic function; the fitted exponent could be directly compared to their reported value.

      5) The fact that differences between narrators lead to changes in the ABR response is to be expected, and was already reported in Maddox and Lee 2018. I don't understand why it needs to be examined and discussed at such length here. The space devoted to discussing the recording time also seems very long. Neither the abstract nor the introduction refers to these topics, and they seem to be side-issues that could be summarised and discussed much more briefly.

      L142-144. Is it possible to apply the pulse train regressor to the unaltered speech response? If so, does this improve the response, i.e. make it look more similar to the peaky speech response? It would be interesting to know whether improvement is due to the changed regressor or the stimulus modification or both.

      L208 -211. What causes the difference between the effect of high-pass filtering and subtracting the common response? If they serve the same purpose, but have different results, this raises the question which is more appropriate.

      L244. This seems a misinterpretation. The similarity between broadband and summated multiband responses indicates that the band filtered components in the multiband stimulus elicited responses that add linearly in the broadband response. It does not imply that the responses to the different bands originate from non-overlapping cochlear frequency regions.

      L339-342. Is this measure of SNR appropriate, when the baseline is artificially constructed by deconvolution and filtering? Perhaps noise level could be assessed by applying the deconvolution to a silent recording instead? It might also be useful to have a measure of the replicability of the response.
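      The simulation check proposed in general comment 3 could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' analysis: the pulse train is random rather than derived from real glottal pulses, the "true" response `h_true` is an invented damped oscillation, the convolution is circular, and the deconvolution divides the epoch-averaged cross-spectrum by the epoch-averaged regressor power spectrum. The question it answers is whether a known response is recovered, and whether the pre-zero baseline stays at noise level.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 10_000            # sampling rate in Hz (assumed)
n = fs                 # samples per 1 s epoch
n_epochs = 20
pulses_per_epoch = 100
noise_sd = 2.0         # noise much larger than the response (assumed)

# Regressor: one impulse per simulated glottal pulse, random timing.
x = np.zeros((n_epochs, n))
for e in range(n_epochs):
    x[e, rng.choice(n, size=pulses_per_epoch, replace=False)] = 1.0

# Invented ground-truth response: 15 ms damped oscillation.
t = np.arange(int(0.015 * fs)) / fs
h_true = np.exp(-t / 0.003) * np.sin(2 * np.pi * 500 * t)

# Simulated recording: regressor (circularly) convolved with h_true, plus noise.
X = np.fft.rfft(x, axis=1)
H = np.fft.rfft(h_true, n)
y = np.fft.irfft(X * H, n, axis=1) + noise_sd * rng.standard_normal((n_epochs, n))

# Deconvolution: averaged cross-spectrum over averaged regressor power.
Y = np.fft.rfft(y, axis=1)
numerator = (np.conj(X) * Y).mean(axis=0)
denominator = (np.abs(X) ** 2).mean(axis=0)
w = np.fft.irfft(numerator / denominator, n)

h_est = w[: len(h_true)]   # lags 0 to 15 ms
baseline = w[-100:]        # lags -10 to 0 ms (wrapped to the end of the array)

recovery_r = np.corrcoef(h_true, h_est)[0, 1]
print(f"correlation between true and recovered response: {recovery_r:.3f}")
print(f"largest absolute baseline value: {np.abs(baseline).max():.3f}")
```

      Adding a second, later and slower component to `h_true` and checking whether it leaks into the early lags or the baseline would address the contamination question more directly.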

    3. Reviewer #1:

      Major issues:

      I have two major comments on the work.

      1) The authors motivate the work from the use of naturalistic speech, and the application of the developed method to investigate, for instance, speech-in-noise deficits. But they do not discuss how comprehensible the peaky speech in fact is. I would therefore like to see behavioural experiments that quantitatively compare speech-in-noise comprehension, for example SRTs, for the unaltered speech and the peaky speech. Without such a quantification, it is impossible to fully judge the usefulness of the reported method for further research and clinical applications.

      2) The neural responses to unaltered speech and to peaky speech are analysed by two different methods. For unaltered speech, the authors use the half-wave rectified waveform as the regressor. For peaky speech, however, the regressor is a series of spikes that are located at the timings of the glottal pulses. Due to this rather different analysis, it is impossible to know to which degree the differences in the neural responses to the two types of speech that the authors report are due to the different speech types, or due to the different analysis techniques. The authors should therefore use the same analysis technique for both types of speech. It might be most sensible to analyse the unaltered speech through a regressor with spikes at the glottal pulses as well. In addition, it would be good to see a comparison, say of an SNR, when the peaky speech is analysed through the half-wave rectified waveform and through the series of spikes. This would also further motivate the usage of the regressor with the series of spikes.

    4. Summary: This manuscript describes a type of alteration to speech to make it more peaky, with the goal of inducing stronger responses in the auditory brainstem. Recent work has employed naturalistic speech to investigate subcortical mechanisms of speech processing. However, previous methods were ill equipped to tease apart the neural responses in different parts of the brainstem. The authors show that their speech manipulation improves this: the peaky speech that they develop makes it possible to segregate different waves of the brainstem response. This development may allow further and more refined investigations of the contribution of different parts of the brainstem to speech processing, as well as to hearing deficits.

    1. Reviewer #3:

      In this study, Higgs et al. apply a systematic and hierarchical approach to testing the enrichment of imprinted gene expression in (mostly) adult tissues, culminating in a survey at the single-cell and neuronal sub-type level, which the authors achieve by exploitation of now extensive single-cell gene expression datasets. Arguably, there are no great surprises in this analysis: it reinforces previous studies showing/suggesting an enrichment for imprinted genes in the brain, with functions in feeding, parental behaviour, etc. But, it is conducted in a rigorous manner and makes highly informed inferences about the expression domains and neuronal subtypes identified. This level of detail is beyond any previous survey, therefore, the study will provide an excellent resource (although the fine details of the specific neuronal sub-populations in which imprinted gene expression is enriched are likely to be of interest to specialists only). Having, at all levels of their analysis, access to two or more single-cell datasets provides an important level of confidence in the analysis and findings, although there are some discrepancies between the enrichments found in comparing any two datasets. Moreover, the findings will give more prominence to neuronal domains that have received less emphasis in functional studies, for example, the enrichment of imprinted genes within the suprachiasmatic nucleus implicating roles in circadian processes.

      Imprinted expression covers a range of allelic biases and we are still some way from really understanding what an allelic skew means in comparison to absolute monoallelic expression: biased expression in all cells in a tissue or a mosaic of mono- and biallelically expressing cells. So finding an imprinted gene expressed in a given cell type without knowing whether its expression is actually imprinted in that cell type is a problem. And certainly a significant proportion of more recently discovered brain-expressed imprinted genes seem to fall into a category of paternal bias rather than full monoallelic expression. The authors do acknowledge this caveat in their discussion (lines 491-499). Is it possible to stratify the analysis according to degree of allelic bias? Ultimately, scRNA-seq using hybrid tissues will be important to resolve such issues. In this context, the authors will need to discuss findings in the very recently published paper from Laukoter et al. (Neuron, 2020), although that study focussed on cortical neurons in which Higgs and colleagues do not find imprinted gene enrichments.

      Another issue that could cloud the analysis, and particularly inference of how PEGs and MEGs could be involved in separate functions, is the issue of complex transcription units. The authors allude to Grb10 in which there are maternally and paternally expressed isoforms largely arising from separate promoters, which also applies to Gnas. There are also cases in which there are imprinted and non-imprinted isoforms. A problem with short-read RNA-seq libraries will be that much of the expression data for a given transcription unit cannot discriminate such differentially imprinted isoforms, as most of the reads mapping to the locus will map to shared exons. This caveat probably also needs to be mentioned in the text.

      The authors give some prominence to Peg3 as an example of the role of imprinted genes in maternal behaviours (e.g., line 508) as reported in the original knock-out (Li et al. 1999). However, this particular Peg3-knock-out associated phenotype has been questioned by a more recent Peg3 knock-out in which it was not observed (Denizot et al. 2016 PMID: 27187722), suggesting that the initial phenotype could be a consequence of the nature of the targeting insertion rather than Peg3 ablation.

      While a general picture that emerges is of imprinted genes acting in concert to influence shared functions (e.g., feeding), the authors also point out cases in which a single imprinted gene contributes to a neuronal function (Ube3a in the case of hippocampal-related learning and memory; line 511-512) but for which they did not find enrichment of imprinted genes in the relevant neuronal population. This poses some problems, but it could indicate that that particular function of the gene is not the function for which imprinting was selected if the gene is active in other domains, but is rather 'tolerated'. Of course, many imprinted genes will have multiple physiological functions, so the convergence on specific functions probably provides the best (but by no means perfect) basis for discerning the evolutionary imperatives.

    2. Reviewer #2:

      General assessment of the work:

      In this manuscript Higgs and colleagues test the hypothesis that imprinted gene expression is enriched in the brain, and that identifying specific brain regions of enrichment will aid in uncovering physiological roles for imprinted pathways. The authors claim that the hypothesis that imprinted genes are enriched in key brain functions has never been formally/systematically tested. Moreover, they suggest that their analysis represents an unbiased systems-biology approach to this question.

      In our assessment the authors fail to meet these criteria on several major grounds. Firstly, there are multiple instances of methodological bias in their analysis (detailed below). Secondly, the authors claim that their findings are validated by similar test results in 'matched' datasets. However, throughout the authors appear to have avoided identifying individual imprinted genes that are enriched in their analysis (they can be found in a minimally annotated supplementary file). Due to this it is impossible to judge to what extent there is agreement between matched datasets and between levels of the analysis. For these reasons the analysis appears arbitrary rather than systematic, and lacks rigor. Consequently we do not feel that the work of Higgs and colleagues goes beyond previous systematic reports of imprinting in the brain (for example, Gregg, 2010, Babak 2015, in ms reference list).

      Numbered summary of substantive concerns:

      1) Imprinted genes that were identified as enriched are not clearly named or listed

      -The authors use two or more independent datasets at each level to "strengthen any conclusions with convergent findings" (p4 ln96). By this the authors mean that both datasets pass the F-test criteria for enrichment. However, they should show which imprinted genes are allocated to each region, and clearly present the overlap. Are the same genes enriched in the two datasets? Similarly, are the genes that are enriched in, e.g., the hypothalamus the same genes that are enriched in the ARC?

      -The authors discuss how their main aim of identifying expression "hotspots" helps inform imprinted gene function in the brain. An analysis of the actual genes is therefore crucial (and the assumed next step after identifying the location of enrichment).

      -The authors allocate parental expression enrichment to the brain regions but do not state why they do this analysis.

      -Are imprinted genes in the same cluster co-expressed, as might be expected?

      2) Selection of datasets needs to be more clearly explained (i.e. selection criteria)

      -Their reason for selection "to create a hierarchical sequence of data analysis" - suggests that there could be potential bias in their selection based on previous knowledge of IG action in the brain.

      -Selection criteria would explain the level of similarity between datasets, which is important before datasets are systematically analyzed

      3) The study is more like a set of independent analyses of individual datasets (rather than one systematic/meta-analysis)

      -Each dataset was individually processed (filtered and normalized) following the original authors' procedure, rather than processing all the raw datasets the same way.

      -"A consistent filter, to keep all genes expressed in at least 20 cells or (when possible) with at least 50 reads" (p7 ln115), our emphasis - which filter was used? This should be consistent throughout.

      -Two different cut-offs were used to identify genes with upregulated expression, making the identification of enriched genes arbitrary (p7 para2).

      -Some datasets contain tissues from various time-points and sexes, but there is no clarification if all the data was included in the analysis. (e.g. the Ximerakis et al. dataset was originally an analysis of young and old mouse brains). This is particularly difficult to interpret when embryonic data is likened to adult data, which is in no way equivalent.

      -The cell-type and tissue-type identities were supplied by the dataset authors, based on their original clustering methods. This can be variable, particularly at the sub-population level.

      4) These differences make it hard to draw connections between the findings from each dataset

      -In some levels, the authors compare two datasets for a "convergence" of IG over-expression. Yet the above differences between datasets and analyses makes them difficult to compare. (e.g. the comparison of hypothalamic neuronal subtypes with enriched IG expression between two datasets in level 3.a.2 is quite speculative).

      -More generally, the authors draw connections between their findings from each level, but the lack of consistency between analyses may not justify these connections.

      5) Hence, the study does not lead to a definitive set of findings that is new to the field

      -The above reasons suggest that this is not an objective set of data about IG expression in the brain, but rather evidence of certain hotspots for targeted analysis. However, these hotspots were already known.

      -A systematic analysis of raw data using fewer datasets, that then includes and discusses the imprinted genes, may lead to novel findings and a paper with a clearer narrative.

    3. Reviewer #1:

      The authors studied the over-representation of imprinted genes in the mouse brain by using fifteen single-cell RNA sequencing datasets. The analysis was performed at three levels 1) whole-tissue level, 2) brain-region level, and 3) region-specific cell subpopulation level. Based on the over-representation and gene-enrichment analyses, they interpreted hypothalamic neuroendocrine populations and monoaminergic hindbrain neurons as specific hotspots of imprinted gene expression in the brain.

      Objective:

      Though the study is potentially interesting, the expression of imprinted genes in the brain and hypothalamus is already known (Davies W et al., 2005, Shing O et al. 2019, Gregg et al, 2010 including many other studies cited in the paper). However, the authors put forth two objectives, the first being whether imprinted gene expression is actually enriched in the brain compared to other adult tissues, where they did find brain as one of the tissues with over-represented imprinted genes. The second is whether the imprinted genes are enriched in specific brain regions. The study objectives cannot qualify as completely novel, as the study validates most of what is already known using scRNA-seq datasets.

      Methods and Results

      Pros:

      -15 scRNA-seq datasets were analysed independently and they were processed as in the original publication.

      -Two enrichment methods used to find tissue-specific enrichment of imprinted genes and appropriate statistics applied wherever necessary.

      Concerns:

      -It is not clear how the over-representation using Fisher's exact test was calculated. It would be appropriate to include the name of the software or R package, if used, in the basic workflow section of Materials and methods.

      -Why did authors particularly use Liger in R for GSEA analysis?

      -GSEA plots generated using Liger and represented for each analysis in the paper by itself does not look informative. For eg. in figure 4 and other GSEA plots in the paper- i) Which 'score' does the Y-axis represent? Include x-axis label and mention corrected GSEA q value either in the legend or the figure. ii) Was the normalized enrichment score (NES) calculated? What genes in the cluster represent maximum enrichment? A heat map of the imprinted genes contributing to the cell cluster will add more clarity to the GSEA plots.

      -Apart from the tissue-specific enrichment of gene sets, a functional GO/pathways enrichment of the group of imprinted genes will strengthen the connection of these genes with feeding, parental behavior and sleep.

      -Are these imprinted genes coexpressed across the analyzed brain structures, given that the authors repeatedly stress the functioning of imprinted genes as a group?

      -A basic workflow schematic might be necessary for an easy and quick understanding of the methods.
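      The first concern above asks how the over-representation test was computed. As a point of reference only, a one-sided Fisher's exact test for over-representation can be sketched as a hypergeometric tail probability. This is a minimal illustration, not the authors' pipeline: the function name and the counts in the usage line are hypothetical.

```python
from math import comb

def overrepresentation_p(k, K, n, N):
    """One-sided Fisher's exact p-value, P(X >= k).

    k: imprinted genes among the genes up-regulated in a tissue
    K: imprinted genes among all genes tested
    n: genes up-regulated in the tissue
    N: all genes tested
    (These labels are illustrative assumptions, not the paper's notation.)
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the hypergeometric probabilities of outcomes at least as extreme as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical toy counts: all 3 imprinted genes fall in a 3-gene up-regulated set.
p = overrepresentation_p(k=3, K=3, n=3, N=6)
```

      In practice the result would still need correction for multiple testing across tissues or cell clusters, which this sketch does not attempt.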

      Overall, the study gives some insight into the brain regions, and particularly the cell clusters, in which imprinted genes could be enriched. However, the study is preliminary in nature and largely validates previous studies. The authors have already highlighted some of the limitations of the study in the discussion.

    4. Summary: The reviewers appreciated the effort to merge most of the available datasets to make a precise survey of the sites of imprinted gene expression and the great resource it could bring to the community. However, the reviewers also felt that the study suffered from methodological bias, was preliminary (no allelic information in particular), and that the conclusions did not go beyond previous reports. The general lack of citation of the name of imprinted genes made it difficult to judge whether conclusions were consistent among the different datasets. Highlighting specific imprinted genes would bring a clearer focus and narrative.

    1. Reviewer #2:

      The authors describe the development and use of a D-serine sensor based on a periplasmic ligand-binding protein (DalS) from Salmonella enterica in conjunction with a FRET readout between enhanced cyan fluorescent protein and Venus fluorescent protein. They rationally identify point mutations in the binding pocket that make the binding protein somewhat more selective for D-serine over glycine and D-alanine. Ligand docking into the binding site, as well as algorithms for increasing stability, identified further mutants with higher thermostability and higher affinity for D-serine. The combined computational efforts led to a sensor with higher affinity for D-serine (Kd = ~7 µM), which also showed affinity for the native ligand D-alanine (Kd = ~13 µM) and for glycine (Kd = ~40 µM). Molecular simulations were then used to explain how remote mutations identified in the thermostability screen could lead to the observed alteration of ligand affinity. Finally, D-SerFS was tested with 2P imaging in hippocampal slices and in anesthetized mice, using biotin-streptavidin to anchor exogenously applied purified sensor protein to the brain tissue and pipetting on saturating concentrations of D-serine ligand.

      Although presented as the development of a sensor for biology, this work primarily focuses on the application of existing protein engineering techniques to alter the ligand affinity and specificity of a ligand-binding protein domain. The authors are somewhat successful in improving specificity for the desired ligand, but much context is lacking. For any such engineering effort, the end goals should be laid out as explicitly as possible. What sorts of biological signals do they desire to measure? On what length scale? On what time scale? What is known about the concentrations of the analyte and potential competing factors in the tissue? Since the authors do not demonstrate the imaging of any physiological signals with their sensor and do not discuss in detail the nature of the signals they aim to see, the reader is unable to evaluate what effect (if any) all of their protein engineering work had on their progress toward the goal of imaging D-serine signals in tissue.
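      The question about competing factors can be made concrete with a textbook single-site competitive binding model. The sketch below assumes equilibrium binding at one site with free ligand in excess over sensor, and uses the approximate Kd values quoted above. The ligand concentrations are hypothetical placeholders, not measured tissue values.

```python
def fractional_occupancy(conc_uM, kd_uM):
    """Fraction of sensor bound by each ligand under single-site competition.

    occupancy_i = ([L_i]/Kd_i) / (1 + sum_j [L_j]/Kd_j)
    Both arguments are dicts keyed by ligand name, in micromolar.
    """
    terms = {name: conc_uM[name] / kd_uM[name] for name in conc_uM}
    denom = 1.0 + sum(terms.values())
    return {name: t / denom for name, t in terms.items()}

# Approximate Kd values taken from the review text above.
kd = {"D-serine": 7.0, "D-alanine": 13.0, "glycine": 40.0}
# Hypothetical concentrations chosen only for illustration.
conc = {"D-serine": 10.0, "D-alanine": 10.0, "glycine": 10.0}
occupancy = fractional_occupancy(conc, kd)
```

      With affinities this close together, the share of the signal attributable to D-serine depends strongly on the relative ligand concentrations, which is why the reviewer asks what is known about them in tissue.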

      As a paper describing a combination of protein engineering approaches to alter the ligand affinity and specificity of one protein, it is a relatively complete work. In its current form trying to present a new fluorescent biosensor for imaging biology it is strongly lacking. I would suggest the authors rework the story to exclusively focus on the protein engineering or continue to work on the sensor/imaging/etc until they are able to use it to image some biology.

      Additional Major Points:

      1) There is no discussion of why the authors chose to use non-specific chemical labeling of the tissue with NHS-biotin to anchor their sensor vs. genetic techniques to get cell-type specific expression and localization. There is no high-resolution imaging demonstrating that the sensor is localized where they intended.

      2) Why does the fluorescence of both the CFP and the YFP decrease upon addition of ligand (see e.g. Supplementary Figure 2)? Were these samples at the same concentration? Is this really a FRET sensor or more of an intensiometric sensor? Is this also true with 2P excitation? How does the Venus fluorescence change when Venus is excited directly? Perhaps fluorescence lifetime measurements could help inform what is happening.

      3) How reproducible are the spectral differences between LSQED and LSQED-T197Y? Only one trace for each is shown in Supplementary Figure 2 and the differences are very small, but the authors use these data to draw conclusions about the protein open-closed equilibrium.

      4) The first three mutations described are arrived upon by aligning DalS (which is more specific for D-Ala) with the NMDA receptor (which binds D-Ser). The authors then mutate two of the ligand pocket positions of DalS to the same amino acid found in NMDAR, but mutate the third position to glutamine instead of valine. I really can't understand why they don't even test Y148V if their goal is a sensor that hopefully detects D-Ser similar to the native NMDAR. I'm sure most readers will have the same confusion.

    2. Reviewer #1:

      The manuscript "A computationally designed fluorescent biosensor for D-serine" by Vongsouthi et al. reports the engineering of a fluorescent biosensor for D-serine using the D-alanine-specific solute-binding protein from Salmonella enterica (DalS) as a template. The authors engineer a DalS construct that has the enhanced cyan fluorescent protein (ECFP) and the Venus fluorescent protein (Venus) as terminal fusions, which serve as donor and acceptor fluorophores in Förster resonance energy transfer (FRET) experiments. The reporters should monitor a conformational change induced by solute binding through a change of the FRET signal. The authors combine homology-guided rational protein engineering, in-silico ligand docking and computationally guided, stabilizing mutagenesis to transform DalS into a D-serine-specific biosensor through iterative mutagenesis experiments. Functionality and solute affinity of the modified DalS are probed using FRET assays. Vongsouthi et al. assess the applicability of the final D-serine-selective biosensor (D-SerFS) in situ and in vivo using fluorescence microscopy.

      Ionotropic glutamate receptors are ligand-gated ion channels that are importantly involved in brain development, learning, memory and disease. D-serine is a co-agonist of ionotropic glutamate receptors of the NMDA subtype. The modulation of NMDA signalling in the central nervous system through D-serine is hardly understood. Optical biosensors that can detect D-serine are lacking and the development of such sensors, as proposed in the present study, is an important target in biomedical research.

      The manuscript is well written and the data are clearly presented and discussed. The authors appear to have succeeded in the development of a D-serine-selective fluorescent biosensor. But some questions arose concerning experimental design. Moreover, not all conclusions are fully supported by the data presented. I have the following comments.

      1) In the homology-guided design two residues in the binding site were mutated to the ones of the D-serine specific homologue NR1 (i.e. F117L and A147S), which led to a significant increase of affinity to D-serine, as desired. The third residue, however, was mutated to glutamine (Y148Q) instead of the homologous valine (V), which resulted in a substantial loss of affinity to D-serine (Table 1). This "bad" mutation was carried through in consecutive optimization steps. Did the authors also try the homologous Y148V mutation? On page 5 the authors argue that Q instead of V would increase the size of the side chain pocket. But the opposite is true: the side chain of Q is bulkier than that of V, which may explain the dramatic loss of affinity to D-serine. Mutation Y148V may be beneficial.

      2) Stabilities of constructs were estimated from melting temperatures (Tm) measured using thermal denaturation probed using the FRET signal of ECFP/Venus fusions. I am not sure if this methodology is appropriate to determine thermal stabilities of DalS and mutants thereof. Thermal unfolding of the fluorescence labels ECFP and Venus and their intrinsic, supposedly strongly temperature-dependent fluorescence emission intensities will interfere. A deconvolution of signals will be difficult. It would be helpful to see raw data from these measurements. All stabilities are reported in terms of deltaTm. What is the absolute Tm of the reference protein DalS? How does the thermal stability of DalS compare to thermal stabilities of ECFP and Venus? A more reliable probe for thermal stability would be the far-UV circular dichroism (CD) spectroscopic signal of DalS without fusions. DalS is a largely helical domain and will show a strong CD signal.

      3) The final construct D-SerFS has a dynamic range of only 7%, which is a low value. It seems that the FRET signal change caused by ligand binding to the construct is weak. Is it sufficient to reliably measure D-serine levels in-situ and in-vivo? In Figure 5H in-vivo signal changes show large errors and the signal of the positive sample is hardly above error compared to the signal of the control. Figure 5G is unclear. What does the fluorescence image show? Work presented in this manuscript that assesses functionality and applicability of the developed sensor in-situ and in-vivo is limited compared to the work showing its design. For example, control experiments showing FRET signal changes of the wild-type ECFP-DalS-Venus construct in comparison to the designed D-SerFS would be helpful to assess the outcome.

      4) The FRET spectra shown in Supplementary Figure 2, which exemplify the measurement of fluorescence ratios of ECFP/Venus, are confusing. I cannot see a significant change of FRET upon application of ligand. The ratios of the peak fluorescence intensities of ECFP and Venus (scanned from the data shown in Supplementary Figure 2) are the same for apo states and the ligand-saturated states. Instead what happens is that fluorescence emission intensities of both the donor and the acceptor bands are reduced upon application of ligand.
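      Points 3 and 4 above turn on how a ratiometric readout behaves. As a minimal sketch with hypothetical intensity values (not data scanned from the manuscript), the following shows why a proportional drop in both emission bands leaves the ECFP/Venus ratio, and hence the apparent dynamic range, unchanged.

```python
def emission_ratio(ecfp_peak, venus_peak):
    """Ratio of donor (ECFP) to acceptor (Venus) peak emission intensities."""
    return ecfp_peak / venus_peak

def dynamic_range(ratio_apo, ratio_saturated):
    """Relative change in the emission ratio between apo and saturated states."""
    return (ratio_saturated - ratio_apo) / ratio_apo

# Hypothetical intensities in arbitrary units: both bands drop by 10% with ligand.
apo = emission_ratio(ecfp_peak=100.0, venus_peak=80.0)
saturated = emission_ratio(ecfp_peak=90.0, venus_peak=72.0)
change = dynamic_range(apo, saturated)  # zero: no ratiometric signal
```

      A genuine FRET change would instead move the two bands in opposite directions, so the ratio itself would shift between the apo and saturated states.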

    3. Summary: The reviewers recognize the merits of your work and your efforts to engineer a D-serine selective biosensor. However, they also raise major concerns regarding the experimental design (selection of mutations), methodology and achieved applicability. The reviewers find that the improvement in the selectivity of the engineered construct for the targeted ligand over alternative ligands is modest. They further indicate ambiguities regarding the origin of the ligand-induced fluorescence signal changes of the sensor. Other problematic aspects are the estimation of thermal stabilities and the lack of physiological signals in fluorescence imaging results that could demonstrate applicability to a biological problem.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 26 2020, follows.

      Summary

      The idea of using DNA tags to enable tracking of protein and its subsequent handling is innovative and interesting. The manuscript is well written and some of the notions presented may help drive the field forward. There is a potential to revise the manuscript for eLife as a proof of principle for the approach of using scRNA seq to track an antigen in vivo.

      Essential Revisions

      While the archiving of the psDNA-Ova conjugate in lymphatic endothelial cells matches the earlier observations of fluorescently tagged Ova, the power of this method is to work with scRNAseq to be able to identify novel cells that interact with the complex. In this regard, it's clear that the psDNA-Ova may interact differently with various cell types due to the potential for recognition of the psDNA, particularly by TLR9 as suggested. Tracking the psDNA-Ova conjugate as an immunogenic adjuvant-antigen complex is an interesting starting point. You can address this concern by reframing the goal from tracking a protein antigen to characterizing the archiving and presentation of the barcoded psDNA-Ova complex. This doesn't require any additional work, just changing the way you set it up: if your immunogen is a DNA-protein complex, you can track it by scRNA-seq. The potential to study trafficking in TLR9 KO mice in the future might open up using this method more generically for tracking the protein antigen, but this would require much more work to rule out other influences of the DNA on archiving and processing of the antigen.

      Your evidence that the psDNA really reflects the distribution of the protein is not sufficient. In Figure 1c and d it's not clear how you measure the amount of protein. Is this the protein injected or the protein detected in an immunoblot or capture immunoassay? Extending the in vitro analysis in Figure 1c and d to actually visualize the native protein with anti-Ova and the psDNA by FISH would provide a high degree of clarity that the psDNA is not acting like a tattoo that outlives the intact protein. The FISH method could take advantage of any amplification step as long as it is consistent with detection by scRNAseq. Microscopy could demonstrate that the two signals remain in the same compartments. A particularly powerful way to show the direct association would be to perform a bulk IP-seq with anti-Ova and detection of the barcode, with a test of the efficiency of depletion of the barcode from the cell lysate. A well-controlled experiment could be performed to map out the time-dependent loss of protein (IP-western or capture immunoassay), and of the free and protein-associated psDNA. It would be ideal to include a macrophage in the analysis as a highly degradative cell, in comparison to the dendritic cell and LEC, which may be more specialized to retain intact proteins.

  5. Oct 2020
    1. Reviewer #3:

      This manuscript reports a series of unique experiments with a single human participant, using an electrode array implanted in the left posterior parietal cortex several years after high-level spinal cord injury. There is a small but increasing number of groups capable of performing this type of research in humans. Most of this work has been focused on the motor system, but studies like this one, characterizing the somatosensory system (touch, in particular), have been increasingly common in the past five years. However, this is the only group focusing on this higher-level, multimodal association area of the cortex.

      Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched - even the insensate arm. This work is virtually impossible to do in monkeys. There are extensive and overlapping analyses of the relation between actual and imagined activation, and the activation arising from inputs (or imagined inputs) from the two sides of the body. Eliminating a number of these and clarifying the remainder may improve the impact.

      1) 63: "in a tetraplegic human subject recorded with an electrode array implanted in the left PPC". I am curious why the array was placed in the left PPC, given the clinical evidence for the greater role of the right side in the formation of internal, multi-modal maps. Some comments would be useful.

      2) Fig 1: It would be good to show a panel of representative spikes, perhaps with their single-trial raster responses. This could be in a new figure that includes panel 1D, which is presented in a bit of an odd order as it now stands, coming in the midst of higher-level analyses. Indicate how many trials went into the averages in 1D.

      3) 146: "we computed a cross-validated coefficient of determination (R^2 within) to measure the strength of neuronal selectivity for each body side." Even after reading the methods (further comments below) it is difficult to figure out what all these related measures reveal. At this point in the text it is very difficult to intuit how R^2 would measure selectivity.

      4) Fig 4: Several panels would be more effective if plotted as a function of distance rather than a category. 4E: This panel is borderline too small. 4F: Definitely too small. Enlarge, perhaps with fewer examples. The curves drawn on the panels do not appear to be Gaussian, but neither are they just connected points. Show whatever it was you actually used. The Gaussian assumption does not appear to be very good for the edge cases (first two, last two), which is not terribly surprising.

      5) What is added by including both classification and Mahalanobis distance?

      6) 354: "information coding evolves for a single unit. Two complementary analyses were then performed." In what sense are they complementary? What is added (besides complexity) by including both cluster analysis and PCA?

      7) Fig 8C: Despite my best efforts, I have no idea what this is showing.

      8) 753: "Classification was performed using linear discriminant analysis with the following assumptions:"

      "One, the prior probability across tested task epochs was uniform;" It is not clear what prior probability this refers to. Just stimulus site?

      "Two, the conditional probability distribution of each unit on any epoch was normal;" Is this a reference to firing rate probability conditioned on stimulus site?

      "Three, only the mean firing rates differ for unit activity during each epoch (covariance of the normal distributions are the same for each);"

      "Four, firing rates for each input are independent (covariance of the normal distribution is diagonal)."

      Does this refer to independent firing rates of neurons across stimulus sites? This seems very unlikely, given everything we know about dimensionality of cortex. Perhaps it refers to something else. Cannot all of these assumptions be tested? Were they?

      9) 768: "we computed the cross-validated coefficient of determination (R2 within) to measure how well a neuron's firing rate could be explained by the responses to the sensory fields." This needs a better description, and I may be missing the point entirely. I assume it is an analysis of mean firing rate (which should be stated explicitly) and that it uses something like the indicator variable of the linear analysis of individual neuron tuning above. In this case is this a logistic regression? As it is computed for each side independently, it would appear that there are only four bits to describe the firing of any given neuron. This would seem to be a pretty impoverished statistic, even if the statistical model is accurate.

      10) 786: "The purpose of computing a specificity index was to quantify the degree to which a neuron was tuned to represent information pertaining to one side of the body over the other." This is all pretty hard to follow. The R2 metric itself is a bit mysterious, as noted above. Within and across R2 is fairly straightforward, but adds to the complexity, as does SI, which makes comparisons of three different combinations of these measures across sides. Aside from R2 itself, the math is pretty transparent. However, a better high-level description of what insight all the different combinations provide would help to justify using them all. As is, there is no discussion and virtually no description of the difference across these three scatter plots. The critical point apparently, is that, "nearly all recorded PC-IP neurons demonstrate bilateral coding". There should be a much more direct way to make this point.

      11) Computing response latency via RF discrimination is rather indirect and assumes that there is significant classification in the first place. I suspect it will add at least some delay beyond more typical tests. Why not a far simpler and more direct test of means in the same sliding window? Alternatively, a change point analysis?
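      For readers trying to follow comment 8, the four quoted assumptions (uniform prior, normal class-conditional distributions, shared covariance, diagonal covariance) reduce linear discriminant analysis to assigning a trial to the class whose mean is nearest once each unit is scaled by its pooled variance. The sketch below illustrates that reduction under those assumptions only. It is not the authors' code, and the function names and toy firing rates are hypothetical.

```python
from statistics import fmean

def fit_diag_lda(X, y):
    """Estimate per-class means and a pooled per-unit variance.

    X: list of trials, each a list of firing rates (one entry per unit)
    y: list of class labels (e.g. stimulus sites), one per trial
    Requires more trials than classes and non-zero variance in every unit.
    """
    classes = sorted(set(y))
    n_units = len(X[0])
    means = {
        c: [fmean(x[j] for x, lab in zip(X, y) if lab == c) for j in range(n_units)]
        for c in classes
    }
    dof = len(X) - len(classes)
    pooled = [
        sum((x[j] - means[lab][j]) ** 2 for x, lab in zip(X, y)) / dof
        for j in range(n_units)
    ]
    return means, pooled

def predict_diag_lda(model, x):
    """With a uniform prior, pick the class with the smallest scaled distance."""
    means, pooled = model
    def scaled_distance(c):
        return sum((xj - mj) ** 2 / vj for xj, mj, vj in zip(x, means[c], pooled))
    return min(means, key=scaled_distance)

# Hypothetical two-unit firing rates for two stimulus sites.
X = [[1.0, 5.0], [2.0, 6.0], [8.0, 1.0], [9.0, 2.0]]
y = ["cheek", "cheek", "hand", "hand"]
model = fit_diag_lda(X, y)
```

      Correlated firing across units would violate the diagonal-covariance assumption in this sketch, which is the substance of the reviewer's question about whether the assumptions were tested.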

    2. Reviewer #2:

      General assessment:

      The study by Chivukula et al. explored a unique (n=1) dataset of multi-unit neuron recordings collected in the postcentral-intraparietal area (PC-IP) of a tetraplegic human subject taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks designed to investigate neuronal responses to both experienced and imagined touch.

      Overall I found the manuscript to be well-written, the study to be interesting, and the analysis reasonable. I do, however, think the manuscript would benefit by addressing two main, and a number of minor, issues.

      Major comments:

      1) The methods would benefit from additional rationale / supporting references throughout. Whereas it is generally clear what was done, it is sometimes less clear why certain choices were made. Perhaps some of the choices are "standard practice" when working with single unit recordings, but I was left in want of a bit more reasoning (or at least direction to relevant literature). Some examples are below:

      For the population correlation (line 723): why was the correlation computed 250 times or why were the two distributions shuffled together 2000 times?

      For the decode analysis (line 752): consider providing a reference for those interested in better understanding the "peeking" effects mentioned.

      Response latency (line 798): how were window parameters determined (for both visualization and the latency calculation). And what was the rationale for them being different - especially given that the data used for the response latency calculation was still visualized (at least in part)? Relatedly, I'd be curious to see the entire time-course for that data rather than just the shaded region of the "visualization" data. Also, it would be nice if a comment (or some data) could be provided regarding how much the latency estimates change based on these parameter choices.

      Temporal dynamics of population activity (line 830): why use a 500 ms window, stepped at 100 ms intervals instead of something else?

      Temporal dynamics of single unit activity (line 887): it is stated that the neurons were restricted to those whose 90th percentile accuracy was at least 50% to ensure only neurons with some degree of significant selectivity were used for the cluster analysis. But why these particular values? Are the results sensitive to this choice? In this section, I'd also suggest providing references for those interested in better understanding the use of Bayesian information criteria. Similarly, it is stated that PCA is a "standard method for describing the behavior of neural populations" - as such it would be nice to provide some relevant references for the reader.

      2) The manuscript would benefit from additional context in the intro as well as a more thorough discussion - particularly with respect to the imagination aspect of the experiment.

      Intro: The second paragraph did well in establishing why one might be interested in examining somatosensory processing in the PPC. It was however, less clear why the particular questions at the end of the paragraph were being posed. Perhaps an extra paragraph could be added to bridge the notion that a sizeable body of literature has been developed around somatosensory representation within the PPC and the several "fundamental" questions remaining that are of interest here.

      Discussion: The manuscript would benefit from a more thorough discussion of "imagination per se" and the various top-down processes that might be involved - as well as better positioning with respect to previous studies investigating top-down modulation of the somatosensory system. The authors state that the cognitive engagement during the tactile imagery may reflect semantic processing, sensory anticipation, and imagined touch per se - which I would not argue. But I would also expect some explicit mention of processes like attention and prediction. Perhaps these are intended to be captured by "sensory anticipation" - but, for example, attention can be deployed even if no sensation is anticipated. Importantly, it seems that imagining a sensation at a particular body site might well involve attending to that body part. That is, one may first attend to a body part before "imagining" a sensation there - and then even continue to attend there the entire time the imagining is being done. Because of this, perhaps the authors are considering attention to be a part of "imagination per se". But since attention has been shown to modulate somatosensory cortex without imagination, how can one exclude the possibility that the neuronal activity measured here simply reflects this attention component? Regardless, I think the discussion would benefit from a more explicit treatment of these top-down processes - especially given the number of previous studies showing that they are able to modulate activity throughout the somatosensory system. Some literature that may be of interest includes:

      Roland P (1981) Somatotopical tuning of postcentral gyrus during focal attention in man. A regional cerebral blood flow study. Journal of Neurophysiology 46 (4):744-754

      Johansen-Berg H, Christensen V, Woolrich M, Matthews PM (2000) Attention to touch modulates activity in both primary and secondary somatosensory areas. Neuroreport 11 (6):1237-1241

      Hamalainen H, Hiltunen J, Titievskaja I (2000) fMRI activations of SI and SII cortices during tactile stimulation depend on attention. Neuroreport 11 (8):1673-1676. doi:10.1097/00001756-200006050-00016

      Puckett AM, Bollmann S, Barth M, Cunnington R (2017) Measuring the effects of attention to individual fingertips in somatosensory cortex using ultra-high field (7T) fMRI. Neuroimage 161:179-187. doi:10.1016/j.neuroimage.2017.08.014

      Yu Y, Huber L, Yang J, Jangraw DC, Handwerker DA, Molfese PJ, Chen G, Ejima Y, Wu J, Bandettini PA (2019) Layer-specific activation of sensory input and predictive feedback in the human primary somatosensory cortex. Sci Adv 5 (5):eaav9053. doi:10.1126/sciadv.aav9053

    3. Reviewer #1:

      In this study Chivukula, Zhang, Aflalo et al. report on an extensive set of neural recordings from human PPC. It is found that many neurons are responsive to touch in specific locations. Interestingly, a considerable fraction of the neurons displayed symmetric bilateral receptive fields. Furthermore, these neurons also became active during imagined touches. The study paves the way for a deeper understanding of the role of the human PPC.

      The paper presents a wealth of analysis on an extensive set of recordings. It is generally well written and the analyses are well thought out. My main concerns are regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below.

      1) At the start of the results section it is stated that the recordings were from "well-isolated and multi-unit neurons". This seems to contradict the Methods section, which only talks about "sorted" neurons. This needs to be clarified, and if multi-units were included, it should be stated which sections this concerns as it will have implications for the results (e.g. for selectivity for different body parts). In any case, the number of neurons included in different analyses should be evident. There are some numbers in the Methods and sprinkled throughout the Results section, but for some of the analyses (e.g. clustering analysis, which was run only on a responsive subset of neurons) no numbers are provided.

      2) The linear analysis section needs further details. The coefficients are matched to "conditions" but it is not explained how. I am assuming that each touch location is assigned to a condition c; however, the way the model is described suggests that the vector X can in principle have multiple conditions active at the same time. Additionally, could the authors confirm whether it is the significance of the coefficients that determined whether a neuron was classed as responsive as shown in Figure 1? This analysis states a p-value but gives no further information on which test was run and on what data.

      3) Figure 1 C could be converted into a matrix that lists all combinations of RF numbers on either side of the body to highlight whether larger RFs on one side of the body generally imply larger RFs on the other side.

      4) I am confused about the interpretation of the coefficient of determination as shown in Figure 2A. In the text this is described as testing the "selectivity" of the neurons. To clarify, I am assuming that the "regression analysis" is referring to the linear model described in a previous section. The authors then presumably took the coefficients from this model for a single side only and tested how well they could predict the responses to the opposite side, as assessed by R^2 (Fig 2C,E). Before that in Fig 2A, they tested how well each single-side model could predict the responses. This is all fine, but the "within" comparison then simply tests how well a linear model can explain the observed responses, and has nothing to do with the selectivity of the neuron. For example, the neuron might be narrowly or broadly selective, but the model might fit equally well.

      5) Regarding the timing analysis, it is not clear to me how the accuracy can top out at 100% as shown in the figure, when the control conditions were included. Additionally, the authors should state the p value and statistic for the comparison of latencies.

      6) Spatial analysis. Could the authors provide the size of the paintbrush tip that was used in this analysis? Furthermore, as stimulation sites were 2 cm apart, it is not appropriate to specify receptive fields down to millimeter precision.

      7) Imagery: how many neurons were responsive to both imagery and real touch? Were all neurons that were responsive to imagery also responsive to actual touch? This is left vague and Figure 5 only includes the percentages per condition, but gives no indication of how many neurons responded to several conditions. Whether and how many neurons were responsive to both conditions also determines the ceiling for the correlation analysis in Figure 5D (e.g. if most neurons are responsive only to actual but not imaginary touch, this will limit the population correlation).
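      Comment 4 above distinguishes goodness of fit from selectivity. A minimal sketch of the "within" and "across" coefficients of determination, as this reviewer reads them, is given below. It assumes an indicator-variable model, so the prediction for a trial is simply the mean rate for its touch location. The function names and firing rates are hypothetical and this is not the authors' implementation.

```python
from statistics import fmean

def location_means(rates, sites):
    """Mean firing rate per touch location (the indicator-model fit)."""
    return {s: fmean(r for r, lab in zip(rates, sites) if lab == s) for s in set(sites)}

def r_squared(rates, predictions):
    """1 - SS_res / SS_tot for observed rates against predictions."""
    grand = fmean(rates)
    ss_res = sum((r - p) ** 2 for r, p in zip(rates, predictions))
    ss_tot = sum((r - grand) ** 2 for r in rates)
    return 1.0 - ss_res / ss_tot

def r2_within(rates, sites):
    """Leave-one-out R^2 on one body side (needs >= 2 trials per location)."""
    preds = []
    for i in range(len(rates)):
        train_r = rates[:i] + rates[i + 1:]
        train_s = sites[:i] + sites[i + 1:]
        preds.append(location_means(train_r, train_s)[sites[i]])
    return r_squared(rates, preds)

def r2_across(train_rates, train_sites, test_rates, test_sites):
    """Fit on one body side, evaluate on the matching locations of the other."""
    means = location_means(train_rates, train_sites)
    return r_squared(test_rates, [means[s] for s in test_sites])

# Hypothetical single-neuron rates (spikes/s) for two locations on each side.
sites = ["cheek", "cheek", "hand", "hand"]
left = [10.0, 12.0, 2.0, 4.0]
right = [11.0, 11.0, 3.0, 3.0]
within_left = r2_within(left, sites)
across = r2_across(left, sites, right, sites)
```

      In this reading, a high "within" value says only that location explains much of the trial-to-trial variance. A broadly tuned and a narrowly tuned neuron could both score well, which is the reviewer's objection to calling it selectivity.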

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Tamar R Makin (University College London) served as the Reviewing Editor.

      Summary:

      Chivukula and colleagues report an extensive set of multi-unit neural recordings from PPC of a tetraplegic patient taking part in a brain machine interface clinical trial. The recordings were collected across a set of tasks, designed to investigate neuronal responses to both experienced and imagined touch. It was found that many neurons are responsive to touch in specific locations. Most of the recorded neurons were activated bilaterally, which is consistent with earlier monkey work from this lab. Probably the most important component of the work is the analysis of the modest activation in this area that occurs simply when the participant imagines different places on her body being touched - even the insensate arm. This work is virtually impossible to do in animals, and as such offers a unique opportunity to describe neural properties for higher-level representation of touch. The study therefore paves the way for a deeper understanding of the role of the human PPC in the cognitive processing of somatosensation.

      Overall, we found the manuscript to be well-written, the study to be interesting, and for the most part the analyses are well thought out. But at the same time, the reviewers raised multiple main concerns regarding missing information and unclear descriptions of some of the analyses undertaken, which are detailed below over many major and minor comments. In addition, it was felt that there was unnecessary overlap across analyses - the first part especially contains a number of analyses that seem to make very similar points repeatedly or where it is not entirely clear what the point is in the first place. As such, there is a need to identify and cut a lot of the duplicative analyses/results and explain both the essential methods and the interpretation of the remaining results more succinctly and clearly. The key analyses could then be streamlined and better justified, ideally with an eye towards a consistent approach in both parts of the paper. There are also some major considerations regarding the contextualisation and interpretation of the key imagery results, as detailed in the first major comment below.

    1. Reviewer #3:

      The manuscript by Schonhaut et al. presents novel analyses on an impressive dataset of more than 1200 neurons across diverse areas of the human brain to investigate their modulation by hippocampal theta oscillations. They found a substantive proportion of cells phase-locked to hippocampal activity, mainly in the theta frequency band, in several areas known to be functionally related to the hippocampus, some of them receiving monosynaptic hippocampal inputs but others only indirect ones. These results extend previous reports in humans showing hippocampal interactions with these structures but at the level of mesoscopic activity and highlight the ubiquity of spike-theta timing and the importance of single-unit studies in humans. Additional analyses, detailed below, will help to give a better description of the data, provide stronger support for some of the authors' claims and clarify some issues.

      1) I assume that the dataset also includes hippocampal units, so why exclude them from the analysis? Although the main novelty is in the coupling of cells in other structures with hippocampal LFPs, it would be useful to also compare it with the coupling of local hippocampal cells.

      2) Include average power spectrum of hippocampal LFPs. Additional examples of raw LFP traces overlaid to spectrograms (perhaps in Supplementary) will help to illustrate the nature of hippocampal oscillations.

      3) The authors compared fractions of significantly modulated units and their preferred frequencies across regions. While very informative, these analyses are not sufficient to capture the richness of spike-LFP interactions likely existing in the dataset. Were there differences in the strength of phase-locking across regions? (This analysis could be added to Figure 2.) Studies in rodents have shown that theta phase-locked units in different structures have characteristic preferred firing phases (when hippocampal LFP is used as a reference). The authors can easily check whether this is also the case in their data. They should include both pooled data statistics of mean phases across regions and single neuron examples of firing probability by LFP phase (such examples could be added to the single unit plots in Figure 1).

      4) Did phase-locked and non phase-locked units have different properties? The authors can compare if they differ in basic properties such as mean firing rate, waveform width, inter-spike intervals, burstiness, etc., as it has been reported in other studies in non-human primates and rodents. These analyses could be extended to show if units with different properties also differ in their preferred phase-locking frequency, or phase. It would be very interesting if these analyses reveal the existence of heterogeneous cellular populations with different relation to hippocampal theta, even if the single-unit isolation quality is limited due to the low density recordings. In relation to this, authors should also plot unit auto-correlograms. ACGs can be computed for all the spikes, but also only for the strongly phase-locked spikes, to show if, at least during periods of strong oscillatory activity, some units show rhythmicity.

      5) To better interpret the results in Figure 4, it would be important to know whether the recording sites in both hippocampi were from the same sub-region and a similar location along the longitudinal hippocampal axis in each subject, and what the degree of synchrony was between the LFPs in both hemispheres. Coherence or phase-locking between LFPs across hemispheres should be computed, and the power spectra for both of them shown.

      6) In Figure 4C-D it seems that phase-locking strength across hemispheres was not correlated but preferred frequency was. This should be quantified and mentioned in text before moving to the correlation in Figure 4E.

      7) The analysis in Figure 5D should be complemented by also checking the LFP-LFP phase-locking between the local region and the hippocampus. Did periods of high LFP power correlation also reflect enhanced phase-phase coupling? Were the structures also more phase-synchronous during periods of stronger spike-LFP coupling? These analyses could provide more direct support for the authors' interpretation in line with the CTC hypothesis.

      8) Was there any relation of the "strongly phase-locked" periods with global variables reflecting brain state (e.g. drowsiness versus attention to the task, etc.) or with the firing dynamics of the units (instantaneous firing rate or inter-spike intervals)?

    2. Reviewer #2:

      In this study, Schonhaut et al., describe the phase locking statistics of cortical and subcortical neurons with respect to hippocampal local field potential (LFP) recorded in 18 epilepsy patients undergoing seizure monitoring. Nearly 30% of extrahippocampal neurons showed phase locking to some bandpassed hippocampal signal. Amygdalar and entorhinal neurons were more likely to be phase locked, as compared to neurons recorded in other neocortical sites. Most neurons showed the strongest phase locking to hippocampal theta (2-8 Hz), though neocortical and amygdalar neurons tended to phase lock to lower theta bands. Spikes that were phase locked to hippocampal rhythms occurred during local LFP-states that showed moderate correlations with the spectral patterns observed in the hippocampus. These data are interpreted within the broader "communication through coherence" hypothesis.

      Large N, multi-region, single unit studies from humans are rare and the kind of mesoscopic descriptive analyses provided here serve as an important bridge between the large rodent literature on hippocampal physiology and human physiology and cognition. That said, there are some weaknesses in the analyses that could be addressed. Also, a deeper discussion of the biological origin of human theta is merited in the discussion to address alternate explanations - beyond communication through coherence - of the data.

      A similar statistical mistake was made several times. The authors' logic goes like this: find the argmax in one sample, take the argument that generated that max, and use that to sample in another condition, and report that the max is higher in the first condition than the second. For example, on pg. 6 "This is difficult to reconcile with our results, in which 248/362 neurons (68.5%) phase-locked more strongly to hippocampal LFPs than to locally-recorded LFPs at their preferred hippocampal phase-locking frequency." The same flaw can be seen in Figure 5, where the spikes are sub-sampled to occur during strong phase locking in one condition, thus almost guaranteeing high power in the frequency bands that generated that strong phase locking (which was observed). This is a case in which cross-validating the data may be useful. The authors could take a subset of the hippocampal data to define the preferred frequency, and then test phase locking on the held out data from the hippocampus and cortex.
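
      The suggested cross-validation could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes spike phases have already been extracted at each frequency and stored in two hypothetical dictionaries (`hpc_phases` and `local_phases`, each mapping frequency to a list of phases in radians, indexed by the same spikes), and it uses the mean resultant length as a stand-in phase-locking measure.

```python
import cmath
import random


def mean_resultant_length(phases):
    """Length of the mean unit vector: 0 for uniform phases, 1 for perfect locking."""
    if not phases:
        return 0.0
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def held_out_comparison(hpc_phases, local_phases, seed=0):
    """Select the preferred frequency on one half of the spikes, score on the other half.

    hpc_phases, local_phases: hypothetical dicts {frequency_hz: [phase per spike]},
    where index i refers to the same spike in every list.
    Returns (preferred_frequency, held-out hippocampal MRL, held-out local MRL).
    """
    n_spikes = len(next(iter(hpc_phases.values())))
    idx = list(range(n_spikes))
    random.Random(seed).shuffle(idx)
    train, test = idx[: n_spikes // 2], idx[n_spikes // 2:]

    def score(phases_by_freq, freq, subset):
        return mean_resultant_length([phases_by_freq[freq][i] for i in subset])

    # The argmax uses training spikes only, so it cannot inflate the held-out score.
    preferred = max(hpc_phases, key=lambda f: score(hpc_phases, f, train))
    return (
        preferred,
        score(hpc_phases, preferred, test),
        score(local_phases, preferred, test),
    )
```

      Because the frequency is chosen on spikes that are never scored, the hippocampal value no longer benefits from the selection step, and the comparison with the local LFP at that frequency is made on equal footing.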

      The relationship between power and phase locking is not fully controlled in this paper. The phase seems to be calculated irrespective of whether there is any instantaneous power at that frequency band, introducing noise. This will bias away from finding significant phase locking to frequency bands that occur transiently. Therefore, I recommend defining some threshold of the existence of the spectral signal prior to using that signal to calculate phase.
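
      One way to apply such a threshold is sketched below. The sketch assumes that the instantaneous phase and band power of the LFP at each spike time are already available (for example from a Hilbert or wavelet transform, not shown here); the function name and the percentile cutoff are hypothetical and chosen only for illustration, not as a recommended value.

```python
import cmath


def power_gated_locking(spike_phases, spike_powers, all_powers, percentile=50.0):
    """Mean resultant length using only spikes fired while band power exceeds a cutoff.

    spike_phases: LFP phase (radians) at each spike time
    spike_powers: LFP band power at each spike time
    all_powers:   band power across the whole recording, used to set the cutoff
    percentile:   illustrative cutoff; spikes at or below it are discarded
    Returns (mean resultant length, number of spikes retained).
    """
    ranked = sorted(all_powers)
    # Nearest-rank percentile of the recording-wide power distribution.
    k = max(0, min(len(ranked) - 1, int(round(percentile / 100.0 * len(ranked))) - 1))
    cutoff = ranked[k]
    kept = [ph for ph, pw in zip(spike_phases, spike_powers) if pw > cutoff]
    if not kept:
        return 0.0, 0
    mrl = abs(sum(cmath.exp(1j * ph) for ph in kept)) / len(kept)
    return mrl, len(kept)
```

      Returning the number of retained spikes alongside the locking value is deliberate: the mean resultant length is biased upward for small samples, so gated and ungated estimates are best compared at matched spike counts.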

      A related point has to do with the nature of the theta rhythm in the human. There has been considerable controversy over the years as to whether this is a comparable signal to that studied in the rodent. Based on the citations in this manuscript, and the nomenclature of the spectral band, the authors seek to make explicit the commonality of the underlying physiology, or function. Rodent theta is a sustained rhythm, while primate theta seems to come in bouts, perhaps even related to sampling statistics, such as saccades, leading to the suggestion that the apparent theta may be better thought of as semi-rhythmic evoked responses. How long were the bouts of high theta power? Was eye movement tracked? If so it would be important to relate the signal to eye movements. If the low frequency signal is phase locked to eye movement and potentially reflects semi-rhythmic information arriving to (from?) the hippocampus, then a stronger case could be made that hidden "third parties" synchronize the apparent communication through coherence observed here, and in fact there may be no communication at all.

      The authors dedicate much of their discussion to relating the current result to the communication through coherence analyses. Oddly, LFP coherence was never addressed. A strong prediction of the current framing would be that: when coherence is high, phase locking should be high, and higher than other moments when power in either region is high but coherence is not observed. The authors should directly measure how phase locking is modulated by coherence.

      The authors also lump together biological entities that should have different phase locking behaviors. The amygdala is not a monolithic region, does phase locking differ by nucleus? Also, do fast spiking inhibitory cells differ from excitatory cells? The authors should relate their phase locking measure to mean firing rate to show that it is insensitive to lower level cell statistics. This is important since the conclusions of the study would be quite different if neurons in the entorhinal cortex had high rates which artifactually drove up phase locking values.

    3. Reviewer #1:

      Hippocampal theta oscillations are among the most prominent rhythms in the mammalian brain. Extensive research in rodents has shown that neurons not only within the hippocampus but in widespread cortical areas can be phase-locked to hippocampal theta. Such cross-regional communication within theta frames has been postulated to be the foundation of many hippocampal operations. While previous studies in humans have documented the relationship between LFP theta and spiking in the hippocampus, coupling between hippocampal LFP and more remote cortical areas has not been demonstrated in human subjects. This is the topic of the present work. The authors show that spikes of single (and mostly multi-unit) neurons in multiple cortical regions both in the same and opposite hemispheres are phase locked to transient occurrences of hippocampal theta LFP in the 2-6 Hz range. However, phase-locking is stronger in structures known to be part of the 'limbic system', such as the amygdala and entorhinal cortex. Theta phase locking was stronger to hippocampal than to local LFP and the magnitude of spike phase locking increased when the power of theta increased, associated with increased high frequency power. The results are straightforward and the analysis methods are reliable. The novel information is limited but informative and documents a missing aspect of theta communication in the human brain.

      Comments:

      1) Given the simple message, the text is a bit long with many repetitions and loose ends. This applies to both Introduction and Discussion. Potential implications to learning, etc are interesting but the findings do not provide additional clues, thus those aspects of the discussion are mainly distractions. Instead, perhaps the authors would like to discuss potential mechanisms of remote unit entrainment. They are talking about multi-synaptic pathways but these are unlikely to be a valid conduit. Instead, the septum, entorhinal cortex or retrosplenial cortex, with their widespread projections, may be responsible for coordinating both hippocampal and neocortical areas.

      2) Arguably, the weakest part of the manuscript is the lack of hippocampal neurons. The authors refer to their own previous papers, but in a story which compares hippocampal theta oscillations with remote unit activity, it is strange that the magnitude of theta phase-locking to local hippocampal neurons is not available for comparison.

      3) How was the hippocampal LFP reference site chosen and did it vary substantially from subject to subject? Anterior or posterior locations?

      4) The authors list 1233 single neurons but in the discussion they make it clear that most of them were multiple neurons. This should be emphasized up front and may be used as an excuse why the authors did not attempt to separate pyramidal cells from interneurons (interneurons have a much higher propensity to be entrained by projected rhythms).

      5) Given that units were mixed, a logical extension would be to examine how hippocampal theta phase modulates high gamma in neocortical areas. This could potentially yield a much larger data base, targeting the same question.

      6) In the Discussion, the authors suggest that cross-regional theta phase coupling could be related to learning and other cognitive performance. However, spike-LFP coupling and coherence is confounded by LFP power increase and the authors cite Herweg et al., 2020 which did not find a relationship between theta power and memory performance. Is it then not logical to assume that cross-regional coupling may also not be related to memory?

      7) Line 36. "Long-term potentiation and long-term depression in the rodent hippocampus are also theta phase-dependent (Hyman et al., 2003)." Pavlides et al. (Brain Res 1988) or Huerta and Lisman (Neuron 1995) are perhaps more relevant references here.

      8) Line 82: "significant neocortical and contralateral phase-locking suggests". This is a strange phrase. Perhaps "significant phase locking of neurons in the neocortex in both hemispheres" or similar would be a better formulation.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      This is a very intriguing paper showing how hippocampal local field potentials couple with the activity of other cortical regions. This mechanism has been and continues to be extensively studied in other mammals, and thus its existence and relevance in humans is exciting.

    1. Reviewer #2:

      This human psychophysics study claims to provide more evidence in support of the popular notion that visual processing of faces may involve partially independent processes for the analysis of static information such as facial shape versus dynamic information such as facial expression. In this respect the scientific hypotheses and conclusions are not novel, although some of the methods (parametric variation of facial expression dynamics using computer-generated animations) and analyses (Bayesian generative modeling of expression dynamics) are relatively new. Although the science is rigorously conducted, the paper currently feels heavy on statistics and technical details but light on data, compelling results and clear interpretation. However, the main problem is that the study fails to provide sufficient controls to support its central claims as currently formulated.

      Concerns:

      1) A central claim of the paper and the first words in the title are that the behavior studied (categorization of facial expression dynamics) is "shape-invariant". However, the lack of variation in facial shapes (n = 2) used here limits the strength of the conclusions that can be drawn, and it certainly remains an open question whether representations of facial expression dynamics are truly "shape-invariant". A simple control would have been to vary the viewing angle of the avatars, in order to dissociate 3D object shapes from their 2D projections (images). The authors also claim that "face shapes differ considerably" (line 49) amongst primate species, which is clearly true in absolute terms. However, the structural similarity of simian primate facial morphology (i.e. humans and macaques used here) is striking when compared to various non-primate species, which naturally raises questions about just how shape-invariant facial expression recognition is. The lack of data to more thoroughly support the central claim is problematic.

      2) As the authors note, macaque and human facial expressions of 'fear' and 'threat' differ considerably in visual salience and motion content - both in 3D and their 2D projections (i.e. optic flow). Indeed, the decision to 'match' expressions across species based on semantic meaning rather than physical muscle activations is a central problem here. Figure 1A illustrates clearly the relative subtlety of the human expression compared to the macaque avatar's extreme open-mouthed pose, while Fig 1D (right panels) shows that this is also true of macaque expressions mapped onto the human avatar. The authors purportedly controlled for this in an 'optic-flow equilibrated' experiment that produced similar results. However, this crucial control is currently difficult to assess since the control stimuli are not illustrated and the description of their creation (in the supplementary materials) is rather convoluted and obfuscates what the actual control stimuli were.

      The results of this control experiment that are presented (hidden away in supplementary Fig S3C) show that subjects rated the equilibrated stimuli at similar levels of expressiveness for the human vs macaque avatars. However, what the reader really needs to know is whether subjects rated the human vs macaque expression dynamics to be similarly expressive (irrespective of avatar)? My understanding is that species expression (and not species face shape) is the variable that the authors were attempting to equilibrate for.

      In short, the authors have not presented data to convince a reader that their equilibrated stimuli resolve the obvious confound in their original stimuli (namely the correlation between low level visual salience - especially around the mouth region- and the species of the expression dynamics).

      3) This paper appears to be the human psychophysics component of work that the authors have recently published using the macaque avatar. The separate paper (Siebert et al., 2020 - eNeuro) reported basic macaque behavioral responses to similar animations, while the task here takes advantage of the more advanced behavioral methods that are possible in human subjects. Nevertheless, the emphasis of the current paper on cross-species perception raises the question: how do macaques perceive these stimuli? Do the authors have any macaque behavioral data for these stimuli (even if not for the 4AFC task) that could be included to round this out? If not, I recommend rewording the title since its current grammatical structure implies that the encoding is "across species", whereas encoding of species (shape and expression) was only tested in one species (humans).

    2. Reviewer #1:

      Overall assessment:

      The strengths of this paper are the novel cross species stimuli and very interesting behavioural findings, showing sharper tuning for recognising human expression sequences compared to monkey expressions. Technically, the paper is of a very high quality, both in terms of stimulus creation, but also in terms of analysis. Appropriate control experiments have been run, and in my view, the only concern is the way the results are presented, which I believe can be dealt with by restructuring the text. Other than that, I feel this would make a very nice contribution to the field.

      Concerns:

      The only major concern that I have is that the main take-home messages do not come through clearly in the way the Results section is currently structured. I found there was still too much technical detail - despite considerable use of Supplementary Information (SI) - which made extracting the empirical findings quite hard work. The details of the multinomial regression, the model comparisons (Table 1) and even the Discriminant Functions (Fig 2), for example, could all be briefly mentioned in the main text, with details provided in Methods or SI. These are all interesting, but I feel the focus should be on the behavioural findings, not the methods.

      I would suggest using the Discussion as a guide (this clearly states the key points) making sure the focus is more on Figure 3 and then working through the points more concisely.

      Obviously, this can be achieved simply by re-writing and does not take away from the significance of the work in any way. While the quality of the English is generally very high, some very minor wording issues could also be dealt with at this stage.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The paper employs novel cross species stimuli and a well-designed psychophysical paradigm to study the visual processing of facial expression. The authors show that facial expression discrimination is largely invariant to face shape (human vs. monkey). Furthermore, they reveal sharper tuning for recognising human expressions compared to monkey expressions, independent of whether these expressions were conveyed by a human or a monkey face. Technically, the paper is of a very high quality, both in terms of stimulus creation, but also in terms of analysis.

    1. Author Response

      We thank the editors and the reviewers for a number of useful criticisms and suggestions, and for the opportunity given to us, as authors, to publicly reply to the comments. This is a useful exercise, which brings to the attention of the reader lights, but also shadows of the reviewing process, and that we hope will lead in future to develop a better approach to it. Here, we will reply to a number of selected issues which appear to us to be of particular relevance.

      Reviewer 1

      Reviewer 1 disqualifies our work altogether, based on her/his statement that: “In the paper by Mercurio et al, the authors examine the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. A drawback of their study is that these findings have been reported previously by the group (Favaro et al. 2009; Ferri et al. 2013).”

      The statement reported in bold is simply not true. In Favaro et al. 2009 (Nat Neurosci 12:1248), we demonstrated that nes-Cre-mediated Sox2 deletion leads to defects in postnatal, but not embryonic, hippocampal neurogenesis. In Ferri et al. 2013 (Development 140:1250), we demonstrated that FoxG1Cre-mediated Sox2 deletion leads to defective development of the VENTRAL forebrain. The presence, at the end of gestation, of hippocampal defects was just mentioned in one sentence: - “the hippocampus, at E18.5, was severely underdeveloped (not shown)” (line 1, page 1253)-, and not analyzed any further. In the present work, we describe in detail, starting from E12.5, up to E18.5, how the hippocampal defect develops, and undertake a detailed study of downstream gene expression and cellular defects arising in mutants.

      It is unfortunate that the reviewer further insists on the same misleading, and unfounded statement – see her/his comment 3, highlighted in bold character: “the authors state "...remarkably, in the FoxG1-Cre cKO, the DG appears to be almost absent (Figure 2A).". The question is why this finding is remarkable as it already was published in (Ferri et al. 2013)”. As mentioned above, we only remark, in Ferri et al., that the hippocampus was severely underdeveloped (not shown).

      Reviewer 2

      Reviewer 2 states, already at the beginning: “I am concerned about a major confounding issue (see below).” ... “The authors rely on Foxg1-Cre for their main evidence that very early deletion of Sox2 leads to near loss of the dentate. However, it doesn't appear that the authors are aware that Foxg1 het mice have a fairly significant dentate phenotype (see this paper).”

      The reviewer refers to the fact that, to delete Sox2, we need to express a Cre gene “knocked-in” into the Foxg1 gene; hence, heterozygous and homozygous Sox2 deletions will be accompanied by heterozygous loss of Foxg1. If Foxg1 is important for hippocampus development, the absence of a Foxg1 allele will affect the phenotype.

      Unfortunately, the statement of the reviewer is subtly misleading, and leads the reader who has not checked the data reported in the cited paper (Shen et al., 2006) to erroneously believe that heterozygous loss of Foxg1 may be responsible for the effects that we report upon homozygous Sox2 deletion. In contrast to the statement made by the reviewer, the paper cited by the reviewer documents that, while heterozygous loss of Foxg1 leads to important POSTNATAL dentate gyrus abnormalities, the PRENATAL development of the dentate gyrus is essentially normal (Figure 6) (“a subtle and inconsistent defect” of the ventral blade observed in about 50% of the mice at E18.5, according to the authors of that paper). Compare “subtle and inconsistent defect” by Shen et al. with “fairly significant dentate phenotype”, as stated by the reviewer. As our paper is entirely focused on defects seen in PRENATAL development in Foxg1Cre; Sox2 mutants, the subtle and inconsistent defects seen by Shen et al. are in sharp contrast with the deep defects seen in embryonic development in our Foxg1Cre;Sox2-/- mutants, and in agreement with the similarity we observe between wild type and heterozygous Foxg1Cre;Sox2+/- embryos (page 5, lines 140-145, of the version of the Full Submission for publication on August 30). An example showing the comparison between a Wild type, a FoxG1 +/- heterozygote;Sox2+/- heterozygote and a FoxG1 heterozygote;Sox2-/- homozygote is now shown in the accompanying figure.

      Obviously the incorrect statement kills our paper by itself. If the reviewer had doubts, we could have provided plenty of additional data demonstrating the lack of significant differences between Foxg1CRE Sox2+/- and wild type (Sox2+/+) embryos, as we stated in our paper.

      There is an additional interesting comment by Reviewer 2 (see points 2 and 6). The reviewer argues that “The only two direct targets they find don't seem likely to be important players in the phenotypes they describe”. The Reviewer excludes the Gli3 gene (a direct Sox2 target, see Fig. 6), as a possible important player, in spite of the observation that Gli3 is decreased, at early developmental stages, in the cortical hem (Figure 5). The reviewer says “The Gli3 [mutation] phenotypes that have been published are quite distinct from this”. We object that the Gli3 phenotypes are indeed more severe than the phenotype of our mutant, and include failure to develop a dentate gyrus. However, this observation does not preclude the hypothesis that the decreased expression of Gli3 in our mutant is directly responsible for the phenotype we observe. The more severe phenotype of the Gli3 mutants is in fact due to a germ-line null mutation, whereas, in our Foxg1-Cre Sox2 mutants, we observe only a reduction of Gli3 expression, around E12.5 (Fig. 5), that is compatible with a less severe dentate gyrus phenotype. The Reviewer adds that Wnt3A, based on the phenotype of the knock-out mice, similar to that of our Sox2 deleted mice, is a more relevant gene, but it is not a direct target of Sox2. However, the fact that Wnt3A is apparently not directly regulated by Sox2 is not necessarily to be considered a “minus”; Sox2, being a transcription factor, is expected to directly regulate a multiplicity of genes, whose expression will affect the expression of other genes. Indeed, we presented in Fig 6D the hypothesis that decreased expression of Gli3 may contribute to decreased expression of Wnt3A, as already proposed by Grove et al. (1998) based on the observation that Gli3 null mutants lose the expression of Wnt3A (and other Wnt factors) from the cortical hem. 
      The additional suggestion made by the Reviewer, in the context of the Wnt3A hypothesis, to investigate LEF1, as a potential direct Sox2 target, and its expression, is certainly interesting, but, as stated by the reviewer, LEF1 is downstream to Wnt3A, and, by itself, its hypothetical regulation by Sox2 would not explain the downregulation of Wnt3A. Moreover, we already have evidence that Sox2 does not directly regulate Wnt3A (unpublished).

      Reviewer 1 and 2

      Both Reviewer 1 and 2 have questions about the timing of Sox2 ablation in the Sox2 mutants obtained with the three different Cre deleters. As we state in the text (pages 4, 6), Foxg1-Cre deletes at E9.5 (Ferri et al., 2013; Hébert and McConnell, 2000); Emx1-Cre deletes from E10.5 onwards, but not at E9.5 (Gorski et al., 2002; see also Shetty AS et al., PNAS 2013, E4913); Nestin-Cre deletes at later stages, around E12.5 (Favaro et al. 2009).

      Reviewer 3

      We thank Reviewer 3 for the useful considerations and suggestions, which constructively help to improve the paper.


      Evidence that Sox2+/-;FoxG1+/- hippocampi at E18.5 do not significantly differ from wild type (Sox2+/+, FoxG1+/+) controls. In contrast, Sox2-/-;FoxG1+/- hippocampi are severely defective. (A) GFAP immunofluorescence at E18.5 on coronal sections of control and FoxG1-Cre cKO hippocampi (controls n=6, mutants n=4). (B) In situ hybridization at E18.5 for NeuroD (controls n=4, mutants n=3) on coronal sections of control and FoxG1-Cre cKO hippocampi. Arrows indicate dentate gyrus (DG); note the strong decrease of the dentate gyrus, and the radial glia (GFAP) disorganization in cKO. The Sox2flox/flox genotype corresponds to wild type mice (Sox2+/+). The Sox2+/flox ; FoxG1Cre genotype corresponds to Sox2+/-; FoxG1+/- controls. The Sox2flox/flox ; FoxG1Cre genotype corresponds to Sox2-/-; FoxG1+/- mutants.

    2. Reviewer #3:

      This paper investigates the role of Sox2 in early hippocampal development. Previously the authors investigated conditional knockout mice using a Nestin-Cre line and found few phenotypes. The authors hypothesised that Sox2 may have greater impact on earlier developmental stages. The authors used a similar approach in a previous paper (Ferri et al., 2013) studying the ventral forebrain. To test this in the dorsal telencephalon they generated conditional knockout mice using both Emx1-Cre and FoxG1Cre driver lines. These lines displayed more significant phenotypes in the hippocampus, particularly in the cortical hem and dentate gyrus, and were most severe in the FoxG1Cre cross.

      The study is well executed and carefully thought through. Appropriate controls have been included for all experiments.

      In Figure 6, the data on Gli3 has been verified with additional luciferase data. The data on Cxcr4 has been previously published and has not been further verified with luciferase analysis. Including panel C in the figure may not be justified unless additional data is included to verify the result. It could be referred to in the discussion.

      In addition, related to Figure 6, Bertolini et al., 2019, identified a number of Sox2 responsive enhancers, expressed in the dorsal telencephalon, but it is not clear why these are not incorporated into the model. Further justification in the discussion would be helpful. The authors may also consider discussing how Emx2 fits into their model, since they previously showed it was a negative regulator of Sox2 (Mariani et al 2012) and is required for hippocampal development (eg: Pellegrini et al., 1996; Yoshida et al 1997; Zhao et al 2006).

      Regarding the interpretation of the results in Figure 7, previous work by the authors showed that early deletion of Sox2 using a Bf1Cre driver line resulted in severe developmental defects of the ganglionic eminences and therefore GABAergic interneurons. Is the development of GABAergic interneurons affected in the FoxG1Cre cross? It would be preferable to include some analysis of this, or at least a discussion of this issue in the context of the electrophysiology results.

      The authors use an eYFP reporter line in Figure 1- supplement. If they have similar data demonstrating Cre activity with the eYFP reporter crossed into the FoxG1CreXSox2flox/flox and Emx1CreXSox2 flox/flox it would be good to add this. It would demonstrate cell autonomous knockout versus non-cell autonomous knockout of Sox2 and may help with the interpretation of Sox2 function.

    3. Reviewer #2:

      This study examines the phenotype of early deletion of Sox2 and shows that there is a major dentate phenotype when fl-Sox2 mice are crossed to Foxg1-Cre when compared to Emx1-Cre or Nestin-Cre. This is a novel phenotype, but I don't think the authors have investigated it adequately to understand its basis. In addition, I am concerned about a major confounding issue (see below). I believe significant additional studies are needed to establish the specific role of Sox2 here. Below I list the major concerns.

      1) The authors rely on Foxg1-Cre for their main evidence that very early deletion of Sox2 leads to near loss of the dentate. However, it doesn't appear that the authors are aware that Foxg1 het mice have a fairly significant dentate phenotype (see this paper). The Foxg1-Cre line generated by Hebert and used by the authors is a knock-in allele that inactivates the endogenous Foxg1 gene. The authors need to address whether the phenotype they observe is actually due to loss of Sox2 alone at E9.5 vs the combined loss of Sox2 and a copy of Foxg1. In particular, could this explain the difference between Emx1-Cre and Foxg1-Cre lines? If this is the explanation for the difference, it isn't clear to me that the story really holds together without bringing in far more complex compound mutant explanations.

      2) The phenotype as described by the authors appears to be most compatible with the published Wnt3a mutant phenotype - perhaps a hypomorphic version makes the most sense or a near phenocopy of the Lef1 mutant. Given this, it appears to me this is really a hem phenotype and is likely explained by the loss of Wnt3a predominantly. Yet the authors don't show direct regulation of Wnt3a by Sox2 - the study would be dramatically enhanced by addressing the mechanism of loss of Wnt3a expression. In addition, examining the expression of Lef1 might reveal the more proximal mechanism of loss of DGC than simply less Wnt3a. This might also be another potential direct target of Sox2 since Lef1 expression is regulated by Wnt signaling but also by other morphogenic signals and could be a Sox2 target.

      3) The authors provide little specific analysis of hippocampal subfield specific markers. Their assumption is that the cells that are in the malformed dentate are granule neurons but they don't use any specific markers of DGC (eg Prox1). Instead they rely on cell position and expression of NeuroD (which is nonspecific). Similarly, it would make sense to examine other markers of mossy cells and CA3, which are also in the same region as DGC and made by adjacent neuroepithelium.

      4) Much of the study relies on the assumption that Nestin-Cre is an efficient deleter in the entire hippocampus yet there is no direct evidence of this. The authors could easily determine when Sox2 expression is lost in the various Cre-deleter lines using antibodies.

      5) I think the electrophysiology section isn't very useful or important. We know that mice with major developmental defects in the DG and hippocampus will have changes in circuit physiology. There is nothing specific about this phenotype, nor does it shed light on the important biology here.

      6) The only two direct targets they find don't seem likely to be important players in the phenotypes they describe, thus, it seems that they don't necessarily address the biology here. The Gli3 phenotypes that have been published are quite distinct from this.

      7) Some of the dentate phenotype is no doubt due to defects in CR cell production or development and this indirect effect has been seen in many other mutants that affect CR cell production (ie a disorganized dentate). It is hard to see how this part of the phenotype, which is likely due to the hem defects (the neuroepithelium that makes the CR cells) is helping us to understand the fundamental aspects of this phenotype.

    4. Reviewer #1:

      In the paper by Mercurio et al, the authors examine the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. A drawback of their study is that these findings have been reported previously by the group (Favaro et al. 2009; Ferri et al. 2013). In the current study the authors show additional examples of SOX2 target genes, which are dysregulated in the cortical hem upon early SOX2 deletion. However, as no mechanistic insights into how this may affect dentate gyrus development are provided, the general novelty of the study is limited.

      Comments:

      1) The language of the manuscript needs to be improved.

      2) Using different Cre-drivers the authors aim to delete SOX2 at different developmental stages. What references demonstrate that EMX-Cre first deletes SOX2 after E10.5? I cannot find where in Tronche et al. 1999 it is shown that Nes-Cre deletes after E11.5.

      3) At line 149 the authors state "...remarkably, in the FoxG1-Cre cKO, the DG appears to be almost absent (Figure 2A).". The question is why this finding is remarkable as it already was published in (Ferri et al. 2013).

      4) Line 154 "In the FoxG1-Cre cKO, Reelin expression (marking CRC) is greatly reduced, and a HF is not observed (Figure 2D);...". This statement has no support from Figure 2D.

      5) Some of the images presented in Figure 4 are of such poor quality that they are hard to evaluate.

      6) In Figure 6 the authors show that SOX2 interacts with the promoter region of the Cxcr4 gene and that the SOX2 bound enhancer is active in the developing Zebrafish brain. These data can be removed as they have been published previously in Bertolini et al. 2019.

    5. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript. Joseph G Gleeson (Howard Hughes Medical Institute, The Rockefeller University) served as the Reviewing Editor.

      Summary:

      The positive aspects of the paper are the examination of the role of SOX2 in the development of mouse hippocampal dentate gyrus. Using conditionally mutant SOX2 mice the authors show that early, but not late, deletion of SOX2 leads to developmental impairments of the dentate gyrus. There were substantial criticisms of the work, most importantly that the work does not advance the field as much as is expected for a high-ranking journal, considering prior publications, and that there could be some difficulties interpreting the data as presented.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 6 2020, follows.

      Summary

      The reviewers and editors were enthusiastic about the major conclusion of the study: that NusG-dependent pausing is an important factor that promotes Rho-independent transcription termination in Bacillus subtilis. Nonetheless, we felt that this conclusion can be strengthened with additional analysis and experiments that hopefully are not terribly burdensome. We believe that these additions would bring the paper to the level required for publication in eLife. The essential revisions are detailed below.

      Essential Revisions

      1) While we appreciated the careful follow-up work, we felt that the major conclusion could be strengthened by a more in-depth analysis of the genome-wide data, assuming those data support the role of NusG-dependent pausing in termination. Reviewers 1 and 2 give specific suggestions in their reviews. Some of these relate to the way comparisons are made between datasets, and others address specific scientific questions. Of particular relevance are analyses that test whether NusG stimulates termination at sites with (i) weak terminal base-pairs, and (ii) gaps in the U-tract. Any other analyses of the genome-wide data that support the importance of NusG-stimulated pausing in termination would be valuable to include. For example, is there any evidence that NusG-dependent pause sites identified by NET-seq are associated with sites of termination?

      2) Termination sites in vitro are consistently downstream of those observed in vivo. While it is reasonable to hypothesize that this difference is due to trimming by exonucleases, there is no experimental evidence presented to support this. To test the hypothesis, we suggest mapping termination sites by 3' RACE in RNase mutant strains for one or two of the terminators characterized in the paper. B subtilis 3' exonucleases are defined, and mutant strains have been described (e.g., Oussenko et al., 2005 J Bacteriol 187:2758; Liu et al., 2014 Mol Microbiol 94:41).

      3) Add a figure showing the model described in the discussion (lines 332-84) for the proposed roles of NusG and NusA in intrinsic termination.

      4) Broaden the discussion of how the study relates to prior work on intrinsic termination.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 18 2020, follows. The decision letter below relates to version 1 of the preprint.

      Summary

      This work has the potential to make an important original contribution to the aging biology literature and includes interesting new findings regarding the role of 17a-E2 in lifespan extension. The authors provide persuasive evidence of the role of classic estradiol receptor ERa in 17a-E2 signaling and mediating the metabolic effects of 17a-E2. However, far reaching and unsupported claims are made regarding the central tenets of this manuscript as is stated in their abstract: namely sex-specific differences and that of tissue specific mediation of 17a-E2 effects facilitating the therapeutic benefit. While the tissue specific data are suggestive that the liver and hypothalamus facilitate the beneficial effects of 17a-E2, these data do not appear to have been appropriately statistically analyzed. These need to be addressed prior to publication.

      The authors need to provide additional evidence for the tissue-specific nature of 17a-E2's effects on metabolism and lifespan, as well as correct/redo/undertake their statistical analyses using appropriate methods. Moreover, the experimental design needs to be more clearly documented and the data presented in a legible and interpretable manner. As such, the manuscript requires a complete reanalysis of current data and a complete rewrite, before we could consider its publication in eLife.

      Essential Revisions

      1) The experimental design needs to be more clearly documented, and the data presented in legible figures with detailed legends that alert the reader to the salient methods and findings. The authors should include a brief rationale for each of the experiments used and their choice of cells, and provide a concise description of the methods used and the sample size. The authors need to articulate how the significance for RNA-seq and ChIP-seq data was assessed.

      2) If the authors already have the complementary data on female mice on a high fat diet (rather than on normal chow), including them would allow a direct comparison with the male mice on a high fat diet, support their sex-specific claims, and strengthen the paper considerably. If not, their claims on sex specificity should be removed from this paper, as one cannot directly compare females on normal chow with males on a HFD.

      3) The authors state that their ChIP-seq data reveal nearly identical ERa binding patterns with 17a-E2 and 17b-E2; however, these have not been rigorously analyzed. The authors need to undertake actual statistical comparisons across groups rather than rely solely on qualitative assessments.

      4) The authors need to provide additional evidence for the tissue-specific nature of 17a-E2's effects on metabolism and lifespan. While they present convincing data that administration of 17a-E2 has direct effects on liver and hypothalamus, they have not provided definitive evidence that these tissues are directly responsible for the beneficial effects on metabolism and lifespan. The abstract and results section should be appropriately tempered.

      5) The authors claim that 17a-E2 reverses cellular senescence in the liver but provide limited conclusive supporting data. The authors need to provide additional evidence to support these claims.

      6) The motif comparison for the ChIP-seq data in Fig. 1B is not described sufficiently to allow for evaluation by readers. Moreover, the authors do not appear to have employed appropriate statistical analyses. Please describe the statistical tests employed.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on September 21 2020, follows.

      Summary

      In this technical report, the authors use a previously-described mouse reporter construct to measure Cre-mediated recombination events, and repurpose it to examine Cas-induced double-strand break (DSB) repair gene editing events. All reviewers agree that the article will be of interest for the gene editing community and has the potential to make a significant addition to the field, in particular the delivery aspect in different experimental systems (cell culture, organoids, in vivo tissues). However, the reviewers also are in consensus that the manuscript is in need of additional experimentation to bolster the claims, in particular with regards to HITI and the complex events. Moreover, the authors need to address additional points as detailed below, many of which require only clarification, elaboration and text changes but not additional data. The claim in Figure 4 is not well corroborated and may be dropped unless a pilot screen and its results are presented in the revision. The claim in Figure 8 is not essential and may be dropped for better elaboration in an independent study. Below is a summary of the comments listing the essential revisions required for a revised manuscript.

      Essential Points

      1) The title should be changed. First, the use of in vivo may imply to some that the system is only for animals, which would short-sell its value. Second, given that the FIVER system is essentially the creative but simple use of dual CRISPR/Cas targeting on top of an already existing system (mT/mG Cre reporter mouse), the use of a novel acronym (FIVER) is not merited.

      2) The Abstract should include the fact that this represents a previously-described Cre/Lox reporter repurposed for gene editing analysis. This information is in the text, but more transparency is required. The repurposing of a previously-described LoxP reporter assay for gene editing does not constitute a novel reporter assay. Moreover, the abstract should highlight the gene delivery advance.

      3) The text requires elaboration and comparisons to other delivery approaches in the literature for each tissue examined. For example, it is unclear whether the efficiency of the delivery of Cas/sgRNAs to the retina in Figure 7H is expected based on other studies of gene delivery to this tissue. Namely, is it more difficult to achieve editing in this tissue, compared to introduction of a fluorescent transgene? If this has not been done, it does not need to be done here, but such comparisons with experiments in the literature will reinforce the utility of the approach.

      4) The HDR reporter presentation could be clarified and the assay has some limitations that should be discussed. Figure 1a is difficult to follow, because the repair templates are not shown. It is suggested to show at least the minicircle template and the targeted integration. In general, please ensure that figures are understandable without consulting the text.

      Overall the frequencies of HDR are very low, but this is expected due to the design of the assay. Since two tandem DSBs are induced, NHEJ using the distal DSB ends causes loss of both cut sites, and hence is likely highly favored over terminal repair event. In contrast, most gene editing events involving HDR are induced by a single DSB, for which NHEJ recreates the cut site, and hence is a futile repair event. Namely, HDR is also promoted by the persistent nature of single Cas9-induced DSBs, which are inhibited in this assay by the second DSB. Also, the HDR event here requires a relatively long gene conversion tract, which is also not necessarily the goal of therapeutic gene editing. Accordingly, it is unclear whether this assay will be particularly useful for studying HDR.

      5) It is inaccurate to state this is the first in vivo gene editing reporter for HDR. The DR-GFP mouse was established and used to examine HDR frequencies in mouse tissues over 7 years ago (PMID: 23509290). The text should be changed and this information should be included in the manuscript.

      6) The validation of HITI editing and proposed applications are not at the level of Figures 1-3 validating endjoining or HDR events. NGS or equivalent quantitative techniques should be applied also here. This is especially true for the embryo editing application (Fig. 7) and the independent locus editing correlation (Fig. 8). For example, the very low levels of mosaicity during embryo editing is particularly surprising, since other papers have indicated that this is a major problem by sequence-based methods and phenotypic outcomes (e.g. tyrosinase editing for coat color). Have the authors validated the frequency of BFP+ events that are bona fide integration events at the target locus vs. random integration? This could be done, for example, by showing that the DSB at the target locus is required for BFP+ cells. BFP-minicircle transfections are used as negative controls, but do these also lack the DSB in the BFP-minicircle? Namely, the appropriate negative control should be BFP-minicircle + the DSB cutting the minicircle, but without the DSB cutting the target locus. This requires extensive new validation experiments that are essential for a revision.

      7) To improve the utility of the assay, the "unexpected outcome" of editing at the reporter locus (i.e. +tdTomato/+EGFP) must be investigated further with new experimentation to fully understand the structure of the event and potentially deduce the mechanism(s) leading to it.

      8) Figure 4 and associated main text. The claim that the reporter assay can be used in drug screens is not well supported. Either a proof-of-concept screen should be conducted or otherwise this claim should be removed.

      The small molecule SCR7 that has been reported to target DNA Ligase 4 (Lig4) is discredited (PMID: 27235626) and this should be discussed and indicated in any figure, if this section were to remain in the manuscript.

      9) The claim made in Figure 8 that editing at the reporter locus corresponds to editing at another independent locus requires further evidence. Controls for cutting efficiency are lacking and more loci need to be tested. In the context of the dual-editing (HITI at mT/mG and Zmynd10), it should furthermore be evaluated whether integration of the n.TagBFP occurs at the Zmynd10 locus and vice versa. The comments must be addressed experimentally, if this claim were to remain in the revised manuscript.

    1. Author Response

      Reviewer #1:

      Hutchings et al. report an updated cryo-electron tomography study of the yeast COP-II coat assembled around model membranes. The improved overall resolution and additional compositional states enabled the authors to identify new domains and interfaces--including what the authors hypothesize is a previously overlooked structural role for the SEC31 C-Terminal Domain (CTD). By perturbing a subset of these new features with mutants, the authors uncover some functional consequences pertaining to the flexibility or stability of COP-II assemblies.

      Overall, the structural and functional work appears reliable, but certain questions and comments should be addressed prior to publication. However, this reviewer failed to appreciate the conceptual advance that warrants publication in a general biology journal like eLife. Rather, this study provides a valuable refinement of our understanding of COP-II that I believe is better suited to a more specialized, structure-focused journal.

      We agree that in our original submission our description of the experimental setup, indeed similar to previous work, did not fully capture the novel findings of this paper. Rather than being simply a higher resolution structure of the COPII coat, in fact we have discovered new interactions in the COPII assembly network, and we have probed their functional roles, significantly changing our understanding of the mechanisms of COPII-mediated membrane curvature. In the revised submission we have included additional genetic data that further illuminate this mechanism, and have rewritten the text to better communicate the novel aspects of our work.

      Our combination of structural, functional and genetic analyses goes beyond refining our textbook understanding of the COPII coat as a simple ‘adaptor and cage’: it provides a completely new picture of how dynamic regulation of assembly and disassembly of a complex network leads to membrane remodelling.

      These new insights have important implications for how coat assembly provides structural force to bend a membrane but is still able to adapt to distinct morphologies. These questions are at the forefront of protein secretion, where there is debate about how different types of carriers might be generated that can accommodate cargoes of different size.

      Major Comments: 1) The authors belabor what this reviewer thinks is an unimportant comparison between the yeast reconstruction of the outer coat vertex with prior work on the human outer coat vertex. Considering the modest resolution of both the yeast and human reconstructions, the transformative changes in cryo-EM camera technology since the publication of the human complex, and the differences in sample preparation (inclusion of the membrane, cylindrical versus spherical assemblies, presence of inner coat components), I did not find this comparison informative. The speculations about a changing interface over evolutionary time are unwarranted and would require a detailed comparison of co-evolutionary changes at this interface. The simpler explanation is that this is a flexible vertex, observed at low resolution in both studies, plus the samples are very different.

      We do agree that our proposal that the vertex interface changes over evolutionary time is speculative and we have removed this discussion. We agree that a co-evolutionary analysis will be enlightening here, but is beyond the scope of the current work.

      We respectfully disagree with the reviewer’s interpretation that the difference between the two vertices is due to low resolution. The interfaces are clearly different, and the resolutions of the reconstructions are sufficient to state this. The reviewer’s suggested explanations for the difference in vertex orientation, namely differences in sample such as inclusion of the membrane, cylindrical versus spherical morphology, or presence of inner coat components, were ruled out in our original submission: we resolved yeast vertices on spherical vesicles (in addition to those on tubes) and on membrane-less cages. These analyses clearly showed that neither the presence of a membrane nor the change in geometry (tubular vs. spherical) affects vertex interactions. These experiments are presented in Supplementary Fig 4 (Supplementary Fig. 3 in the original version). Similarly, we discount that differences might be due to the presence or absence of inner coat components, since membrane-less cages were previously solved in both conditions and are no different in terms of their vertex structure (Stagg et al. Nature 2006 and Cell 2008).

      We believe it is important to report on the differences between the two vertex structures. Nevertheless, we have shifted our emphasis to the functional aspects of vertex formation and moved the comparison between the two vertices to the supplement.

      2) As one of the major take home messages of the paper, the presentation and discussion of the modeling and assignment of the SEC31-CTD could be clarified. First, it isn't clear from the figures or the movies if the connectivity makes sense. Where is the C-terminal end of the alpha-solenoid compared to this new domain? Can the authors plausibly account for the connectivity in terms of primary sequence? Please also include a side-by-side comparison of the SRA1 structure and the CTD homology model, along with some explanation of the quality of the model as measured by Modeller. Finally, even if the new density is the CTD, it isn't clear from the structure how this sub-stoichiometric and apparently flexible interaction enhances stability. Hence, when the authors wrote "when the [CTD] truncated form was the sole copy of Sec31 in yeast, cells were not viable, indicating that the novel interaction we detect is essential for COPII coat function." Maybe, but could this statement be a leap too far? Is the putative interaction essential, or is the CTD itself essential for reasons that remain to be fully determined?

      The CTD is separated from the C-terminus of the alpha solenoid domain by an extended domain (~350 amino acids) that is predicted to be disordered, and contains the PPP motifs and catalytic fragment that contact the inner coat. This is depicted in cartoon form in Figures 3A and 7, and discussed at length in the text. This arrangement explains why no connectivity is seen, or expected. We could highlight the C-terminus of the alpha-solenoid domain to emphasize where the disordered region should emerge from the rod, but connectivity of the disordered domain to the CTD could arise from multiple positions, including from an adjacent rod.

      The reviewer’s point about the essentiality of the CTD being independent of its interaction with the Sec31 rod, is an important one. The basis for our model that the CTD enhances stability or rigidity of the coat is the yeast phenotype of Sec31-deltaCTD, which resembles that of a sec13 null. Both mutants are lethal, but rescued by deletion of emp24, which leads to more easily deformable membranes (Čopič et al. Science 2012). We agree that even if this model is true, the interaction of the CTD with Sec31 that our new structure reveals is not proven to drive rigidity or essentiality. We have tempered this hypothesis and added alternative possibilities to the discussion.

      We have included the SRA1 structure in Supplementary Fig 5, as requested, and the model z-score in the Methods. The Z-score, as calculated by the proSA-web server is -6.07 (see figure below, black dot), and falls in line with experimentally determined structures including that of the template (PDB 2mgx, z-score = -5.38).

      3) Are the extra rods discussed in Fig. 4 a curiosity of unclear functional significance? This reviewer is concerned that these extra rods could be an in vitro stoichiometry problem, rather than a functional property of COP-II.

      This is an important point, that, as we state in the paper, cannot be answered at the moment: the resolution is too low to identify the residues involved in the interaction. Therefore we are hampered in our ability to assess the physiological importance of this interaction. We still believe the ‘extra’ rods are an important observation, as they clearly show that another mode of outer coat interaction, different from what was reported before, is possible.

      The concern that interactions visualised in vitro might not be physiologically relevant is broadly applicable to structural biology approaches. However, our experimental approach uses samples that result from active membrane remodelling under near-physiological conditions, and we therefore expect these to be less prone to artefacts than most in vitro reconstitution approaches, where proteins are used at high concentrations and in high salt buffer conditions.

      4) The clashscore for the PDB is quite high--and I am dubious about the reliability of refining sidechain positions with maps at this resolution. In addition to the Ramachandran stats, I would like to see the Ramachandran plot as well as, for any residue-level claims, the density surrounding the modeled side chain (e.g. S742).

      The clashscore is 13.2, which, according to MolProbity, is in the 57th percentile for all structures and in the 97th for structures of similar resolutions. We would argue therefore that the clashscore is rather low. In fact, the model was refined from crystal structures previously obtained by other groups, which had a worse clashscore (17), despite being at higher resolution. Our refinement has therefore improved the clashscore. During refinement we have chosen restraint levels appropriate to the resolution of our map (Afonine et al., Acta Cryst D 2018).

      The Ramachandran plot is copied here and could be included in a supplemental figure if required. We make only one residue-level claim (S742), the density for which is indeed not visible at our resolution. We claim that S742 is close to the Sec23-23 interface, and do not propose any specific interactions. Nevertheless we have removed reference to S742 from the manuscript. We included this specific information because of the potential importance of this residue as a site of phosphorylation, thereby putting this interface in broader context for the general eLife reader.

      [Figure: Ramachandran plot of the refined model]

      Minor Comments:

      1) The authors wrote "To assess the relative positioning of the two coat layers, we analysed the localisation of inner coat subunits with respect to each outer coat vertex: for each aligned vertex particle, we superimposed the positions of all inner coat particles at close range, obtaining the average distribution of neighbouring inner coat subunits. From this 'neighbour plot' we did not detect any pattern, indicating random relative positions. This is consistent with a flexible linkage between the two layers that allows adaptation of the two lattices to different curvatures (Supplementary Fig 1E)." I do not understand this claim, since the pattern both looks far from random and the interactions depend on molecular interactions that are not random. Please clarify.

      We apologize for the confusion: the patterns of the two individual coats are not random. Our sentence refers to the positions of the inner and outer coats relative to each other. The two lattices have different parameters and the two layers are linked by flexible linkers (the 350 amino acids referred to above). We have now clarified the sentence.
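      As an illustration only, the 'neighbour plot' described above could be computed along the following lines. This is a minimal sketch, not the authors' code: the function name, the array layout, the rotation convention (matrices mapping the vertex reference frame into the tomogram frame) and the scaling by the maximum bin count are all assumptions made for the example.

```python
import numpy as np

def neighbour_plot(vertex_pos, vertex_rot, inner_pos, max_dist, bins=8):
    """Accumulate inner coat positions in the frame of each aligned vertex.

    vertex_pos : (N, 3) vertex coordinates in the tomogram frame
    vertex_rot : (N, 3, 3) rotations mapping vertex frame -> tomogram frame
                 (assumed convention for this sketch)
    inner_pos  : (M, 3) inner coat particle coordinates in the tomogram frame
    max_dist   : only neighbours within this distance of a vertex are counted
    Returns a (bins, bins) 2D histogram in the vertex x/y plane, scaled to [0, 1].
    """
    edges = np.linspace(-max_dist, max_dist, bins + 1)
    hist = np.zeros((bins, bins))
    for p, R in zip(vertex_pos, vertex_rot):
        rel = inner_pos - p                               # offsets from this vertex
        near = rel[np.linalg.norm(rel, axis=1) <= max_dist]
        local = near @ R                                  # row-wise R.T @ v: into the vertex frame
        h, _, _ = np.histogram2d(local[:, 0], local[:, 1], bins=[edges, edges])
        hist += h
    peak = hist.max()
    # Scaling by the peak is one possible way to map counts onto 0..1.
    return hist / peak if peak > 0 else hist
```

      A flat histogram would indicate no preferred relative position of the two layers, whereas distinct peaks would indicate a fixed register between them.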

      2) Related to major point #1, the author wrote "We manually picked vertices and performed carefully controlled alignments." I do not know what it means to carefully control alignments, and fear this suggests human model bias.

      We used different starting references for the alignments, with the precise aim to avoid model bias. For both vesicle and cage vertex datasets, we have aligned the subtomograms against either the vertex obtained from tubules, or the vertex from previously published membrane-less cages. In all cases, we retrieved a structure that resembles the one on tubules, suggesting that the vertex arrangement we observe isn’t simply the result of reference bias. This procedure is depicted in Supplementary Fig 4 (Supplementary Fig. 3 in the original manuscript), but we have now clarified it also in the methods section.

      3) Why do some experiments use EDTA? I may be confused, but I was surprised to see the budding reaction employed 1mM GMPPNP, and 2.5mM EDTA (but no Magnesium?). Also, for the budding reaction, please replace or expand upon "the 10% GUV (v/v)" with a mass or molar lipid-to-protein ratio.

      We regret the confusion. As stated in the methods, all our budding reactions are performed in the presence of EDTA and Magnesium, which is present in the buffer (at 1.2 mM). The reason is to facilitate nucleotide exchange, as reported and validated in Bacia et al., Scientific Reports 2011.

      Lipids in GUV preparations are difficult to quantify. We report the stock concentrations used, but in each preparation the amount of dry lipid that forms GUVs might be different, as is the concentration of GUVs after hydration. However, since we analyse reactions where COPII proteins have bound and remodelled individual GUVs, we do not believe the protein/lipid ratio influences our structures.

      4) Please cite the AnchorMap procedure.

      We cite the SerialEM software, and are not aware of other citations specifically for the anchor map procedure.

      5) Please edit for typos (focussing, functionl, others)

      Done

      Reviewer #2:

      The manuscript describes new cryo-EM, biochemistry, and genetic data on the structure and function of the COPII coat. Several new discoveries are reported including the discovery of an extra density near the dimerization region of Sec13/31, and "extra rods" of Sec13/31 that also bind near the dimerization region. Additionally, they showed new interactions between the Sec31 C-terminal unstructured region and Sec23 that appear to bridge multiple Sec23 molecules. Finally, they increased the resolution of the Sec23/24 region of their structure compared to their previous studies and were able to resolve a previously unresolved L-loop in Sec23 that makes contact with Sar1. Most of their structural observations were nicely backed up with biochemical and genetic experiments which give confidence in their structural observations. Overall the paper is well-written and the conclusions justified.

      However, this is the third iteration of structure determination of the COPII coat on membrane with essentially the same preparation and methods. Each time, there has been an incremental increase in resolution and new discoveries, but the impact of the present study is deemed to be modest. The science is good, but it may be more appropriate for a more specialized journal. Areas of specific concern are described below.

      As described above, we respectfully disagree with this interpretation of the advance made by the current work. This work improves on previous work in many aspects. The resolution of the outer coat increases from over 40 Å to 10-12 Å, allowing visualisation of features that were not previously resolved, including a novel vertex arrangement, the Sec31 CTD, and the outer coat ‘extra rods’. An improved map of the inner coat also allows us to resolve the Sec23 ‘L-loop’. We would argue that these are not just extra details, but correspond to a suite of novel interactions that expand our understanding of the complex COPII assembly network. Moreover, we include biochemical and genetic experiments that not only back up our structural observations but bring new insights into COPII function. As pointed out in response to reviewer 1, we believe our work contributes a significant conceptual advance, and have modified the manuscript to convey this more effectively.

      1) The abstract is vague and should be re-written with a better description of the work.

      We have modified the abstract to specifically outline what we have done and the major new discoveries of this paper.

      2) Line 166 - "Surprisingly, this mutant was capable of tubulating GUVs". This experiment gets to one of the fundamental unknown questions in COPII vesiculation. It is not clear what components are driving the membrane remodeling and at what stages during vesicle formation. Isn't it possible that the tubulation activity the authors observe in vitro is not being driven at all by Sec13/31 but rather Sec23/24-Sar1? Their Sec31ΔCTD data supports this idea because it lacks a clear ordered outer coat despite making tubules. An interesting experiment would be to see if tubules form in the absence of all of Sec13/31 except the disordered domain of Sec31 that the authors suggest crosslinks adjacent Sec23/24s.

      This is an astute observation, and we agree with the reviewer that the source of membrane deformation is not fully understood. We favour the model that budding is driven significantly by the Sec23-24 array. To further support this, we have performed a new experiment, where we expressed Sec31ΔN in yeast cells lacking Emp24, which have more deformable membranes and are tolerant to the otherwise lethal deletion of Sec13. While Sec31ΔN in a wild type background did not support cell viability, this was rescued in a Δemp24 yeast strain, strongly supporting the hypothesis that a major contributor to membrane remodelling is the inner coat, with the outer coat becoming necessary to overcome membrane bending resistance that ensues from the presence of cargo. We now include these results in Figure 1.

      However, we must also take into account the results presented in Fig. 6, where we show that weakening the Sec23-24 interface still leads to budding, but only if Sec13-31 is fully functional, and that in this case budding leads to connected pseudo-spherical vesicles rather than tubes. When Sec13-31 assembly is also impaired, tubes appear unstructured. We believe this strongly supports our conclusions that both inner and outer coat interactions are fundamental for membrane remodelling, and it is the interplay between the two that determines membrane morphology (i.e. tubes vs. spheres).

      To dissect the roles of inner and outer coats even further, we have done the experiment that the reviewer suggests: we expressed Sec31(768-1114), but the protein was not well-behaved and co-purified with chaperones. We believe the disordered domain aggregates when not scaffolded by the structured elements of the rod. Nonetheless, we used this fragment in a budding reaction, and could not see any budding. We did not include this experiment as it was inconclusive: the lack of functionality of the purified Sec31 fragment could be attributed to the inability of the disordered region to bind its inner coat partner in the absence of the scaffolding Sec13-31 rod. As an alternative approach, we have used a version of Sec31 that lacks the CTD, and harbours a His tag at the N-terminus (known from previous studies to partially disrupt vertex assembly). We think this construct is more likely to be near native, since both modifications on their own lead to functional protein. We could detect no tubulation with this construct by negative stain, while both control constructs (Sec31ΔCTD and Nhis-Sec31) gave tubulation. This suggests that the cross-linking function of Sec31 is not sufficient to tubulate GUV membranes, but some degree of functional outer coat organisation (either mediated by N- or C-terminal interactions) is needed. It is also possible that the lack of outer coat organisation might lead to less efficient recruitment to the inner coat and cross-linking activity. We have added this new observation to the manuscript.

      3) Line 191 - "Inspecting cryo-tomograms of these tubules revealed no lozenge pattern for the outer coat" - this phrasing is vague. The reviewer thinks that what they mean is that there is a lack of order for the Sec13/31 layer. Please clarify.

      The reviewer is correct, we have changed the sentence.

      4) Line 198 - "unambiguously confirming this density corresponds to the CTD." This only confirms that it is the CTD if that were the only change and the Sec13/31 lattice still formed. Another possibility is that it is density from other Sec13/31 that only appears when the lattice is formed such as the "extra rods". One possibility is that the density is from the extra rods. The reviewer agrees that their interpretation is indeed the most likely, but it is not unambiguous. The authors should consider cross-linking mass spectrometry.

      We have removed the word ‘unambiguously’, and changed to ‘confirming that this density most likely corresponds to the CTD’. Nonetheless, we believe that our interpretation is correct: the extra rods bind to a different position, and themselves also show the CTD appendage. In this experiment, the lack of the CTD was the only biochemical change.

      5) In the Sec31ΔCTD section, the authors should comment on why ΔCTD is so deleterious to oligomer organization in yeast when cages form so abundantly in preparations of human Sec13/31 ΔC (Paraan et al 2018).

      We have added a comment to address this. “Interestingly, human Sec31 proteins lacking the CTD assemble in cages, indicating that either the vertex is more stable for human proteins and sufficient for assembly, or that the CTD is important in the context of membrane budding but not for cage formation in high salt conditions.”

      6) The data is good for the existence of the "extra rods", but significance and importance of them is not clear. How can these extra densities be distinguished from packing artifacts due to imperfections in the helical symmetry.

      Please also see our response to point 3 from reviewer 1. Regarding the specific concern that artefacts might be a consequence of imperfection in the helical symmetry, we would argue such imperfections are indeed expected in physiological conditions, and to a much higher extent. For this reason interactions seen in the context of helical imperfections are likely to be relevant. In fact, in normal GTP hydrolysis conditions, we expect long tubes would not be able to form, and the outer coat to be present on a wide range of continuously changing membrane curvatures. We think that the ability of the coat to form many interactions when the symmetry is imperfect might be exactly what confers the coat its flexibility and adaptability.

      7) Figure 5 is very hard to interpret and should be redone. Panels B and C are particularly hard to interpret.

      We have made a new figure where we think clarity is improved.

      8) The features present in Sec23/24 structure do not reflect the reported resolution of 4.7 Å. It seems that the resolution is overestimated.

      We report an average resolution of 4.6 Å. In most of our map we can clearly distinguish beta strands, follow the twist of alpha helices and see bulky side chains. These features typically become visible at 4.5-5 Å resolution. We agree that some areas are worse than 4.6 Å, as typically expected for such a flexible assembly, but we believe that the average resolution value reported is accurate. We obtained the same resolution estimate using different software including RELION, Phenix and Dynamo, so that is really the best value we can provide. To further convince ourselves that we have the resolution we claim, we sampled EM maps from the EMDB with the same stated resolution (we just took the 7 most recent ones which had an associated atomic model), and visualised their features at arbitrary positions. For both beta strands and alpha helices, we do not feel our map looks any worse than the others we have examined. We include a figure here.

      [Figure: comparison of map features with EMDB maps at the same stated resolution]

      9) Lines 315/316 - "We have combined cryo-tomography with biochemical and genetic assays to obtain a complete picture of the assembled COPII coat at unprecedented resolution (Fig. 7)"

      10) Figure 7 is a schematic model/picture; the authors should reference a different figure or rephrase the sentence.

      We now refer to Fig 7 in a more appropriate place.

      Reviewer #3:

      The manuscript by Hutchings et al. describes several previously uncharacterised molecular interactions in the coats of COP-II vesicles by using reconstituted coats of yeast COP-II. They have improved the resolution of the inner coat to 4.7 Å by tomography and subtomogram averaging, revealing detailed interactions, including those made by the so-called L-loop not observed before. Analysis of the outer layer also led to new interesting discoveries. The Sec31 CTD was assigned in the map by comparing the WT and deletion mutant STA-generated density maps. It seems to stabilise the COP-II coats and further evidence from yeast deletion mutants and microsome budding reconstitution experiments suggests that this stabilisation is required in vitro. Furthermore, COP-II rods that cover the membrane tubules in a right-handed manner sometimes revealed an extra rod, which is not part of the canonical lattice, bound to them. The binding mode of these extra rods (which I refer to here as a Y-shape) is different from the canonical two-fold symmetric vertex (X-shape). When the same binding mode is utilized on both sides of the extra rod (Y-Y) the rod seems to simply insert in the canonical lattice. However, when the Y-binding mode is utilized on one side of the rod and the X-binding mode on the other side, this leads to bridging different lattices together. This potentially contributes to increased flexibility in the outer coat, which may be required to adopt different membrane curvatures and shapes with different cargos. These observations build a picture where stabilising elements in both COP-II layers contribute to functional cargo transport. The paper makes significant novel findings that are described well. Technically the paper is excellent and the figures nicely support the text. I have only minor suggestions that I think would improve the text and figure.

      We thank the reviewer for helpful suggestions which we agree improve the manuscript.

      Minor Comments:

      L 108: "We collected .... tomograms". While the meaning is clear to a specialist, this may sound somewhat odd to a generic reader. Perhaps you could say "We acquired cryo-EM data of COP-II induced tubules as tilt series that were subsequently used to reconstruct 3D tomograms of the tubules."

      We have changed this as suggested

      L 114: "we developed an unbiased, localisation-based approach". What is the part that was developed here? It seems that the inner layer particle coordinates were simply shifted to get starting points in the outer layer. Developing an approach sounds more substantial than this. Also, it's unclear what is unbiased about this approach. The whole point is that it's biased to certain regions (which is a good thing as it incorporates prior knowledge on the location of the structures).

      We have modified the sentence to “To target the sparser outer coat lattice for STA, we used the refined coordinates of the inner coat to locate the outer coat tetrameric vertices”, and explain the approach in detail in the methods.

      L 124: "The outer coat vertex was refined to a resolution of approximately ~12 A, revealing unprecedented detail of the molecular interactions between Sec31 molecules (Supplementary Fig 2A)". The map alone does not reveal molecular interactions; the main understanding comes from fitting of X-ray structures to the low-resolution map. Also "unprecedented detail" itself is somewhat problematic as the map of Noble et al (2013) of the Sec31 vertex is also at nominal resolution of 12 A. Furthermore, Supplementary Fig 2A does not reveal this "unprecedented detail", it shows the resolution estimation by FSC. To clarify these points, you could say: "Fitting of the Sec31 atomic model to our reconstruction vertex at 12-A resolution (Supplementary Fig 2A) revealed the molecular interactions between different copies of Sec31 in the membrane-assembled coat."

      We have changed the sentence as suggested.

      L 150: Can the authors exclude the possibility that the difference is due to differences in data processing? E.g. how the maps’ amplitudes have been adjusted?

      Yes, we can exclude this scenario by measuring distances between vertices in the right and left handed direction. These measurements are only compatible with our vertex arrangement, and cannot be explained by the big deviation from 4-fold symmetry seen in the membrane-less cage vertices.

      L 172: "that wrap tubules either in a left- or right-handed manner". Don't they always do both on each tubule? Now this sentence could be interpreted to mean that some tubules have a left-handed coat and some a right-handed coat.

      We have changed this sentence to clarify. “Outer coat vertices are connected by Sec13-31 rods that wrap tubules both in a left- and right-handed manner.”

      L276: "The difference map" hasn't been introduced earlier but is referred to here as if it has been.

      We now introduce the difference map.

      L299: Can "Secondary structure predictions" denote a protein region "highly prone to protein binding"?

      Yes, this is done through DISOPRED3, a feature included in the PSIPRED server we used for our predictions. The reference is: Jones D.T., Cozzetto D. DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics. 2015; 31:857–863. We have now added this reference to the manuscript.

      L316: It's true that the detail in the map of the inner coat is unprecedented and the model presented in Figure 7 is partially based on that. But here "unprecedented resolution" sounds strange as this sentence refers to a schematic model and not a map.

      We have changed this by moving the reference to Fig 7 to a more appropriate place

      L325: "have 'compacted' during evolution" -> remove. It's enough to say it's more compact in humans and less compact in yeast as there could have been different adaptations in different organisms at this interface.

      We have changed as requested. See also our response to reviewer 1, point 1.

      L327: What's exactly meant by "sequence diversity or variability at this density".

      We have now clarified: “Since multiple charge clusters in yeast Sec31 may contribute to this interaction interface (Stancheva et al., 2020), the low resolution could be explained by the fact that the density is an average of different sequences.”

      L606-607: The description of this custom data processing approach is difficult to follow. Why is in-plane flip needed and how is it used here?

      Initially, particles are picked ignoring tube directionality (as this cannot be assessed easily from the tomograms due to the pseudo-twofold symmetry of the Sec23/24/Sar1 trimer). So the in-plane rotation of each inner coat subunit could be near 0 or 180°. For each tube, both angles are sampled (in-plane flip). Most tubes result in the majority of particles being assigned one of the two orientations (which is then assumed as the tube directionality). Particles that do not conform are removed, and rare tubes where directionality cannot be determined are also removed. We have re-written the description to clarify these points: “Initial alignments were conducted on a tube-by-tube basis using the Dynamo in-plane flip setting to search in-plane rotation angles 180° apart. This allowed to assign directionality to each tube, and particles that were not conforming to it were discarded by using the Dynamo dtgrep_direction command in custom MATLAB scripts”.
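      The majority-vote logic described in this response can be sketched as follows. This is a plain Python stand-in and does not use Dynamo or reproduce the authors' MATLAB scripts; the function name, the input layout and the `min_majority` threshold for deciding when a tube's directionality "cannot be determined" are hypothetical choices made for the example.

```python
from collections import Counter

def assign_tube_directionality(particles, min_majority=0.6):
    """Assign a direction to each tube by majority vote and filter particles.

    particles    : list of (tube_id, flip) pairs, where flip is the in-plane
                   orientation (0 or 180 degrees) found during alignment
    min_majority : hypothetical threshold; tubes whose dominant orientation
                   covers a smaller fraction of particles are dropped
    Returns (kept_particles, direction_by_tube).
    """
    by_tube = {}
    for tube_id, flip in particles:
        by_tube.setdefault(tube_id, []).append(flip)

    direction = {}
    for tube_id, flips in by_tube.items():
        winner, count = Counter(flips).most_common(1)[0]
        if count / len(flips) >= min_majority:
            direction[tube_id] = winner   # tube directionality determined
        # otherwise the tube is left out, so all of its particles are discarded

    # Keep only particles that agree with their tube's assigned direction.
    kept = [(t, f) for t, f in particles if direction.get(t) == f]
    return kept, direction
```

      In this sketch a tube with four particles at 0° and one at 180° is assigned 0° and loses the single non-conforming particle, while a tube split evenly between the two orientations is removed entirely.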

      L627: "Z" here refers to the coordinate system of aligned particles not that of the original tomogram. Perhaps just say "shifted 8 pixels further away from the membrane".

      Changed as requested.

      L642-643: How can the "left-handed" and "right-handed" rods be separated here? These terms refer to the long-range organisation of the rods in the lattice; it's not clear how they were separated in the early alignments.

      They are separated by picking only one subset using the Dynamo sub-boxing feature. This extracts boxes from the tomogram which are in set positions and orientations relative to the average of previously aligned subtomograms. From the average vertex structure, we sub-box rods at 4 different positions that correspond to the centre of the rods, and the 2-fold symmetric pairs are combined into the same dataset. We have clarified this in the text: “The refined positions of vertices were used to extract two distinct datasets of left and right-handed rods respectively using the dynamo sub-boxing feature.”
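      The geometry behind sub-boxing can be illustrated with a short sketch: each refined vertex contributes new box centres at fixed offsets defined in the frame of the average vertex. This is not the Dynamo implementation; the function name, the rotation convention (vertex frame to tomogram frame) and the offset values in the usage note are assumptions for illustration.

```python
import numpy as np

def sub_box_positions(vertex_pos, vertex_rot, offsets):
    """Compute sub-box centres in tomogram coordinates.

    vertex_pos : (N, 3) refined vertex positions in the tomogram frame
    vertex_rot : (N, 3, 3) rotations mapping vertex frame -> tomogram frame
                 (assumed convention for this sketch)
    offsets    : (K, 3) positions of the rod centres in the frame of the
                 average vertex (hypothetical values supplied by the caller)
    Returns an (N, K, 3) array: one centre per vertex and per offset.
    """
    # For each vertex n and offset k: centre = position_n + R_n @ offset_k
    rotated = np.einsum("nij,kj->nki", vertex_rot, offsets)
    return vertex_pos[:, None, :] + rotated
```

      Passing the two offsets related by the 2-fold axis (for example `[d, 0, 0]` and `[-d, 0, 0]` for some hypothetical distance `d`) yields one combined dataset for rods of one handedness; the other pair of offsets yields the second dataset.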

      Figure 2B. It's difficult to see the difference between dark and light pink colours.

      We have changed colours to enhance the difference.

      Figure 3C. These panels report the relative frequency of neighbouring vertices at each position; "intensity" does not seem to be the right measure for this. You could say that the colour bar indicates the "relative frequency of neighbouring vertices at each position" and add detail how the values were scaled between 0 and 1. The same applies to SFigure 1E.

      Changed as requested.

      Figure 4. The COP-II rods themselves are relatively straight, and they are not left-handed or right-handed. Here, more accurate would be "architecture of COPII rods organised in a left-handed manner". (In the text the authors may of course define and then use this shorter expression if they so wish.) Panel 4B top panel could have the title "left-handed" and the lower panel should have the title "right-handed" (for consistency and clarity).

      We have now defined left- and right-handed rods in the text, and have changed the figure and panel titles as requested.

    2. Reviewer #3:

      The manuscript by Hutchings et al. describes several previously uncharacterised molecular interactions in the coats of COP-II vesicles by using reconstituted coats of yeast COP-II. They have improved the resolution of the inner coat to 4.7 Å by tomography and subtomogram averaging, revealing detailed interactions, including those made by the so-called L-loop not observed before. Analysis of the outer layer also led to new interesting discoveries. The Sec31 CTD was assigned in the map by comparing the WT and deletion mutant STA-generated density maps. It seems to stabilise the COP-II coats and further evidence from yeast deletion mutants and microsome budding reconstitution experiments suggests that this stabilisation is required in vitro. Furthermore, COP-II rods that cover the membrane tubules in a right-handed manner sometimes revealed an extra rod, which is not part of the canonical lattice, bound to them. The binding mode of these extra rods (which I refer to here as a Y-shape) is different from the canonical two-fold symmetric vertex (X-shape). When the same binding mode is utilized on both sides of the extra rod (Y-Y) the rod seems to simply insert in the canonical lattice. However, when the Y-binding mode is utilized on one side of the rod and the X-binding mode on the other side, this leads to bridging different lattices together. This potentially contributes to increased flexibility in the outer coat, which may be required to adopt different membrane curvatures and shapes with different cargos. These observations build a picture where stabilising elements in both COP-II layers contribute to functional cargo transport. The paper makes significant novel findings that are described well. Technically the paper is excellent and the figures nicely support the text. I have minor suggestions that I think would improve the text and figures.

      L 108: "We collected .... tomograms". While the meaning is clear to a specialist, this may sound somewhat odd to a generic reader. Perhaps you could say "We acquired cryo-EM data of COP-II induced tubules as tilt series that were subsequently used to reconstruct 3D tomograms of the tubules."

      L 114: "we developed an unbiased, localisation-based approach". What is the part that was developed here? It seems that the inner layer particle coordinates were simply shifted to get starting points in the outer layer. Developing an approach sounds more substantial than this. Also, it's unclear what is unbiased about this approach. The whole point is that it's biased to certain regions (which is a good thing as it incorporates prior knowledge on the location of the structures).

      L 124: "The outer coat vertex was refined to a resolution of approximately ~12 A, revealing unprecedented detail of the molecular interactions between Sec31 molecules (Supplementary Fig 2A)". The map alone does not reveal molecular interactions; the main understanding comes from fitting of X-ray structures to the low-resolution map. Also "unprecedented detail" itself is somewhat problematic as the map of Noble et al (2013) of the Sec31 vertex is also at nominal resolution of 12 A. Furthermore, Supplementary Fig 2A does not reveal this "unprecedented detail", it shows the resolution estimation by FSC. To clarify these points, you could say: "Fitting of the Sec31 atomic model to our reconstruction vertex at 12-A resolution (Supplementary Fig 2A) revealed the molecular interactions between different copies of Sec31 in the membrane-assembled coat."

      L 150: Can the authors exclude the possibility that the difference is due to differences in data processing? E.g. how the maps’ amplitudes have been adjusted?

      L 172: "that wrap tubules either in a left- or right-handed manner". Don't they always do both on each tubule? Now this sentence could be interpreted to mean that some tubules have a left-handed coat and some a right-handed coat.

      L276: "The difference map" hasn't been introduced earlier but is referred to here as if it has been.

      L299: Can "Secondary structure predictions" denote a protein region "highly prone to protein binding"?

      L316: It's true that the detail in the map of the inner coat is unprecedented and the model presented in Figure 7 is partially based on that. But here "unprecedented resolution" sounds strange as this sentence refers to a schematic model and not a map.

      L325: "have 'compacted' during evolution" -> remove. It's enough to say it's more compact in humans and less compact in yeast as there could have been different adaptations in different organisms at this interface.

      L327: What's exactly meant by "sequence diversity or variability at this density".

      L606-607: The description of this custom data processing approach is difficult to follow. Why is in-plane flip needed and how is it used here?

      L627: "Z" here refers to the coordinate system of aligned particles not that of the original tomogram. Perhaps just say "shifted 8 pixels further away from the membrane"

      L642-643: How can the "left-handed" and "right-handed" rods be separated here? These terms refer to the long-range organisation of the rods in the lattice; it's not clear how they were separated in the early alignments.

      Figure 2B. It's difficult to see the difference between dark and light pink colours.

      Figure 3C. These panels report the relative frequency of neighbouring vertices at each position; "intensity" does not seem to be the right measure for this. You could say that the colour bar indicates the "relative frequency of neighbouring vertices at each position" and add detail how the values were scaled between 0 and 1. The same applies to SFigure 1E.

      Figure 4. The COP-II rods themselves are relatively straight, and they are not left-handed or right-handed. Here, more accurate would be "architecture of COPII rods organised in a left-handed manner". (In the text the authors may of course define and then use this shorter expression if they so wish.) Panel 4B top panel could have the title "left-handed" and the lower panel should have the title "right-handed" (for consistency and clarity).

    3. Reviewer #2:

      The manuscript describes new cryo-EM, biochemistry, and genetic data on the structure and function of the COPII coat. Several new discoveries are reported including the discovery of an extra density near the dimerization region of Sec13/31, and "extra rods" of Sec13/31 that also bind near the dimerization region. Additionally, they showed new interactions between the Sec31 C-terminal unstructured region and Sec23 that appear to bridge multiple Sec23 molecules. Finally, they increased the resolution of the Sec23/24 region of their structure compared to their previous studies and were able to resolve a previously unresolved L-loop in Sec23 that makes contact with Sar1. Most of their structural observations were nicely backed up with biochemical and genetic experiments which give confidence in their structural observations. Overall the paper is well-written and the conclusions justified. However, this is the third iteration of structure determination of the COPII coat on membrane with essentially the same preparation and methods. Each time, there has been an incremental increase in resolution and new discoveries, but the impact of the present study is deemed to be modest. The science is good and appropriate for a specialized journal. Areas of specific concern are described below.

      1) The abstract should be re-written with a better description of the work.

      2) Line 166 - "Surprisingly, this mutant was capable of tubulating GUVs". This experiment gets to one of the fundamental unknown questions in COPII vesiculation. It is not clear what components are driving the membrane remodeling and at what stages during vesicle formation. Isn't it possible that the tubulation activity the authors observe in vitro is not being driven at all by Sec13/31 but rather Sec23/24-Sar1? Their Sec31ΔCTD data supports this idea because it lacks a clear ordered outer coat despite making tubules. An interesting experiment would be to see if tubules form in the absence of all of Sec13/31 except the disordered domain of Sec31 that the authors suggest crosslinks adjacent Sec23/24s.

      3) Line 191 - "Inspecting cryo-tomograms of these tubules revealed no lozenge pattern for the outer coat" - this phrasing is vague. The reviewer thinks that what they mean is that there is a lack of order for the Sec13/31 layer. Please clarify.

      4) Line 198 - "unambiguously confirming this density corresponds to the CTD." This only confirms that it is the CTD if that were the only change and the Sec13/31 lattice still formed. Another possibility is that the density comes from other Sec13/31 molecules, such as the "extra rods", that only appear when the lattice is formed. The reviewer agrees that their interpretation is indeed the most likely, but it is not unambiguous. The authors should consider cross-linking mass spectrometry.

      5) In the Sec31ΔCTD section, the authors should comment on why ΔCTD is so deleterious to oligomer organization in yeast when cages form so abundantly in preparations of human Sec13/31 ΔC (Paraan et al 2018).

      6) The data supporting the existence of the "extra rods" are good, but their significance is not clear. How can these extra densities be distinguished from packing artifacts due to imperfections in the helical symmetry?

      7) Figure 5 is very hard to interpret and should be redone. Panels B and C are particularly hard to interpret.

      8) The features present in Sec23/24 structure do not reflect the reported resolution of 4.7 Å. It seems that the resolution is overestimated.

      9) Lines 315/316 - "We have combined cryo-tomography with biochemical and genetic assays to obtain a complete picture of the assembled COPII coat at unprecedented resolution (Fig. 7)." Figure 7 is a schematic model/picture; the authors should reference a different figure or rephrase the sentence.

    4. Reviewer #1:

      Hutchings et al. report an updated cryo-electron tomography study of the yeast COP-II coat assembled around model membranes. The improved overall resolution and additional compositional states enabled the authors to identify new domains and interfaces, including what the authors hypothesize is a previously overlooked structural role for the SEC31 C-Terminal Domain (CTD). By perturbing a subset of these new features with mutants, the authors uncover some functional consequences pertaining to the flexibility or stability of COP-II assemblies.

      Overall, the structural and functional work appears reliable, but certain questions and comments should be addressed. This study provides a valuable refinement of our understanding of COP-II that I believe is well suited to a specialized, structure-focused journal.

      Major Comments: 1) The authors belabor the comparison between the yeast reconstruction of the outer coat vertex with prior work on the human outer coat vertex. Considering the modest resolution of both the yeast and human reconstructions, the transformative changes in cryo-EM camera technology since the publication of the human complex, and the differences in sample preparation (inclusion of the membrane, cylindrical versus spherical assemblies, presence of inner coat components), I did not find this comparison informative. The speculations about a changing interface over evolutionary time are unwarranted and would require a detailed comparison of co-evolutionary changes at this interface. The simpler explanation is that this is a flexible vertex, observed at low resolution in both studies, plus the samples are very different.

      2) As one of the major take home messages of the paper, the presentation and discussion of the modeling and assignment of the SEC31-CTD could be clarified. First, it isn't clear from the figures or the movies if the connectivity makes sense. Where is the C-terminal end of the alpha-solenoid compared to this new domain? Can the authors plausibly account for the connectivity in terms of primary sequence? Please also include a side-by-side comparison of the SRA1 structure and the CTD homology model, along with some explanation of the quality of the model as measured by Modeller. Finally, even if the new density is the CTD, it isn't clear from the structure how this sub-stoichiometric and apparently flexible interaction enhances stability. The authors wrote "when the [CTD] truncated form was the sole copy of Sec31 in yeast, cells were not viable, indicating that the novel interaction we detect is essential for COPII coat function." Maybe, but could this statement be a leap too far? Is the putative interaction essential, or is the CTD itself essential for reasons that remain to be fully determined?

      3) Are the extra rods discussed in Fig. 4 a curiosity of unclear functional significance? This reviewer is concerned that these extra rods could be an in vitro stoichiometry problem, rather than a functional property of COP-II.

      4) The clashscore for the PDB is quite high, and I am dubious about the reliability of refining sidechain positions with maps at this resolution. In addition to the Ramachandran stats, I would like to see the Ramachandran plot as well as, for any residue-level claims, the density surrounding the modeled side chain (e.g. S742).

      Minor Comments:

      1) The authors wrote "To assess the relative positioning of the two coat layers, we analysed the localisation of inner coat subunits with respect to each outer coat vertex: for each aligned vertex particle, we superimposed the positions of all inner coat particles at close range, obtaining the average distribution of neighbouring inner coat subunits. From this 'neighbour plot' we did not detect any pattern, indicating random relative positions. This is consistent with a flexible linkage between the two layers that allows adaptation of the two lattices to different curvatures (Supplementary Fig 1E)." I do not understand this claim, since the pattern both looks far from random and the interactions depend on molecular interactions that are not random. Please clarify.

      2) Related to major point #1, the authors wrote "We manually picked vertices and performed carefully controlled alignments." I do not know what it means to carefully control alignments, and fear this suggests human model bias.

      3) Why do some experiments use EDTA? I may be confused, but I was surprised to see the budding reaction employed 1mM GMPPNP, and 2.5mM EDTA (but no Magnesium?). Also, for the budding reaction, please replace or expand upon the "the 10% GUV (v/v)" with a mass or molar lipid-to-protein ratio.

      4) Please cite the AnchorMap procedure.

    5. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on October 2 2020, follows.

      Summary

      There is consensus among the reviewers that this study provides an interesting and important advance in understanding the role of CEBP/b LAP in metabolism and response to a high fat diet challenge. The major discoveries reported here are fairly well supported by the data, including that male uORF KO mice show an increase in fat cell number as opposed to fat cell size, less inflammation, and improved glucose tolerance/insulin sensitivity.

      Essential Revisions

      However, all of the reviewers shared the major concern that it appears that only male mice were studied here, and that this fact - or the rationale for using only male mice - was not clearly articulated within the manuscript. This makes interpretation quite challenging, especially given that the authors previously published that lifespan extension in the uORF KO mice is much more pronounced in female compared to male mice. There was consensus that this is a substantial weakness to the current manuscript which limits its overall impact. It's possible the authors already have this data, and we would need to see inclusion of data supporting similar outcomes for the key experiments in female mice to recommend publication in eLife. If the outcomes are different in males and females, this is likely quite interesting and would need to be developed further.

      The other significant concern was related to the RT-qPCR data, which is indicative but not conclusive support for the authors' conclusions, especially since many of the changes are small in magnitude. It was noted that most of the relevant proteins have ELISAs available, and they all have antibodies which could be used to support the robustness and importance of the small but plausibly important differences observed. IHC against CD68 in the fat depots could be performed and the authors could strengthen their claims about adipose tissue inflammation by measuring the expression levels of inflammatory cytokines in adipose depots.

  6. Sep 2020
    1. Reviewer #3:

      The manuscript by Ishii et al focuses on understanding how cellular dynamics drive the spiral shape of the cochlear duct in mammals. The authors use live imaging of inner ear explants to follow dynamics of interkinetic nuclear migration (IKNM) and ERK activity (using ERK FRET sensor) to track some of the processes that give rise to tissue bending during spiral duct formation. On the imaging side, the manuscript presents a technical tour de force, showing remarkable two photon imaging capabilities that provide insights into the dynamics underlying cochlear extension. These experiments reveal several new observations: (1) Medial epithelial layer (MEL) tends to bend more than the lateral epithelial layer (LEL) despite being more proliferative. (2) That nuclei of cells in the curved region of the cochlea tend to stay in the luminal side, following cell division, rather than migrate back to the basal side. (3) The cells migrate towards the apical lateral roof. (4) That there are orchestrated ERK waves that correlate with cell migration. Based on these observations and on mathematical modeling, the manuscript has two main claims: (1) that nuclear stalling on the luminal side following cell division leads to increased curvature which gives rise to cochlear duct bending, and (2) that multicellular flow mediated by ERK signaling waves pushes cells towards the growing apex, supplying the cells required for luminal expansion. While the observations in the manuscript are certainly interesting, I worry that some of the claims are not sufficiently substantiated, and that the connection between the two observations is rather weak. Here are the detailed concerns:

      Major concerns:

      1) The authors argue that cell cycle arrest results in a decrease in the curvature of the cochlear duct, which supports the hypothesis that luminal nuclear stalling promotes MEL bending. This is fine, but luminal nuclear stalling can be a result and not a cause. Since in a bent region, the basal side is more packed, this density gradient can be the cause of nuclei stalling at the luminal side. The fact that the curvature decreased but was not abolished after cell cycle arrest could suggest that nuclear stalling is not required for bending, but rather reinforces it.

      2) Since the authors discuss both cell proliferation and nuclear stalling, and cell migration, as forces that can drive bending and coiling, it is hard to interpret the results of the mitomycin C experiment. Could it be that the tissue is less curved because there are fewer cells to supply the elongating tissue rather than less nuclear stalling? The authors should consider inhibiting either cell migration or the cytoskeletal machinery required for IKNM to dissect these effects.

      3) The authors present a mathematical model to demonstrate that nuclear stalling in the luminal side results in bending. To model nuclear motion they use a parameter, gamma, which controls the degree of basalward movement after IKNM. Modeling in such a way means that, for any value other than gamma=1, the nucleus never fully returns to the basal side. If I understand correctly, this is not the case, as even the nuclei that stall at the luminal side eventually return to the basal side.

      4) Furthermore, for luminal nuclear stalling, the authors tracked only the nuclei of dividing cells. This makes the data in Fig 2D' much clearer. However, in their model the authors show only these nuclei and not all nuclei. In addition, they show many crowded nuclei in the model, yet this is not observed in the images provided in the manuscript. Therefore, it seems the model does not represent the morphology of the tissue properly. The authors should model the process with non-dividing cells at the basal side.

      5) In lines 250-252 the authors claim that the higher volumetric growth measured at the MEL should cause an opposite curvature relative to the innate one. This is true if EdU intensity is proportional to volumetric growth, but cells in the MEL and LEL may not be the same size. For example, cells in the MEL could be smaller than cells in the LEL. The authors should therefore measure the nuclei number density and the volumetric cell density to clarify this. If the number density of the nuclei is indeed higher at the MEL, it may also explain the higher structural integrity of the MEL relative to the LEL demonstrated in figure 1C.

      6) The authors show the effect of ERK inhibition on tissue flow speed. This is a very important observation and raises several important questions. What is the effect of ERK inhibition on curvature? On tissue length? On proliferation? These will provide a more complete understanding of the effect of ERK inhibition.

      7) The authors should also test the effect of mitomycin C on cell flow and ERK activity. As mentioned above, it is not clear whether the effect of mitomycin C is a result of less nuclear stalling or perhaps fewer cells that flow towards the apex.

      8) In Figure 3 the authors analyze the EdU distribution over the cochlear duct. This analysis is done using the maximum intensity projection of the stack. It seems that a more accurate way to quantify would be to use the summed intensity image rather than the maximum intensity image. This may reveal additional details that were missed by throwing away all other layers except the one at maximum intensity.

      9) In Figure 4 the colors used for the ERK activity analysis are very hard to see for color-blind people. It would be easier for this audience if the authors changed one of these colors to green/red/yellow.
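
      To illustrate the distinction raised in concern 8, the following is a minimal sketch of how a maximum intensity projection and a summed intensity projection can differ on the same stack. The toy stack, its values, and the function names are assumptions made for illustration only; they are not the authors' data or analysis pipeline.

```python
def max_projection(stack):
    """Maximum intensity projection: keep only the brightest z-plane value per pixel."""
    height, width = len(stack[0]), len(stack[0][0])
    return [[max(plane[y][x] for plane in stack) for x in range(width)]
            for y in range(height)]


def sum_projection(stack):
    """Summed intensity projection: add the values of all z-planes per pixel."""
    height, width = len(stack[0]), len(stack[0][0])
    return [[sum(plane[y][x] for plane in stack) for x in range(width)]
            for y in range(height)]


# Hypothetical stack indexed as stack[z][y][x]: 3 planes, 1 row, 2 pixels.
# The left pixel is labelled in one plane only; the right pixel in all three.
toy_stack = [
    [[5, 5]],
    [[0, 5]],
    [[0, 5]],
]

max_image = max_projection(toy_stack)  # both pixels look identical
sum_image = sum_projection(toy_stack)  # the right pixel carries three times the signal
```

      In this toy case the maximum projection cannot separate a pixel labelled in a single plane from one labelled throughout the depth of the stack, whereas the summed projection can.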

    2. Reviewer #2:

      The paper by Hirashima and colleagues shows some interesting cellular mechanisms they conclude drive the spiraling and outgrowth of the mammalian cochlea. The two cellular mechanisms they propose are supported by experiments and modeling. The spiraling ERK wave and the contrasting movement of lateral cells was very intriguing. However, the ERK wave and lateral cell movements seem disconnected from the bending forces discussed. Are the authors saying that the ERK mediated lateral cell movements are important for cochlear growth while the MEL is important for the bending? The two mechanisms they discuss seem insufficient to explain all of cochlear spiraling. Other cellular mechanisms such as cell proliferation and convergent extension are mentioned but their roles are not incorporated into their discussion. Are they not required? How do they complement their results?

      1) While the authors talk about bending forces, the paper has no measurements of the forces generated by different tissues. I also feel there are other cellular mechanisms that are mentioned but never incorporated into their proposed explanation for duct coiling such as convergent extension and actomyosin based basal shrinkage. Proliferation is discussed quite a bit but seems to be dismissed as a force. In the introduction they mention how Shh mediated proliferation is required for duct elongation while Fgf10 null mutants have a shortened duct yet normal proliferation. So what is the role for proliferation? Maybe they can answer this in the context of their interesting observation that there is more proliferation in the roof than the floor which would be predicted to bend the cochlea along that axis. When combined with the medial lateral bending could these two forces result in the spiraling? It also seems like this differential proliferation between the floor and roof was in more than just the epithelium, correct? Could the cartilaginous capsule around the duct guide the bending as well? In their culture experiments, if too much of the capsule was removed then normal duct development was disrupted.

      2) Their demonstration that the bending forces are in the medial half is interesting but the only tissue whose mechanism is studied is the MEL. Could convergent extension in other medial tissues such as the prosensory domain (which Wang et al. showed was occurring in this tissue) and surrounding mesenchyme be the main force generator for the bending of the medial half of the cochlear duct? Does the MEL cultured by itself bend? They say that cell intercalation can drive ductal elongation but not bending (line 83) but can't convergent extension occur asymmetrically in the tissue? Such as by occurring in the overlying medial mesenchyme but not in the medial epithelium. It should be noted that the bending by the epithelium does not have to provide high forces as long as the force provided by other tissues are similar across the medial lateral axis, the bending in the epithelium could bias the mass of tissue to bend.

      3) The mathematical modeling for the luminal bending is less convincing than the mathematical modeling for the ERK and Cell flow coupling. The simulated curves in Fig. 2K are quite different from the Experimental measure in Fig. 2M, especially for the Mitomycin C condition. I feel that the values plugged in for the Numerical simulation, the standard parameter set were not well justified. What happened to the simulations as these values changed? Was the parameter space for acceptable values broad? In contrast the parameters for the numerical simulation of the ERK activation waves and cell flows were well justified. The parameters chosen might explain the big differences between simulation and experimental in figure 2.

      4) For the cell tracking experiments in the lateral region the resolution was 4-5 cells. The resulting cell flow patterns were very interesting but why didn't the authors track single cells? Segmenting individual cells via cytoplasmic labeling is much trickier but the nuclei are identifiable and the Imaris software they used in the paper has a cell tracking feature for such labeling. I would think that individual cell movements might provide more insights. In line 303 they say they can see cell contractions which I assume is for individual cells? How were cell contractions identified? Movie 5 was excellent and very informative. Do the cell flows correlate at all with the proliferation seen with Edu staining?

    3. Reviewer #1:

      This is a fascinating manuscript that explores for the first time the potential mechanisms underlying cochlear morphogenesis. The authors have used a combination of modeling, beautiful imaging and ERK-FRET reporter mice to suggest at least two processes may be at play in cochlear shaping - differential interkinetic nuclear migration and a cellular flow that appears to correlate with ERK activation.

      I have no major concerns with this lovely piece of work. The imaging and quantification are meticulous, and the observations made by the authors are novel and will be of great interest to cell biologists interested in morphogenesis, not just aficionados of the inner ear.

      The one suggestion I would make is for the authors to clarify the relationship between cell proliferation and ERK activation. When they reference the inner ear literature, they point out that FGF pathway mutants have deficient cochlear morphogenesis and proliferation, and they hypothesize that FGF-induced ERK activation may be responsible for their propagating waves. However, they also reference work suggesting that cellular extension during collective migration can also induce ERK activation and also suggest SHH-induced proliferation as another causative factor in promoting ERK activation through proliferation. I think the authors should try and clarify this - both in their explanation, but also by comparing the effects of the MEK inhibitor PD0325901 on ERK activity and tissue flow speed (Fig 4I and S3F) with the effects of the FGFR inhibitor SU5402, and also Shh inhibitors like cyclopamine. If the effects they see are directly due to FGF signaling, one would expect a change in ERK activation and cell flow with the same kinetics as with PD0325901. However, if Shh-induced proliferation is responsible, the change in ERK activation would take much longer to achieve. I think these experiments should be possible to do in a relatively short period of time.

    4. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on September 9 2020, follows.

      Summary

      The mammalian cochlear duct is a spiral-shaped organ. This study investigated the mechanisms underlying the bending of the cochlear duct. Using two-photon live imaging and mathematical modeling, it was reported that the bending of the cochlear duct is caused by stalling of nuclei in the luminal side of the medial cochlear duct during interkinetic nuclear migration. Using FRET-based imaging, cochlear duct elongation is attributed to an oscillatory wave of ERK activity originating from the cochlear tip.

      All three reviewers were impressed by the imaging results. Although the reviewers and editors find the concept and approach interesting, blocking cell proliferation may be too crude a method to address the authors' hypothesis and many questions were raised by the results of blocking cell division. Second, the relationship between cell proliferation and ERK-driven migration is also unclear. Please see comments from Reviewer #2 and #3 for specifics. Third, what is the relationship between SHH-induced proliferation and ERK activation as suggested by the authors (see comments from Reviewer #1)? Additionally, it is problematic to illustrate a difference in the bending force between medial and lateral cochlear duct that is presumably occurring at E12.5 and E14.5 with a cochlear dissection at E17.5. The tissue architecture is completely different between E12.5 and E17.5. The surgery basically removed a specific region of the cochlear duct, the stria vascularis, rather than medial versus lateral halves of the cochlear duct.

    1. Author Response

      Note from the authors:

      This is the authors' response to the reviewers' comments for the manuscript “Perceptual gating of a brainstem reflex facilitates speech understanding in humans” submitted to eLife via Preprint Review. We appreciate the time and effort the reviewers took to carefully revise our work. We believe all comments and suggestions will improve the manuscript for future publication. All the authors’ comments detailed in this response will be implemented in the next version of this manuscript.

      Reviewer #1: [...] Reviewer 1-Comment 1: 1) An important aspect of assessing the efferent feedback through the CEOAEs and ABRs is to ensure that different stimuli have equal intensity. The authors write in the methodology that the speech stimuli were presented at 75 dB SPL. However, it is not stated if this applies to the speech stimuli only, such that the stimuli that include background noise would have a higher intensity, or to the net stimuli. If the intensity of the speech signals alone had been kept at 75 dB SPL while the background noise had been increased, this would render the net signal louder and influence the MOCR. In addition, it would have been better to determine the loudness of the signals according to frequency weighting of the human auditory system, especially regarding the vocoded speech, to ensure equal loudness. If that was not done, how can the authors control for differences in perceived loudness resulting from the different stimuli?

      Response to Reviewer 1-Comment 1:

      Controlling the stimulus level is a critical step when recording any type of OAE due to the potential activation of the middle ear muscle reflex (MEMR). High intensity sounds delivered to an ear can evoke contractions of both the stapedius and the tensor tympani muscles causing the ossicular chain to stiffen and the impedance of middle ear sound transmission to increase (Murata et al.,1986; Liberman & Guinan,1998). As a result, retrograde middle ear transmission of OAE magnitude can be reduced due to MEMR and not MOCR activation (Lee et al., 2006). For this reason, we were particularly careful to determine the presentation level of our stimuli.

      As pointed out by the reviewer and stated in the Methods section: Experimental Protocol: “The speech tokens were presented at 75 dB SPL and the click stimulus at 75 dB p-p, therefore no MEMR contribution was expected given a minimum of 10 dB difference between MEMR thresholds and stimulus levels (ANSI S3.6-1996 standards for the conversion of dB SPL to dB HL)”. 75 dB SPL was indeed selected as the presentation level for all natural, noise vocoded and speech-in-noise tokens. All tokens were root-mean-square normalized and the calibration system (sound level meter (B&K G4) and microphone IEC 60711 Ear Simulator RA 0045 563 (BS EN 60645-3:2007), (see CEOAEs acquisition and analysis section)) was set to “A-Weighting” which matches the human auditory range. Therefore, the net signal was never above 75 dBA. We acknowledge the lack of details about the calibration procedure in the current manuscript and will consequently add them in a future Methods section.
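
      As a rough sketch of the root-mean-square normalization described above, the snippet below scales a speech token and a speech-plus-noise mixture to the same digital RMS, so that neither is presented at a higher overall level. The sample values, the target RMS, and the function names are hypothetical; the sketch does not model A-weighting or the acoustic calibration chain, which map digital levels to dB SPL.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms):
    """Scale samples so that their RMS equals target_rms."""
    scale = target_rms / rms(samples)
    return [s * scale for s in samples]


def level_difference_db(a, b):
    """Level of signal a relative to signal b, in dB, based on RMS."""
    return 20.0 * math.log10(rms(a) / rms(b))


# Hypothetical waveforms (not data from the study)
speech = [0.2, -0.4, 0.1, -0.3]
noise = [0.05, -0.02, 0.04, -0.03]
mixture = [s + n for s, n in zip(speech, noise)]

# Normalizing the net signal, not the speech alone, keeps overall levels matched.
speech_norm = normalize_rms(speech, 0.1)
mixture_norm = normalize_rms(mixture, 0.1)
difference = level_difference_db(speech_norm, mixture_norm)  # approximately 0 dB
```

      Normalizing the mixture rather than the speech component alone is what prevents the added noise from raising the net presentation level.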

      Reviewer 1-Comment 2: 2) Many of the p-values that show statistical significance are actually near the threshold of 0.05 (such as in the paragraph lines 147-181). This is particularly concerning due to the large number of statistical tests that were carried out. The authors state in the Methods section that they used the Bonferroni correction to account for multiple comparisons. This is in principle adequate, but the authors do not detail what number of multiple comparisons they used for the correction for each of the tests. This should be spelled out, so that the correction for multiple comparisons can be properly verified.

      Response to Reviewer 1-Comment 2:

      Bonferroni corrections were explicitly chosen as the multiple comparisons adjustment across our post-hoc statistical analyses because they are a highly conservative test that protect from Type I error. All the p-values reported in our study are corrected p-values for post-hoc comparisons. However, we agree that for verification purposes, the number of comparisons for each statistical analysis should be clarified in the Methods section and will be added to a future version of the manuscript.
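
      For reference, the arithmetic behind a Bonferroni adjustment can be sketched as follows; it shows why the number of comparisons must be reported for the corrected values to be verified. The raw p-values and the function name are hypothetical and are not taken from the study.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three post-hoc comparisons
raw = [0.010, 0.020, 0.300]
adjusted = bonferroni_adjust(raw)
significant = [p < 0.05 for p in adjusted]
```

      With three comparisons, a raw p-value of 0.020 no longer passes the 0.05 threshold after adjustment, so the same raw value can be significant or not depending on how many tests enter the correction.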

      Reviewer 1-Comment 3: 3) Line 184-203: It is not clear what speech material is being discussed. Is it the noise vocoded speech, the speech in either type of background noise, or these data taken together?

      Response to Reviewer 1-Comment 3:

      Lines 184-203 correspond to “Auditory brainstem activity reflects changes in cochlear gain” in the Results section. Line 186 describes changes in ABR components during noise-vocoded speech: “Click-evoked ABRs—measured during simultaneous presentation of vocoded speech—showed task-engagement-specific effects similar to the effects observed for CEOAE measurements.” The subsequent 3 sentences refer to the same (noise-vocoded) condition, whereas the remaining sentences in the section refer to the speech-in-noise conditions. As pointed out by the reviewer we did not specify a specific masked condition in the sentence: “Conversely, although wave III was unchanged in both masked conditions for active vs. passive listening, wave V was significantly enhanced: [F (1, 26) = 5.67, p = 0.025 and F (1, 25) = 8.91, p = 0.006] when a lexical decision was required.” Here the rANOVAs correspond to masked conditions: speech in babble noise and speech-shaped noise respectively. This will be rectified in a future version of the manuscript.

      Reviewer 1-Comment 4: 4) Line 202-203: The authors write that "the ABR data suggest different brain mechanisms are tapped across the different speech manipulations in order to maintain iso-performance levels". It is not clear what evidence supports this conclusion. In particular, from Figure 1D, it appears plausible that the effects seen in the auditory brainstem may be entirely driven by the MOCR effect. To see this, please note that absence of statistical significance does not imply that there is no effect. In particular, although some differences between active and passive listening conditions are non-significant, this may be due to noise, which may mask significant effects. Importantly, where there are significant differences between the active and the passive scenario, they are in the same direction for the different measures (CEOAEs, Wave III, Wave V). Of course, that does not mean that nothing else might happen at the brainstem level, but the evidence for this is lacking.

      Response to Reviewer 1-Comment 4:

      Lines 202-203 also correspond to “Auditory brainstem activity reflects changes in cochlear gain” in the Results section. As suggested by the reviewer, the effects observed in the ABRs may be driven by the MOCR. We agree with this observation in lines 195-197, explaining that the decreased magnitude of ABR components is consistent with reduced magnitude of CEOAEs measured during active listening in the vocoded condition, since a reduction in cochlear gain can reduce the activity of auditory nerve (AN) afferents synapsing in the cochlear nucleus (CN). However, we did not explain that this trend is also observed during the passive listening of speech-in-noise, therefore demonstrating that vocoded and speech-in-noise are differently processed at the level of the brainstem and midbrain. In a future version of the manuscript, we will restrict our interpretation to statistical comparisons in the Results and leave potential mechanisms for the Discussion section.

      Reviewer 1-Comment 5: 5) The way the output from the computational model is analyzed appears to bias the results towards the authors' preferred conclusion. In particular, the authors use the correlation between the simulated neural output for a degraded speech signal, say speech in noise, and the neural output to the speech signal in quiet with the efferent feedback activated. They then compute how this correlation changes when the degraded speech signal is processed by the computational model with or without efferent feedback. However, the way the correlation is computed clearly biases the results to favor processing by a model with efferent feedback.

      The result that the noise-vocoded speech has a higher correlation when processed with the efferent feedback on is therefore entirely expected, and not a revelation of the computational model. More surprising is the observation that, for speech in noise, the correlation value is larger without the efferent feedback. This could be due to the scaling of loudness of the acoustic input (see point 1), but more detail is needed to pin this down. In summary, the computational model unfortunately does not allow for a meaningful conclusion.

      Response to Reviewer 1-Comment 5:

      While claims of bias would be understandable had we used shuffled auto-correlograms (SACs) to compare the expression of temporal fine structure (TFS) cues for natural speech versus vocoded stimuli (TFS cues reconstructed from the envelope of our vocoded stimuli would have differed dramatically from those original TFS cues in natural speech) (Shamma and Lorenzi, 2013), there is no inherent reason for SAC analysis of envelope cues to be biased towards either vocoded or speech-in-noise conditions, as both stimuli retain the original envelope cues from natural speech. Indeed, since the purpose of our simulations was to compare the relative effects of adding efferent feedback on the reconstruction of the stimulus’ envelope cues in the AN for the two degraded stimuli, SACs offered a targeted analysis tool to extract the relevant information with fewer intermediate steps and presumptions than either encoder models or automatic speech recognition systems.

      We do agree with the reviewer that results of our simulations for the vocoded condition may have been less unexpected than those of speech-in-noise, as the envelopes of vocoded stimuli closely resemble those of natural speech in the absence of a masking noise. However, our results also demonstrate that adding efferent feedback could generate negative correlation changes for a number of vocoded words: either at individual frequencies (low and high spontaneous rate AN fibres (see raw data)) or on average across all frequencies tested [high spontaneous rate AN fibres only (Fig Supplement 3)]. This suggests that noise-vocoding speech (i.e. implementing the envelope from broader channel bandwidths while also scrambling spectrotemporal information in said channels) can disrupt envelope representation in the 1-2kHz range of certain words enough that efferent feedback should not be automatically presumed able to rectify their envelope cue reconstruction in AN fibres.

      As for the speech-in-noise conditions, our intuition for the negative correlation changes observed is that the signal-to-noise ratios (SNRs) tested were not large enough to allow for the isolated extraction of the target signal’s envelope by expanding the dynamic range of AN fibres. As the test stimuli and their SNRs were directly acquired by finding iso-performance in the psychophysical portion of this study (and appropriately normalized as input for the MAP_BS model), we consider the results of the simulation to be indicative of the actual benefit/disadvantage that activating efferent feedback might have on envelope representation of vocoded or speech-in-noise tasks in the AN [and not artefacts of poorly calibrated stimulus presentation level (see Responses to Reviewer 1-Comment 1 and 6 for more details about methodology)]. Although this result may be surprising when viewed in the context of physiological and modelling studies demonstrating efferent feedback’s anti-masking effect, our results may help to explain why MOCR anti-masking appears SNR- and stimulus-specific in numerous human studies (de Boer et al., 2012; Mertes et al., 2019).
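
      The comparison debated in this comment (how similar the simulated response to a degraded stimulus is to a reference response, with and without efferent feedback) can be sketched as follows. This is only an illustration: it uses a plain Pearson correlation in place of the SAC-based analysis, and the function names and sample values are hypothetical, not taken from the MAP_BS simulations.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (>= 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(vx * vy)


def correlation_change(reference, degraded_no_efferent, degraded_efferent):
    """Correlation with the reference when feedback is on, minus when it is off.

    A positive value means the degraded response moved closer to the
    reference once efferent feedback was added; a negative value means
    it moved further away.
    """
    return (pearson(degraded_efferent, reference)
            - pearson(degraded_no_efferent, reference))


# Made-up envelope samples, for illustration only (not data from the study).
reference = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
without_feedback = [0.5, 0.9, 1.2, 1.4, 1.3, 1.0]
with_feedback = [0.1, 1.1, 1.9, 3.1, 2.1, 0.9]
delta = correlation_change(reference, without_feedback, with_feedback)
```

      The sign of the change depends on which response is used as the reference, which is the choice the reviewer and the authors disagree about.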

      Reviewer 1-Comment 6: 6) The experiment on the ERPs in relation to the speech onsets is not properly controlled. In particular, the different acoustics of the considered speech signals -- speech in quiet, vocoded speech, speech in background noise -- will cause differences in excitation within the cochlea which will then affect every subsequent processing stage, from the brainstem and on to the cortex, thereby leading to different ERPs. As an example, babble noise allows for 'dip listening', while with its flat envelope speech-shaped noise does not. Analyzing differences in the ERPs with the goal of relating these to something different than the purely acoustic differences, such as to attention, would require these acoustic differences to be controlled, which is not the case in the current results.

      Response to Reviewer 1-Comment 6:

      Our fundamental methodological strategy was not to compare or even control the acoustics of the signals (although we did this to some extent by normalizing the presentation level and long-term spectrum across all signals), but instead to maintain iso-performance across conditions and, in doing so, allow the identification of brain mechanisms underlying performance in a lexical decision task where speech intelligibility was manipulated.

      We do acknowledge the reviewer’s comment regarding acoustic differences across our speech signals. This is why in the Results section we describe that: “Early auditory cortical responses (P1 and N1) are largely driven by acoustic features of the stimulus (Getzmann et al., 2015; Grunwald et al., 2003)”. Therefore, our ERP analysis instead focuses on later, less stimulus-driven components such as P2, N400 and LPC: “Later ERP components, such as P2, N400 and the Late Positivity Complex (LPC), have been linked to speech- and task-specific, top-down (context-dependent) processes (Getzmann et al., 2015; Potts, 2004).”

      With regards to the reviewer’s example: “…babble noise allows for 'dip listening', while with its flat envelope speech-shaped noise does not”. We could argue that in our specific listening conditions “dip listening” did not offer a perceptual advantage over speech in speech shaped noise because:

      1) A higher SNR was required in the babble noise conditions to achieve the same level of performance as in the speech-shaped noise manipulations.

      2) Listeners have fewer chances to use the spectral and temporal dips when listening to monosyllabic words (used in our study) than when listening to sentences (Rosen 2013).

      3) The dips in the signal are expected to decrease both in depth and frequency with the number of talkers in a babble noise masker (8-talker babble used in our study), with no differences in masking effectiveness for more than 4-talker babble noise (Rosen et al., 2012).

      Overall, we believe that having modulated maskers effectively impaired speech intelligibility (Kwon and Turner 2001), but the most effective one was babble noise, confirming that speech is its own best masker (Miller, 1947).

      Reviewer #2: [...] Reviewer 2-Comment 1: 1) A core premise of the experiment is that the non-invasive measures recorded in response to click sounds in one ear provide a direct measure of top-down modulation of responses to the speech sounds presented to the opposite ear. This is not acknowledged anywhere in the paper, and is simply not justifiable. The click and speech stimuli in the different ears will activate different frequency ranges and neural sources in the auditory pathway, as will the various noises added to the speech sounds. Furthermore, the click and speech sounds play completely different roles in the task, which makes identical top-down modulation illogical. The situation is further complicated by the fact that the clicks, speech and noise will each elicit MOCR activation in both ipsi- and contralateral ears via different crossed and uncrossed pathways, which implies different MOCR activation in the two ears.

      Response to Reviewer 2-Comment 1:

      We employed broadband clicks across all stimulus manipulations and listening conditions to activate the entire cochlea so that resulting OAEs could be used to measure modulation of cochlear gain by olivocochlear efferents.

      Historically, studies have applied clicks in one ear (to evoke OAEs) and a broadband noise suppressor in the other to monitor contralateral MOCR activation, demonstrating that clicks are suppressed consistently when subjects actively perform either auditory (Froehlich et al., 1993; Maison et al., 2001; Garinis et al., 2011) or visual tasks (Puel et al., 1988; Froehlich et al., 1990; Avan & Bonfils 1992; Meric & Collet 1994). Therefore, while we acknowledge that the presence of clicks may have made the task of discriminating vocoded words and words-in-noise more difficult, we would have expected to observe suppression of click-evoked OAEs for all stimulus manipulations, whether subjects were actively or passively listening to speech stimuli, in order to minimize the impact of the irrelevant clicks. In contrast, we observed that contralateral suppression of CEOAEs was both stimulus- and task-dependent. Unlike natural and vocoded speech, active listening to speech-in-noise did not produce significant MOCR activation, while passive listening (equivalent to visual attention) generated an MOCR effect in the opposite direction to that of its active-listening analogue for all 3 speech manipulations.

      Despite spectrotemporal, level and task-difficulty similarities between noise-vocoded speech and speech-in-noise manipulations, the stimulus-dependence of these results suggests that MOCR activation was controlled in a top-down manner according to the auditory scene presented. We speculated that this arises from improved peripheral processing of specific speech cues during active listening, whereas the opposite effects in passive listening are associated with attenuating auditory inputs to prioritize visual information. In line with this, we observed that introducing efferent feedback to our auditory periphery model differentially affected the auditory nerve output for the 3 most challenging speech manipulations: the resulting enhancement or deterioration of envelope cue representation offering an explanation for divergent patterns of MOCR gating for noise-vocoded and speech-in-noise.

      In summary, we predict that observed changes in CEOAE amplitudes in the contralateral ear will mirror cochlear gain inhibition in the ear processing speech. Bilateral descending control of the MOCR despite speech being presented monaurally is not unexpected for two reasons:

      1) Unlike simple pure tone stimuli, speech activates both left and right auditory cortices even when presented unilaterally to either ear (Heggdal et al., 2019)

      2) Cortical gating of the MOCR in humans does not appear restricted to direct ipsilaterally descending processes that impact cortical gain control in the opposite ear, instead likely incorporating polysynaptic, decussating processes to affect cochlear gain in both ears (Khalfa et al., 2001).

      Together this evidence makes it difficult to envisage a case where unilaterally-presented speech does not influence top-down control of cochlear gain bilaterally.

      Reviewer 2-Comment 2: 2) The vocoded conditions were recorded from a different group of participants than the masked speech conditions. Comparing between these two, which forms the essential point in this paper, is therefore highly confounded by inter-individual differences, which we know are substantial for these measures. More generally, the high variability of results in this research field should caution any strong conclusions based on comparing just these two experiments. A more useful approach would have been to perform the exact same task in the two experiments, to examine the reproducibility.

      Response to Reviewer 2-Comment 2:

      We ensured that the two populations tested across the three experiments were all normal-hearing adults assessed using the same criteria. They were also age- and gender-matched and were recruited from undergraduate courses at Macquarie University (and therefore presumably possessed similar literacy). However, we acknowledge inter-individual variability as an important issue and controlled for it, as far as we could, by:

      1) Ensuring that CEOAE SNRs were above a 6 dB minimum which allowed for more reliable and replicable recordings within and between subjects (Goodman et al., 2013).

      2) Carefully analysing and selecting ABR waveforms above the residual noise. Residual noise was calculated by applying a weighted average method based on Bayesian inference that weighs individual sweeps proportionally to their estimated precision (Box & Tiao, 1973). This helped preserve all trials without any rejection required for artefacts. ABR waveforms with residual noise equal to or higher than the averaged signal were discarded.

      3) Ensuring that individual ERP components represented a reliable individual average by: a) removing noisy trials (trials between -200 ms and 1.2 sec from sound onset which had absolute amplitude values higher than 75 μV) and b) maintaining between 60-80% of total trials per condition.
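
      Two of the steps listed above, amplitude-based trial rejection and precision-weighted averaging of sweeps, can be sketched as follows. This is a minimal illustration under stated assumptions: each sweep's precision is taken as the inverse of its own sample variance, which is a simplification and not necessarily the estimator used in the study, and all function names are hypothetical. Only the 75 μV threshold and the 60-80% retention range come from the text.

```python
def reject_noisy_trials(trials, threshold_uv=75.0):
    """Keep trials whose absolute amplitude never exceeds the threshold (in microvolts)."""
    return [t for t in trials if max(abs(v) for v in t) <= threshold_uv]


def retained_fraction(kept_trials, all_trials):
    """Fraction of trials that survived rejection (text reports 60-80% per condition)."""
    return len(kept_trials) / len(all_trials)


def precision_weighted_average(sweeps):
    """Average sweeps sample by sample, weighting each sweep by 1 / variance.

    Assumption: the variance of a sweep around its own mean stands in for
    its noise level, so noisier sweeps contribute less but none is discarded.
    """
    weights = []
    for sweep in sweeps:
        mean = sum(sweep) / len(sweep)
        variance = sum((v - mean) ** 2 for v in sweep) / len(sweep)
        if variance == 0:
            raise ValueError("a sweep with zero variance has undefined precision")
        weights.append(1.0 / variance)
    total = sum(weights)
    n_samples = len(sweeps[0])
    return [
        sum(w * sweep[i] for w, sweep in zip(weights, sweeps)) / total
        for i in range(n_samples)
    ]


# Made-up values, for illustration only.
erp_trials = [[10.0, -20.0], [80.0, 5.0], [-76.0, 0.0], [75.0, -75.0]]
kept = reject_noisy_trials(erp_trials)

abr_sweeps = [[1.0, -1.0, 1.0, -1.0], [10.0, -10.0, 10.0, -10.0]]
weighted = precision_weighted_average(abr_sweeps)
```

      The weighted average keeps every sweep but lets the quiet ones dominate, whereas the threshold rule removes whole trials, which is why a retention check is needed only for the latter.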

      In addition, we assessed potential differences across common variables between experiments, such as lexical performance during natural speech (see Results section), ABR components and CEOAE magnitude changes relative to the baseline during the Active and Passive listening of natural speech (as part of the 1st author’s thesis dissertation: Hernandez Perez, H., & Macquarie University. Department of Linguistics, degree granting institution. (2018). Disentangling the Influence of Attention in the Auditory Efferent System during Speech Processing / Heivet Hernandez Perez): “During active or passive listening of natural speech, no statistical differences between the populations assessed in the noise-vocoded and speech-in-noise experiments for: wave V-III amplitude ratio- Active listening [t (12) = 0.90, p=0.39], Passive listening: [t (23) = 1.58, p=0.13]; wave V-Active listening: [t (23) = 0.09, p=0.93]; Passive listening: [t (24) = -0.24, p=0.81]; CEOAE magnitude changes-Active listening [t (23) = -0.21, p=0.83]; Passive listening [t (24) = -0.36, p=0.72].”

      These results rule out the possibility that the effects observed across the three experiments were due to intrinsic differences between the populations tested. This will be discussed in a future version of the manuscript and added as supplemental material.

      Reviewer 2-Comment 3: 3) The interpretation presented here is essentially incompatible with the anti-masking model for the MOCR that first started off this field of research, in which the noise response is suppressed more than the signal, which is contradictory to the findings and model presented here, which suggest no role for the MOCR in improving speech in noise perception.

      Response to Reviewer 2-Comment 3:

      Physiological evidence for the MOCR anti-masking effect in animal models (Wiederhold, 1970; Winslow & Sachs 1987; Guinan & Gifford 1988; Kawase et al., 1993) has led to the hypothesis that the MOCR may play an important role in aiding humans to perceive speech in noise (Giraud et al., 1997; Liberman & Guinan 1998). The strictly non-invasive nature of human experiments has made measuring MOCR effects on OAE amplitudes the main technique for testing this anti-masking hypothesis. However, OAE inhibition (the MOCR-mediated reduction in OAE amplitude) has been reported as increased (Giraud et al., 1997; Mishra and Lutman, 2014), reduced (de Boer et al., 2012; Harkrider and Bowers, 2009) or unaffected (Stuart and Butler, 2012; Wagner et al., 2008) in participants with improved speech-in-noise perception. More recently, Mertes et al. (2019) suggested that the SNR used to explore speech-in-noise abilities might explain the contradictory results in the literature. The authors found that the MOCR only contributed to perception at the lowest SNR they tested (-12 dB), suggesting that the role of the MOCR in listening-in-noise may be highly dependent on the SNR, which in turn influences the extent to which the MOCR does or does not provide a benefit for hearing in noise. Therefore, our human and modelling data not only expand but also challenge the classical MOCR anti-masking effect by suggesting that, in humans, this effect is not only SNR-specific (which we controlled) but also task-specific (i.e. whether participants are attending to the contralateral masker or not) and stimulus-dependent (i.e. masker intrinsically noisy Vs signal-in-noise). We acknowledge that we can discuss further how our data advance the current understanding of the MOCR anti-masking effect in a future version of the manuscript.

      Reviewer 2-Comment 4: 4) The analysis of measures becomes increasingly selective and lacking in detail as the paper progresses: numerous 'outliers' are removed from the ABR recordings, with very uneven numbers of outliers between conditions. ABRs were averaged across conditions with no explicit justification. The statistical analysis of the ABRs is flawed as it does not compare across conditions (vocoded vs masked) but only within each condition separately (active v passive) - from which no across-condition difference can be inferred. The model simulation includes only 3 out of 9 active conditions. For the cortical responses, again only 3 conditions are discussed, with little apparent relevance.

      Response to Reviewer 2-Comment 4:

      In regard to the reviewer’s comment: “The analysis of measures becomes increasingly selective and lacking in detail as the paper progresses: numerous 'outliers' are removed from the ABR recordings, with very uneven numbers of outliers between conditions. ABRs were averaged across conditions with no explicit justification.” During the analysis of the ABR measurements, we dealt not only with outliers but also with several missing data points (ABR components below the residual noise). The statistical analysis used to assess potential differences within ABR components was rANOVA. This type of analysis is particularly restrictive when dealing with missing data points, because it will only include participants with all data available (2 Conditions X 4 Stimuli manipulations for the noise-vocoded experiment). This is why ABR components’ sample sizes across experiments appeared uneven.

      Regarding the reviewer’s comment: “ABRs were averaged across conditions with no explicit justification.” Our rANOVA had the following design: Factor 1 (Conditions: Active Vs Passive); Factor 2 (Stimuli: natural, 8 channels noise vocoded (Voc8) …etc) and finally the Interaction (Conditions x Stimuli). ABR conditions were not simply averaged together; we only found a significant Conditions effect in the rANOVA that collapses all stimuli manipulations into Active Vs Passive conditions. Therefore, it was only statistically valid to make inferences and potential interpretations about the Conditions main effect. This will be clarified in both the statistical design and in the Results section of a future version of this manuscript.

      In regard to the reviewer’s comment: “The statistical analysis of the ABRs is flawed as it does not compare across conditions (vocoded vs masked) but only within each condition separately (active v passive) - from which no across-condition difference can be inferred”. Up to this point in our data analysis, we were only interested in within-speech-manipulation comparisons (similar to the CEOAE analysis, i.e. within noise-vocoded manipulations). We agree with the reviewer that a simple comparison between speech manipulations (noise-vocoded Vs masked speech) for the variables that reflect attentional changes (Active Vs Passive listening) could be useful to infer differences across experiments (noise-vocoded Vs speech-in-noise). This analysis will be added in a future version of the paper.

      Finally, regarding the comment: “The model simulation includes only 3 out of 9 active conditions. For the cortical responses, again only 3 conditions are discussed, with little apparent relevance”. At this stage of our analysis, we wanted to understand the potential reasons why the control of the cochlear gain appeared to be dependent on the way speech was being degraded, i.e. noise vocoding the speech signal Vs speech-in-noise. As iso-performance was achieved at 3 task-difficulty levels, we sought to test how both the biophysical model and the auditory cortex (ERP components) would respond to the most challenging speech degradations (noise vocoded 8 channels, speech in babble noise +5 dB SNR and speech in speech-shaped noise +3 dB SNR), where differences in the cochlear gain are most evident across experiments (see Figure 1B in Results section). In these extreme conditions we hypothesized that both the model and the auditory cortex activity would display the most obvious differences in the processing of the different speech degradations. We acknowledge the reviewer’s comment and in a future version of this manuscript, this line of thought will be more clearly described.

      Reviewer 2-Comment 5: 5) The assumption that changes in non-invasive measures, which represent a selective, random, mixed and jumbled by-product of underlying physiological processes, can be linked causally to auditory function, i.e. that changes in these responses necessarily have a definable and directional functional correlate in perception, is very tenuous and needs to be treated with much more caution.

      Response to Reviewer 2-Comment 5:

      We acknowledge the reviewer’s view about being cautious when interpreting non-invasive measures associated with human perception. However, the physiological measurements used in this study are not new in the field of auditory or speech perception; they are gold-standard methods to assess auditory function in both animal and human models. The novelty of our approach lies in imposing attentional states (Active listening and Passive listening) while concurrently probing along the auditory pathway in order to gain a holistic understanding of MOCR-mediated changes during a speech comprehension task. The strength of our methodology arises from extensively and continuously monitoring both the attentional states and the quality of our physiological measurements.

      Reviewer #3: [...] Reviewer 3-Comment 1: 1) However, I have several substantial concerns with the design, conceptualization, data analysis and interpretation of the results. I have had challenges to understand the hypotheses and rationale behind this study. A number of experimental paradigms have been employed, including peripheral/brainstem physiological measure, as well as cortical auditory responses during active versus 'passive' listening. Different noise conditions were tested but it is not clear to me what rationale was behind these stimulus choices. The authors claim that "our data comparing active and passive listening conditions highlight a categorical distinction between speech manipulation, a difference between processing a single, but degraded, auditory stream (vocoded speech) and parsing a complex acoustic scene to hear out a stream from multiple competing and spectrally similarly sounds" (lines 401-403). This seems like too much of a mouthful. I cannot see that the data support this pretty broad interpretation.

      Response to Reviewer 3-Comment 1:

      The main objective of this study is to examine the role of the auditory efferent system in active vs. passive listening tasks for three commonly employed speech manipulations. To address this, speech intelligibility was degraded in three ways: 1) noise vocoding the speech signal; 2) adding babble noise (BN) to the speech signal at different SNRs; or 3) adding speech-shaped noise (SSN) to the speech signal at different SNRs. The reason for using noise-vocoded speech while contralaterally recording CEOAEs is that it allowed speech intelligibility to be manipulated without increasing noise levels (a classical way of evoking the MOCR (Berlin et al., 1993; Norman & Thornton 1993; Kalaiah et al., 2017b)). This avoided confounding CEOAE magnitude changes due to purely stimulus-driven MOCR activation with attention-driven MOCR effects on CEOAE magnitudes. Moreover, because the level of the speech spectrum decreases with increasing frequency, white noise (which is the most commonly used stimulus to evoke the MOCR in the literature) predominantly masks only the high-frequency component of the speech signal, and is therefore not considered an efficient speech masker. However, BN (besides representing a more ethological type of noise) and SSN (which is spectrally matched to the long-term average of the speech signal) have the same long-term average spectrum as speech. Therefore, these noises were able to mask the speech signal equally across frequencies.
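
      Adding a masker "at different SNRs", as described above, amounts to scaling the noise so that the ratio of speech power to noise power matches a target value in dB. The sketch below shows that arithmetic only; the function names and sample values are hypothetical, and the level and long-term spectrum normalization applied in the study are not reproduced.

```python
from math import log10, sqrt


def mean_power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain to apply to `noise` so that speech power / noise power equals snr_db."""
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    return sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))


def measured_snr_db(speech, scaled_noise):
    """SNR in dB between a speech signal and an already scaled noise signal."""
    return 10 * log10(mean_power(speech) / mean_power(scaled_noise))


def mix_at_snr(speech, noise, snr_db):
    """Sum speech and noise after scaling the noise to the target SNR."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


# Made-up samples, for illustration only.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [2.0, 2.0, -2.0, -2.0]
gain = noise_gain_for_snr(speech, noise, 5.0)   # e.g. a +5 dB SNR condition
scaled_noise = [gain * n for n in noise]
mixture = mix_at_snr(speech, noise, 5.0)
```

      Scaling the noise rather than the speech keeps the target at a fixed level across SNR conditions; whether the study scaled the masker or the target is not stated in this response.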

      Reviewer 3-Comment 2: 2) Despite maintaining iso-difficulty between vocoded vs speech-in-noise (SIN) conditions, the authors neither address (a) the fundamental differences in understanding vocoded vs. SIN speech nor (b) any theoretical basis for how the noise manifests in vocoded speech. If the tasks are indeed so obviously 'categorically' different - then it should not be surprising they engage different processing (the 'denoising' may not be comparable). I would prefer much more clearly defined and targeted hypotheses and a justification of the specific stimulus and paradigm choices to test such hypotheses. It appears to me that numerous measures have been obtained (reflecting in fact very different processes along the auditory pathway) and then it has been attempted to make up some coherent conclusions from these data - but the assumptions are not clear, the data are very complex and many aspects of the discussion are speculative. To me, the most interesting element is the reversal of the MOCR behavior in the attended vs ignored conditions. However, ignoring a stimulus is not a passive task! It would have been interesting to also see cortical unattended results.

      Response to Reviewer 3-Comment 2:

      The motivation behind this study arises from controversy in the literature regarding attentional effects at the level of both the cochlea (via the MOCR) and the brainstem. Methodological differences among previous studies of attentional effects on CEOAEs have not only prevented direct comparison among them but have also distorted the interpretation of their results. Most have implemented paradigms with large differences in arousal state [or alertness levels (Eysenck, 2012)] and stimulus type between the active auditory task (e.g. speech stimuli presented while CEOAEs are recorded) and passive listening conditions (no task, CEOAEs recorded during no-noise conditions or with-noise conditions) (Froehlich et al., 1990; Meric et al., 1994; Srinivasan et al., 2012). Our experimental paradigm addressed these issues in three main ways: 1) using the same stimuli for both active and passive listening conditions; 2) using a controlled visual scene across the experimental sessions; and 3) attempting to control for differences in alertness during the passive condition by asking subjects to watch an engaging cartoon movie. The homogeneity of visual and auditory scenes across the experiments allowed the effects of attending to the speech on CEOAE magnitude to be disentangled from the stimulus-driven effects.

      In addition, it was never assumed that the “Passive listening” or the “auditory-ignored” condition was a passive task. In this condition subjects were asked to ignore the auditory stimuli and to watch a non-subtitled, stop-motion movie. To ensure participants’ attention during this condition, they were monitored with a video camera and were asked questions at the end of this session (e.g. What happened in the movie? How many characters were present?) (See Methods section). The aim of a passive or an auditory-ignoring condition is to shift attentional resources away from the auditory scene and towards the visual scene. As shown in (Figure supplement 4) all ERP components were also obtained in the Passive listening condition and they are of a smaller magnitude than ERP components observed in the active listening conditions, demonstrating that cortical representation of the speech-onset was enhanced in all active listening conditions.

      Reviewer 3-Comment 3: 2) Overall, I'm struggling with this study that touches upon various concepts and paradigms (efferent feedback, active vs. passive listening, neural representation of listening effort, modeling of efferent signal processing, stream segregation, speech-in-noise coding, peripheral vs cortical representations...) where each of them in isolation already provides a number of challenges and has been discussed controversially. In my view, it would be more valuable to specify and clarify the research question and focus on those paradigms that can help verify or falsify the research hypotheses.

      Response to Reviewer 3-Comment 3:

      In our study, we sought to explore how active listening to degraded speech modulates CEOAE magnitudes (as a proxy for efferent-MOCR effects). Our specific research question was: does auditory attention modulate cochlear gain, via the auditory efferent system, in a task-dependent manner? Our hypothesis was that decreases in speech intelligibility raise auditory attention and that this reduces cochlear gain (measured using CEOAEs).

      In particular, unlike previously published studies, we assessed auditory changes objectively and subjectively as part of a highly controlled experimental paradigm, maintaining a constant performance across three experimental manipulations of speech intelligibility as well as minimizing influences of MEMR activation and controlling for homogeneity of both visual and auditory scenes across conditions. We agree with the reviewer that due to the complexity of our study, each section should be more explicit in its hypothesis and aims. This will be clarified in a future version of this manuscript.

    1. Author Response

      We thank the reviewers for their comments, which will improve the quality of our manuscript.

      Our study describes a novel approach to the identification of GTPase binding-partners. We recapitulated and augmented previous protein-protein interaction data for RAB18 and presented data validating some of our findings. In aggregate, our dataset suggested that RAB18 modulates the establishment of membrane contact sites and the transfer of lipid between closely apposed membranes.

      In the original version of our manuscript, we stated that we were exploring the possibility that RAB18 contributes to cholesterol biosynthesis by mobilizing substrates or products of the Δ8-Δ7 sterol isomerase emopamil binding protein (EBP). While our manuscript was under review, we profiled sterols in wild-type and RAB18-null cells and assayed cholesterol biosynthesis in a panel of cell lines (Figure 1).

      Figure 1

      Our new data show that an EBP-product, lathosterol, accumulates in RAB18-null cells (p<0.01). Levels of a downstream cholesterol intermediate, desmosterol, are reduced in these cells (p<0.01), consistent with impaired delivery of substrates to post-EBP biosynthetic enzymes (Figure 1A). Further, our preliminary data suggest that cholesterol biosynthesis is substantially reduced when RAB18 is absent or dysregulated (4 technical replicates, one independent experiment) (Figure 1B).

      Because of the clinical overlap between Micro syndrome and cholesterol biosynthesis disorders including Smith-Lemli-Opitz syndrome (SLOS; MIM 270400) and lathosterolosis (MIM 607330), our new findings suggest that impaired cholesterol biosynthesis may partly underlie Warburg Micro syndrome pathology. Therapeutic strategies have been developed for the treatment of SLOS and lathosterolosis, and so confirmation of our findings may spur development of similar strategies for Micro syndrome.

      Our new findings provide further functional validation of our methodology and support our interpretation of our protein interaction data.

      Response to Reviewer #1

      Reply to point 1)

      Tetracycline-induced expression of wild-type and mutant BirA*-RAB18 fusion proteins in the stable HEK293 cell lines was quantified by densitometry (Figure 2).

      Figure 2

      For the HEK293 BioID experiments, tetracycline dosage was adjusted to ensure comparable expression levels. We will include these data in supplemental material in an updated version of our manuscript.

      The localization of wild-type and mutant forms of RAB18 in HEK293 cells is somewhat different, consistent with previous reports (Ozeki et al. 2005) (Figure 3).

      Figure 3

      We feel that this may reflect the differential localization of ‘active’ and ‘inactive’ RAB18, with wild-type RAB18 corresponding to a mixture of the two. We will include these data in supplemental material in an updated version of our manuscript.

      We acknowledge that the differential localization of wild-type and mutant BirA*-RAB18 might influence the complement of proteins labeled by these constructs. Nevertheless, we feel that the RAB18(S22N):RAB18(WT) ratios are useful since they distinguish a number of previously-identified RAB18-interactors (manuscript, Figure 1B).

      Reply to point 2)

      For the HEK293 dataset, spectral counts were provided in the manuscript, and for the HeLa dataset, LFQ intensities were provided (manuscript, Tables S1 and S2 respectively). However, we did not find that these were useful classifiers for ranking functional interactions when used in isolation.

      The extent of labelling produced in a BioID experiment is not wholly determined by the kinetics of protein-protein associations. It is also influenced by, for example, protein abundance, the number and location of exposed surface lysine residues, and protein stability over the time course of labelling. We feel that RAB18(S22N):RAB18(WT) and GEF-null:wild-type ratios were helpful in controlling for these factors. Further, our comparative approach was effective in highlighting known RAB18-interactors and in identifying novel ones.
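
      As a rough illustration of the ratio-based ranking discussed here, the sketch below computes mutant:wild-type ratios from spectral counts. The prey names, the counts and the pseudocount of 1 are all hypothetical and are not taken from the manuscript; this is not the authors' analysis pipeline.

```python
def mutant_to_wt_ratios(mutant_counts, wt_counts, pseudocount=1.0):
    """Return {prey: mutant/WT spectral-count ratio}.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for preys that are absent from one of the two datasets.
    """
    preys = set(mutant_counts) | set(wt_counts)
    return {
        prey: (mutant_counts.get(prey, 0) + pseudocount)
        / (wt_counts.get(prey, 0) + pseudocount)
        for prey in preys
    }


# Hypothetical spectral counts for three made-up preys.
wt = {"PREY_A": 40, "PREY_B": 40, "PREY_C": 4}
s22n = {"PREY_A": 4, "PREY_B": 38, "PREY_C": 0}

ratios = mutant_to_wt_ratios(s22n, wt)
# Lowest ratio first: preys whose labelling drops most with the mutant.
ranked = sorted(ratios, key=ratios.get)
```

      In this toy example the abundant prey with a small fold change (PREY_B) ranks last, which is the kind of omission that a ratio-only ranking can produce.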

      We acknowledge that our approach may omit some bona fide functional RAB18-interactions, but would argue that our aims were to augment existing functional RAB18-interaction data and avoid false-positives, rather than to emphasise completeness.

      Reply to point 3)

      We will include representative fluorescence images for the SEC22A, NBAS and ZW10 knockdown experiments in an updated version of our manuscript.

      Unfortunately, a suitable antibody for determining knockdown efficiency of SEC22A at the protein level is not commercially available. We will determine SEC22A knockdown efficiency at the mRNA level using qPCR.

      Reply to point 4)

      Expression levels of wild-type and mutant RAB18 in the stable CHO cell lines generated for this study were determined by Western blotting and found to be comparable (Figure 4).

      Figure 4

      We will include these data in supplemental material in an updated version of our manuscript.

      The levels of [14C]-CE were higher in RAB18(Gln67Leu) cells than in the other cell lines following loading with [14C]-oleate for 24 hours. We will amend the text to make this explicit. Our interpretation of the data is that ‘active’ RAB18 facilitates the mobilization of cholesterol. When cells are loaded with oleate, this promotes generation and storage of CE. Conversely, when cells are treated with HDL, it promotes more rapid efflux.

      Our new data implicating RAB18 in the mobilization of lathosterol supports our interpretation of our loading and efflux experiments. In the light of our new data showing that de novo cholesterol biosynthesis is impaired when RAB18 is absent or dysregulated, it will be interesting to determine whether cholesterol synthesis is increased in the RAB18(Gln67Leu) cells.

      Response to Reviewer #2

      Reply to point 1)

      We anticipate that the approach of comparative proximity biotinylation in GEF-null and wild-type cell lines will be broadly useful in small GTPase research.

      While RAB18 has previously been implicated in regulating membrane contacts, the identification of SEC22A as a RAB18-interactor adds to the previous model for their assembly.

      While ORP2 and INPP5B have previously been shown to mediate cholesterol mobilization, the novel finding that they both interact with RAB18 adds to this work. We argue that RAB18-ORP2-INPP5B functions in an analogous manner to ARF1-OSBP-SAC1 in mediating sterol exchange. The broad Rab-binding specificity of multiple OSBP-homologs, and that of multiple phosphoinositide phosphatase enzymes, suggests that this may be a common conserved relationship.

      Our new data indicating that RAB18 coordinates generation of sterol intermediates by EBP and their delivery to post-EBP biosynthetic enzymes reveals a new role for Rab proteins in lipid biogenesis. Most importantly, our new findings that RAB18 deficiency is associated with impaired cholesterol biogenesis suggest that Warburg Micro syndrome is a cholesterol biogenesis disorder. Further, it may be amenable to therapeutic intervention.

      Reply to point 2)

      Recognising that the effect of RAB18 on cholesterol esterification and efflux could arise from various causes, we previously carried out Western blotting of the CHO cell lines for ABCA1 to determine whether this protein was involved (Figure 5).

      Figure 5

      Similar levels of ABCA1 expression in these lines suggest it is not. We will include these data in supplemental material in an updated version of our manuscript.

      We feel that our new data implicating RAB18 in lathosterol mobilization provides important insight into its role in cholesterol biogenesis. Further, it supports our previous suggestion that RAB18 mediates cholesterol mobilization.

      Reply to point 3)

      We agree that the established roles for ORP2, TMEM24/C2CD2L and PIP2 at the plasma membrane make this an extremely interesting area for future research; it is one we are actively investigating. However, we respectfully feel that to comprehensively explore the subcellular locations of RAB18-mediated sterol/PIP2 exchange requires another study and is beyond the scope of the present report.

      Response to Reviewer #3

      Reply to point 1)

      The RAB18-SPG20 interaction has already been validated with a co-immunoprecipitation experiment (Gillingham et al. 2014). We will update the text of our manuscript to make this more explicit, but do not feel it is necessary to recapitulate this work.

      We argue in the manuscript that RAB18 may coordinate the assembly of a non-canonical SNARE complex incorporating SEC22A, STX18, BNIP1 and USE1. However, this role may be mediated through prior interaction with the NBAS-RINT1-ZW10 (NRZ) tethering complex and the SM-protein SCFD2 rather than through a direct interaction. We therefore feel that a RAB18-SEC22A interaction may be difficult to validate by conventional means.

      The reciprocal experiments with BioID2(Gly40S)-SEC22A did provide tentative confirmation of the interaction together with evidence that a subset of SEC22A-interactions are attenuated when RAB18 is absent or dysregulated. In the light of our new findings reinforcing a role for RAB18 in sterol mobilization at membrane contact sites, it is interesting that one of these is DHRS7, an enzyme with steroids among its putative substrates.

      Reply to point 2)

      We previously analysed the localization of the BirA*-RAB18 fusion protein in HeLa cells (Figure 6).

      Figure 6

      It shows a reticular staining pattern consistent with the reported localization of RAB18 to the ER (Gerondopoulos et al. 2014; Ozeki et al. 2005). We will include these data in supplemental material in an updated version of our manuscript.

      Heterologous expression of the BirA*-RAB18 fusion protein in HeLa cells identified the interactions between RAB18 and EBP, ORP2 and INPP5B, for which we now have supportive functional evidence. Since the evidence for impaired lathosterol mobilization and cholesterol biosynthesis was derived from experiments on null-cells, in which endogenous protein expression is absent, we feel that rescue experiments are not necessary in the present study. However, such experiments could be highly useful in future studies.

      Reply to point 3)

      Our screening approach did use both a RAB3GAP-null:wild-type comparison (manuscript, Figure 2, Table S2) and also a RAB18(S22N):RAB18(WT) comparison (manuscript, Figure 1, Table S1). Differences should be expected between these datasets, since they used different cell lines and slightly different methodologies. Nevertheless, proteins identified in both datasets included the known RAB18 effectors NBAS, RINT1, ZW10 and SCFD2, and the novel potential effectors CAMSAP1 and FAM134B.

      There is prior evidence for 12 of the 25 RAB3GAP-dependent RAB18 interactions we identified (manuscript, Figure 2D). Among the 6 lipid modifying/mobilizing proteins found exclusively in our HeLa dataset, we previously presented direct evidence for the interaction of RAB18 with TMCO4. We now also have strong functional evidence for its interaction with EBP, ORP2 and INPP5B.

      Reply to point 4)

      It has been reported that knockdown of SEC22B does not affect the size distribution of lipid droplets (Xu et al. 2018, Figure 8H). Nevertheless, we will carry out qPCR experiments to determine whether the SEC22A siRNAs used in our study affect SEC22B expression. We have found that exogenous expression of SEC22A can cause cellular toxicity. Rescue experiments would therefore be difficult to perform.

      Reply to point 5)

      The background fluorescence measured in SPG20-null cells and presented in Figure 4B in the manuscript does not imply that the SPG20 antibody shows significant cross-reactivity. Rather, it reflects the fact that fluorescence intensity is recorded by our Operetta microscope in arbitrary units.

      Figure 7

      Above (Figure 7) is a version of the panel in which fluorescence from staining cells with only the secondary antibody is included (recorded in a previous experiment and expressed as a proportion of total SPG20 fluorescence in this experiment).

      We have found that comparative fluorescence microscopy is more sensitive than immunoblotting. The SPG20 antibody we used to stain the HeLa cells has previously been used in quantitative fluorescence microscopy (Nicholson et al. 2015).

      Furthermore, we showed corresponding, significantly reduced, expression of SPG20 in RAB18- and TBC1D20-null RPE1 cells, using quantitative proteomics (manuscript, Table S3).

      We acknowledge that quantification of SPG20 transcript levels would clarify the level at which it is downregulated and will carry out qPCR experiments accordingly.

      Reply to point 6)

      We interpret both the enhanced CE-synthesis following oleate-loading and the rapid efflux upon incubation with HDL (manuscript, Figure 7A) as resulting from increased cholesterol mobilization. Our new data implicating RAB18 in the mobilization of lathosterol support this interpretation.

      In the [3H]-cholesterol efflux assay (manuscript, Figure 7B) total [3H]-cholesterol loading at t=0 was 156392±8271 for RAB18(WT) cells, 168425±9103 for RAB18(Gln67Leu) cells and 148867±7609 (cpm determined through scintillation counting). Normalizing to total cellular radioactivity assured that differences in loading between replicates did not skew the results.
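
      To illustrate why normalization removes loading differences, the sketch below expresses efflux as the fraction of total recovered radioactivity found in the medium. This is one common convention and is an assumption here, not necessarily the exact calculation used in the manuscript; the cpm values are hypothetical.

```python
def fractional_efflux(media_cpm, cell_cpm):
    """Fraction of total recovered radioactivity that is in the medium."""
    total = media_cpm + cell_cpm
    if total <= 0:
        raise ValueError("total radioactivity must be positive")
    return media_cpm / total


# Two hypothetical replicates (media cpm, cell lysate cpm) that differ in
# total loading by 20% but have the same underlying efflux.
replicates = [(20000.0, 140000.0), (24000.0, 168000.0)]
fractions = [fractional_efflux(media, cells) for media, cells in replicates]
```

      Both replicates give the same fraction even though their total counts differ, so a difference in loading alone does not change the normalized value.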

      The candidate effector likely to directly mediate cholesterol mobilization is ORP2. It has been shown that ORP2 overexpression drives cholesterol to the plasma membrane (Wang et al. 2019). Further, there is evidence for reduced plasma membrane cholesterol in ORP2-null cells (Wang et al. 2019).

      We previously carried out Western blotting of the CHO cell lines for ABCA1 to determine whether this protein was involved in altered efflux (Figure 5, above). Similar levels of ABCA1 expression in these lines suggest it is not. We will include these data in supplemental material in an updated version of our manuscript.

      References

      Gerondopoulos, A., R. N. Bastos, S. Yoshimura, R. Anderson, S. Carpanini, I. Aligianis, M. T. Handley, and F. A. Barr. 2014. 'Rab18 and a Rab18 GEF complex are required for normal ER structure', J Cell Biol, 205: 707-20.

      Gillingham, A. K., R. Sinka, I. L. Torres, K. S. Lilley, and S. Munro. 2014. 'Toward a comprehensive map of the effectors of rab GTPases', Dev Cell, 31: 358-73.

      Nicholson, J. M., J. C. Macedo, A. J. Mattingly, D. Wangsa, J. Camps, V. Lima, A. M. Gomes, S. Doria, T. Ried, E. Logarinho, and D. Cimini. 2015. 'Chromosome mis-segregation and cytokinesis failure in trisomic human cells', eLife, 4.

      Ozeki, S., J. Cheng, K. Tauchi-Sato, N. Hatano, H. Taniguchi, and T. Fujimoto. 2005. 'Rab18 localizes to lipid droplets and induces their close apposition to the endoplasmic reticulum-derived membrane', J Cell Sci, 118: 2601-11.

      Wang, H., Q. Ma, Y. Qi, J. Dong, X. Du, J. Rae, J. Wang, W. F. Wu, A. J. Brown, R. G. Parton, J. W. Wu, and H. Yang. 2019. 'ORP2 Delivers Cholesterol to the Plasma Membrane in Exchange for Phosphatidylinositol 4, 5-Bisphosphate (PI(4,5)P2)', Mol Cell, 73: 458-73 e7.

      Xu, D., Y. Li, L. Wu, Y. Li, D. Zhao, J. Yu, T. Huang, C. Ferguson, R. G. Parton, H. Yang, and P. Li. 2018. 'Rab18 promotes lipid droplet (LD) growth by tethering the ER to LDs through SNARE and NRZ interactions', J Cell Biol, 217: 975-95.

    2. Reviewer #3:

      This study by Kiss and colleagues reports the findings of proximity biotinylation experiments for the discovery of novel RAB18 effectors. The authors perform careful proteomic analysis that appears well-controlled and successful in recapitulating known interactions. That small GTPase interactions can be identified with this approach has been previously demonstrated, though the application of this approach to RAB18 is novel and of interest to the field. A number of intriguing findings with potentially important implications are reported. However, this manuscript has several weaknesses.

      Major concerns and questions:

      1) As the authors report, proximity biotinylation may not reflect direct protein-protein interactions but simply colocalization of bait and prey proteins. A true protein-protein interaction ideally would be further supported by ancillary experiments such as in vitro binding or co-immunoprecipitation, including an assessment of whether the interaction is affected by the GTP- or GDP-bound state. While co-IP in WT and GEF-deficient cells was performed for 1 candidate interactor (TMCO4, Figure 6C), protein-protein interactions were not tested for the other 2, with the latter relying on either repeat BioID (SPG20, Figure 3A) or reciprocal BioID (SEC22A, Figure 5B).

      2) Putative RAB18 interactions may be affected by the BioID fusion itself or by heterologous expression. While it is reassuring that known interactors were detected with this approach, the conclusions would be better supported by testing the localization of the fusion protein in comparison to endogenous RAB18, and/or by rescue of a phenotype associated with RAB18-deficiency.

      3) Conclusions about the dependence of RAB18 interactions on its GTP or GDP-bound state rely on differences observed in cells with deficiency of RAB18 GEFs. It is certainly possible however that RAB3GAP may serve as a GEF for other GTPases, or have other functions, that cause the observed differences in labeling. The conclusions would be strengthened by additional experiments showing a direct effect - e.g. reproducing the disrupted labeling of candidate effectors with a GDP-locked RAB18 point mutant, or showing that RAB3GAP deficiency reduces binding of a candidate effector to RAB18.

      4) The putative role of SEC22A in regulating lipid droplet morphology relies on siRNA perturbations that are prone to off-target effects. This is especially concerning given the high degree of sequence similarity between SEC22A and SEC22B, the latter of which has a known role in regulating LD morphology. Rescue of this phenotype with a siRNA-resistant SEC22A cDNA would rule out this possibility.

      5) The finding of SPG20 protein abundance being affected by RAB18-deficiency relies on immunofluorescence with an antibody exhibiting cross-reactivity. While the authors do attempt to adjust for this non-specific background fluorescence, this conclusion would be strengthened by immunoblotting for a change in abundance of the specific band corresponding to SPG20. If confirmed, measurement of SPG20 transcript levels would also help clarify the level of regulation for the altered protein abundance.

      6) The influence of stable expression of a RAB18 GTP-locked point mutant on cholesterol metabolism is intriguing but the experimentation appears perfunctory. For 14C-CE cellular levels in 14C-oleate-loaded cells (Figure 7A), the most striking difference is the greatly enhanced synthesis level of CE at t=0. Is the subsequent drop due to an effect of RAB18 on efflux, or simply a consequence of the higher starting level at t=0? For efflux assays on 3H-cholesterol-loaded cells (Figure 7B), the data is only presented as a ratio of 3H activity in media relative to lysates after a 5 hr incubation with HDL. Interpretation of these results would be aided by a more detailed analysis. How does 3H-cholesterol uptake compare after 24 hr incubation but prior to addition of HDL (t=0)? After the 5 hr HDL chase, are the differences in the ratio driven by an increase in extracellular activity, a decrease in intracellular activity, or both? Ultimately these conclusions would be better supported by a more detailed analysis. Does disruption of the candidate effectors phenocopy the effect of RAB18 disruption? Are any known mediators of cholesterol efflux affected by RAB18 disruption? While a comprehensive mechanism may be reasonably considered beyond the scope of this paper, some additional descriptive analysis would be useful in interpreting these findings.

    3. Reviewer #2:

      This study used WT and mutant RAB18 to look for interacting proteins in normal and GEF-deficient cells. A catalog of interactions that are regulated by nucleotide binding and/or GEF activity were uncovered. Among identified proteins, there are known/established ones and there are some new ones. Initial validation was carried out for some newly identified effectors such as TMCO4 and Sec22A.

      Major concerns and questions:

      1) While the addition of new RAB18 effectors is useful to researchers who are interested in RAB18, the overall conclusion that RAB18 may regulate membrane contacts and lipid metabolism is not new.

      2) Figure 7: the effect of RAB18 on cholesterol esterification and efflux may arise from multiple causes. This set of experiments does not provide any real insights into RAB18's role in cholesterol metabolism.

      3) Given RAB18's interaction with ORP2, TMEM24 and OCRL, perhaps the authors may examine plasma membrane PIP2. The results would be more specific and novel.

    4. Reviewer #1:

      This manuscript used proximity biotinylation to discriminate functional RAB18 interactions. The authors provide some evidence for several of the interactors and some functional data supporting a role for RAB18 in modulating cholesterol mobilization.

      Major concerns and questions:

      1) Based on the spectral counts, the authors calculated mutant:WT ratios as a readout to identify nucleotide-binding-dependent effectors. But it is important to show that the WT and mutant proteins have similar expression levels to begin with. The intracellular localization of the mutant and WT proteins should also be determined. Do they show similar intracellular localization?

      2) The mutant:WT ratio is useful to remove some background. But this may omit some very highly interacting proteins just because their fold change is low. The converse is true for rare proteins. It would be better to have a list of candidate effectors based on the absolute counts.

      3) Sec22A knockdown will change the morphology of lipid droplets. A knockdown efficiency test and some representative fluorescence images here would make this data more compelling.

      4) Same comment for the cholesterol mobilization experiment. Expression level of the protein is needed. Figure 7A is rather confusing, as the Gln67Leu mutation already has higher CE before loading HDL. Why is this? Better uptake or reduced efflux? What is the de novo cholesterol synthesis activity in this cell line?

    5. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to Version 1 of the preprint: https://www.biorxiv.org/content/10.1101/871517v1

      Summary:

      As a possible path to better understand and develop treatments for Warburg Micro Syndrome (WMS), the authors have investigated the networks of protein-protein interactions involving genes mutated in this rare genetic disease. The goals of the work are to identify new proteins involved in the pathophysiology of the disease and to better understand the molecular and cellular effects of disease-causing mutations. The data will likely be of interest to researchers studying WMS and RAB18, the protein focused on here, but reviewers expressed some concerns about the validation and interpretation of the presented protein interaction data.

    1. Author Response

      Reviewer #1:

      This paper addresses the very interesting topic of genome evolution in asexual animals. While the topic and questions are of interest, and I applaud the general goal of a large-scale comparative approach to the questions, there are limitations in the data analyzed. Most importantly, as the authors raise numerous times in the paper, questions about genome evolution following transitions to asexuality inherently require lineage-specific controls, i.e. paired sexual species to compare with the asexual lineages. Yet such data are currently lacking for most of the taxa examined, leaving a major gap in the ability to draw important conclusions here. I also do not think the main positive results, such as the role of hybridization and ploidy on the retention and amount of heterozygosity, are novel or surprising.

      We agree with the reviewer that having the sexual outgroups would improve the interpretations; this is one of the points we make in our manuscript. Importantly however, all previous genome studies of asexual species focus on individual asexual lineages, generally without sexual species for comparison. Yet reported genome features have been interpreted as consequences of asexuality (e.g., Flot et al. 2013). By analysing and comparing these genomes, we can show that these features are in fact lineage-specific rather than general consequences of asexuality. Unexpectedly, we find that asexuals that are not of hybrid origin are largely homozygous, independently of the cellular mechanism underlying asexuality. This contrasts with the general view that cellular mechanisms such as central fusion (which facilitates heterozygosity retention between generations) promote the evolutionary success of asexual lineages relative to mechanisms such as gamete duplication (which generate complete homozygosity) by delaying the expression of the recessive load. We also do not observe the expected relationship between cellular mechanism of asexuality and heterozygosity retention in species of hybrid origin. Thus we respectfully disagree that our results are not surprising. Reviewer #2 found our results “interesting” and a “potentially important contribution”, and reviewer #3 wrote that we “call into question the generality of the theoretical expectations, and suggest that the genomic impacts of asexuality may be more complicated than previously thought”.

      We also make it very clear that some of the patterns we uncover (e.g. low TE loads in asexual species) cannot be clearly evaluated with asexuals alone. Our study emphasizes that asexuality is a lineage-level trait and that comparative analyses using asexuals require lineage-level replication in addition to comparisons to sexual species.

      References

      Flot, Jean-François, et al. "Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga." Nature 500.7463 (2013): 453-457.

      Reviewer #2:

      [...] Major Issues and Questions:

      1) The authors choose to refer to asexuality when describing thelytokous parthenogenesis. Asexuality is a very general term that can be confusing: fission, vegetative reproduction could also be considered asexuality. I suggest using parthenogenesis throughout the manuscript for the different animal clades studied here. Moreover, in thelytokous parthenogenesis meiosis can still occur to form the gametes, it is therefore not correct to write that "gamete production via meiosis... no longer take place" (lines 57-58). Fertilization by sperm indeed does not seem to take place (except during hybridogenesis, a special form of parthenogenesis).

      We will clarify more explicitly what asexuality refers to in our manuscript. Notably our study does not include species that produce gametes which are fertilized (which is the case under hybridogenesis, which sensu stricto is not a form of parthenogenesis). Even though many forms of parthenogenesis do indeed involve meiosis (something we explain in much detail in box 2), there is no production of gametes.

      2) The cellular mechanisms of asexuality in many asexual lineages are known through only a few, old cytological studies and could be inaccurate or incomplete (for example Triantaphyllou paper of 1981 of Meloidogyne nematodes or Hsu, 1956 for bdelloid rotifers). The authors should therefore mention in the introduction the lack of detailed and accurate cellular and genetic studies to describe the mode of reproduction because it may change the final conclusion.

      For example, for bdelloid rotifers the literature is scarce. However the authors refer in Supp Table 1 to two articles that did not contain any cytological data on oogenesis in bdelloid rotifers to indicate that A. vaga and A. ricciae use apomixis as reproductive mode. Welch and Meselson studied the karyotypes of bdelloid rotifers, including A. vaga, and did not conclude anything about absence or presence of chromosome homology and therefore nothing can be said about their reproduction mode. In the article of Welch and Meselson the nuclear DNA content of bdelloid species is measured but without any link with the reproduction mode. The only paper referring to apomixis in bdelloids is from Hsu (1956) but it is old and new cytological data with modern technology should be obtained.

      We will correct the rotifer citations and thank the reviewer for picking up the error. We agree that there are uncertainties in some cytological studies, but the same is true for genomic studies (which is why we base our analyses as much as possible on raw reads rather than assemblies because the latter may be incorrect). We in fact excluded cytological studies where the findings could not be corroborated. For example, we discarded the evidence for meiosis and diploidy by Handoo et al. 2004 for its incompatibility with genomic data because this study does not provide any verifiable evidence (there are no data or images, only descriptions of observations). We provide all the references in the supplementary material concerning the cytological evidence used.

      3) In the section on Heterozygosity, the authors compute heterozygosity from kmer spectra analysis from reads to "avoid biases from variable genome assembly qualities" (page 16). But such kmer analysis can be biased by the quality and coverage of sequencing reads. While such analyses are a legitimate tool for heterozygosity measurements, this argument (the bias of genome quality) is not convincing and the authors should describe the potential limits of using kmer spectra analyses.

      We excluded all the samples with unsuitable quality of data (e.g. one tardigrade species with excessive contamination or the water flea samples for insufficient coverage), and T. Rhyker Ranallo Benavidez, the author of the method we used, collaborated with us on the heterozygosity analyses. However, we will clarify the limitations of the method for species with extremely low or high heterozygosity (see also comment 5 of this reviewer).

      4) The authors state that heterozygosity levels “should decay over time for most forms of meiotic asexuality". This is incorrect, as this is not expected with "central fusion" or with "central fusion automixis equivalent" where there is no cytokinesis at meiosis I.

      Our statement is correct. Note that we say “most” and not “all” because certain forms of endoduplication in F1 hybrids result in the maintenance of heterozygosity. Central fusion is expected to fully retain heterozygosity only if recombination is completely suppressed (see for example Suomalainen et al. 1987 or Engelstädter 2017).

      5) I do not fully agree with the authors’ statement that: "In spite of the prediction that the cellular mechanism of asexuality should affect heterozygosity, it appears to have no detectable effect on heterozygosity levels once we control for the effect of hybrid origins (Figure 2)." (page 17)

      The scaling on Figure 2 is emphasizing high values, while low values are not clearly separated. By zooming in on the smaller heterozygosity % values we may observe a bigger difference between the "asexuality mechanisms". I do not see how asexuality mechanism was controlled for, and if you look closely at intra group heterozygosity, variability is sometimes high.

      It is expected that hybrid origin leads to higher heterozygosity levels but saying that asexuality mechanism is not important is surprising: on Figure 2 the orange (central fusion) is always higher than yellow (gamete duplication).

      As we explain in detail in the text, the three comparatively high heterozygosity values under spontaneous origins of asexuality (“orange” points in the bottom left corner of the figure) are found in a clone of the Cape bee that is only 40 years old. Among species of hybrid origin, we see no correlation between asexuality mechanism and heterozygosity. These observations suggest that the asexuality mechanism may have an impact on genome-wide heterozygosity in recent incipient asexual lineages, but not in established asexual lineages.

      Also, the variability found within rotifers could be an argument against a strong importance of asexuality origin on heterozygosity levels: the four bdelloid species likely share the same origin but their allelic heterozygosity levels appear to range from almost 0 to almost 6% (Fig 2 and 3, however the heterozygosity data on Rotaria should be confirmed, see below).

      We prefer not using the data from rotifers for making such arguments, given the large uncertainty with respect to genome features in this group (including the possibility of octoploidy in some species which we describe in the supplemental information). One could even argue that the highly variable genome structure among rotifer species could indicate repeated transitions to asexuality and/or different hybridization events, but the available genome data would make all these arguments highly speculative.

      The authors’ main idea (i.e. asexuality origin is key) seems mostly true when using homoeolog heterozygosity and/or composite heterozygosity, which is not what most readers will usually think of as "heterozygosity". This should be made clear by the authors mostly because this kind of heterozygosity does not necessarily undergo the same mechanism as the one described in Box 2 for allelic heterozygosity. If homoeolog heterozygosity is sometimes not distinguishable from allelic heterozygosity, then it would be nice to have another box showing the mechanisms and evolution pattern for such cases (like a true tetraploid, in which all copies exist).

      The heterozygosity between homoeologs is always high in this study while it appears low between alleles, but since the heterozygosity between homeologs can only be measured when there is a hybrid origin, the only heterozygosity that can be compared between ALL the asexual groups is the one between alleles.

      By definition, homoeologs have diverged between species, while alleles have diverged within species. So indeed divergence between homoeologs will generally exceed divergence between alleles. We will consider adding expected patterns in perfect tetraploid species for Box 2.

      Both in the results and the conclusion the authors should not over interpret the results on heterozygosity. The variation in allelic heterozygosity could be small (although not in all asexuals studied) also due to the age of the asexual lineages. This is not mentioned here in the result/discussion section.

      We explain in section Overview of species and genomes studied that age effects are important but that we do not consider them quantitatively because age estimates are not available for the majority of asexual species in our paper.

      6) Regarding the section on Heterozygosity structure in polyploids

      There is inconsistency in many of the numbers. For example, A. vaga heterozygosity is estimated at 1.42% in Figure 1, but then appears to show up around 2% in Figure 2, and then becomes 2.4% on page 20. It is unclear if this is an error or the result of different methods.

      It is also unclear how homologs were distinguished from homoeologs. How are 21 bp k-mers considered homologous? In the Methods section, the authors describe extracting unique k-mer pairs differing by one SNP, so does this mean that no more than one SNP was allowed to define heterozygous homologous regions? Does this mean that homologues (and certainly homoeologs) differing by more than 5% would not be retrieved by this method? If so, then it is not surprising that A. vaga is classified as a diploid.

      Figure 1a presents the values reported in the original genome studies, not our results. This is explained in the corresponding figure legend. Hence, 1.42% is the value reported by Flot et al. 2013, and 2.4% is the value we measure, which is consistent in Figures 2 and 3.

      We used k-mer pairs differing by one SNP to estimate ploidy (Smudgeplot). The heterozygosity estimates were obtained from kmer spectra (GenomeScope 2.0). The kmers that are found in 1n must be heterozygous between homologs, as the homoeolog heterozygosity would produce 2n kmers. We used the kmer approach to estimate heterozygosity in all cases other than the homoeologs of rotifers, for which the estimates were derived directly from the assemblies. We explain this in the legend to Figure 3, but we will add the information also to the Methods section for clarification.
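
      To make the notion of "k-mer pairs differing by one SNP" concrete, the following is a minimal Python sketch. It is not the Smudgeplot implementation: the function names are hypothetical, and it ignores canonical (reverse-complement) k-mers and k-mer coverage, both of which a real analysis would need to handle.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Yield every k-mer of a sequence (illustrative helper)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def one_snp_pairs(kmer_collection):
    """Return sorted pairs of distinct k-mers that differ at exactly one position."""
    buckets = defaultdict(list)
    for km in set(kmer_collection):
        for i in range(len(km)):
            # Mask position i: k-mers sharing a masked form differ only at i.
            buckets[(i, km[:i] + km[i + 1:])].append(km)
    pairs = set()
    for group in buckets.values():
        for a, b in combinations(sorted(group), 2):
            pairs.add((a, b))
    return sorted(pairs)


# Toy example: two k-mers differing at one base, plus an unrelated k-mer.
print(one_snp_pairs({"ACGTA", "ACCTA", "TTTTT"}))
```

      Under this definition, two haplotypes that differ at several positions within the same k-mer window would never form a pair, which is the concern raised by the reviewer about highly divergent copies.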

      The result for A. ricciae is surprising and I am still not convinced by the octoploid hypothesis. In Fig. S2 there is a first peak at 71x coverage that still could be mostly contaminants. It would be helpful to check the GC distribution of k-mers in the first haploid peak of A. ricciae to check whether there are contaminants. The karyotypes of 12 chromosomes indeed do not fit the octoploid hypothesis. I am also surprised by the 5.5% divergence calculated for A. ricciae; this value should be checked when eliminating potential contaminants (if any). In general, these kinds of ambiguities will not be resolved without long-read sequencing technology to improve the genome assemblies of asexual lineages.

      We understand the scepticism of the reviewer regarding the octoploidy hypothesis, but it is important to note that we clearly present it as a possible explanation for the data that needs to be corroborated, i.e., we state that the data are more consistent with octo- than tetraploidy. Contamination seems quite unlikely, as the 71.1x peak represents nearly exactly half the coverage of the otherwise haploid peak (142x). Furthermore, the Smudgeplot analysis shows that some of the kmers from the 71x peak pair with genomic kmers of the main peaks. We also performed KAT analysis (not presented in the manuscript) showing that these kmers are also represented in the decontaminated assembly. We will add this clarification regarding possible contamination to the supplementary materials.
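
      The GC check suggested here could be sketched as follows. This is an illustrative Python example only: the function names and the binning scheme are assumptions, and extracting the k-mers that fall within a given coverage peak is left out.

```python
def gc_fraction(kmer):
    """Fraction of G/C bases in a k-mer."""
    return sum(base in "GC" for base in kmer.upper()) / len(kmer)


def gc_histogram(kmer_list, bins=5):
    """Count k-mers per GC bin; bins split the range [0, 1] evenly."""
    counts = [0] * bins
    for km in kmer_list:
        # A GC fraction of exactly 1.0 is placed in the last bin.
        idx = min(int(gc_fraction(km) * bins), bins - 1)
        counts[idx] += 1
    return counts


print(gc_histogram(["AAAA", "ATGC", "GGCC"], bins=2))
```

      A GC histogram for the low-coverage peak that differs clearly from that of the main peaks would point towards contaminants, whereas similar histograms would not by themselves rule them out.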
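
      The coverage argument above is simple arithmetic, shown here as a short Python sketch. The two peak values come from the response; the 5% tolerance is an arbitrary assumption chosen for illustration.

```python
def peak_ratio(minor_cov, major_cov):
    """Ratio of the lower-coverage peak to the higher-coverage peak."""
    return minor_cov / major_cov


def is_half_coverage(minor_cov, major_cov, tol=0.05):
    """True if the minor peak sits at half the major peak, within tol (assumed)."""
    return abs(peak_ratio(minor_cov, major_cov) - 0.5) <= tol


ratio = peak_ratio(71.1, 142.0)
print(round(ratio, 3), is_half_coverage(71.1, 142.0))
```

      A contaminant would have no particular reason to sit at such a ratio, whereas an additional genomic copy-number class would.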

      7) Regarding the section on palindromes and gene conversion

      The authors screened all the published genomes for palindromes, including small blocks, to provide a more robust unbiased view. However, the result will be unbiased and robust only if all the genomes compared were assembled using the same sequencing data (quality, coverage) and assembly program. While palindromes appear not to play a major role in the genome evolution of parthenogenetic animals since only few palindromes were detected among all lineages, mitotic (and meiotic) gene conversion is likely to take place in parthenogens and should indeed be studied among all the clades.

      We agree with the reviewer that gene conversion might be one of the key aspects of asexual genome evolution. Our study merely pointed out that genomes of asexual animals do not show organisation in palindromes, indicating that palindromes might not be of general importance in asexual genome evolution. Note also that we clearly point out that these analyses are biased by the quality of the available genome assemblies.

      8) Regarding the section on transposable elements

      The authors are aware that the approach used may underestimate the TEs present in low copy numbers, therefore the comparison might underestimate the TE numbers in certain asexual groups.

      Yes. We clearly explain this limitation in the manuscript. The currently available alternatives are based on assembled genomes, so the results are biased by the quality of the assemblies (and similarities to TEs in public databases) and our aim was to broadly compare genomes in the absence of assembly-generated biases.

      9) Regarding the section on horizontal gene transfer. For the HGTc analysis, annotated genes were compared to the UniRef90 database to identify non-metazoan genes and HGT candidates were confirmed if they were on a scaffold containing at least one gene of metazoan origin. While this method is indeed interesting, it is also biased by the annotation quality and the length of the scaffolds which vary strongly between studies.
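
      The scaffold-based confirmation step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the input format (tuples of gene ID, scaffold ID and inferred origin) and the function name are hypothetical, and the UniRef90 comparison that assigns origins is not shown.

```python
def confirm_hgt_candidates(genes):
    """Keep non-metazoan genes located on scaffolds with at least one metazoan gene.

    genes: iterable of (gene_id, scaffold_id, origin), where origin is
    'metazoan' or 'non-metazoan' (hypothetical input format).
    """
    genes = list(genes)
    metazoan_scaffolds = {scaf for _, scaf, origin in genes if origin == "metazoan"}
    return [
        gene_id
        for gene_id, scaf, origin in genes
        if origin == "non-metazoan" and scaf in metazoan_scaffolds
    ]


example = [
    ("g1", "scaffold_1", "metazoan"),
    ("g2", "scaffold_1", "non-metazoan"),  # confirmed: shares a scaffold with g1
    ("g3", "scaffold_2", "non-metazoan"),  # rejected: no metazoan gene on scaffold_2
]
print(confirm_hgt_candidates(example))
```

      The sketch makes the stated bias visible: with short scaffolds, a genuine candidate such as g3 is rejected simply because no metazoan gene happens to lie on the same scaffold.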

      Yes, this is true and we explain many limitations in the supplemental information, but re-assembling and re-annotating all these genomes would be beyond reasonable computational possibilities.

      10) Regarding the use of GenomeScope2.0

      When homologues are very divergent (as observed in bdelloid rotifers) GenomeScope probably considers these distinct haplotypes as errors, making it difficult to model the haploid genome size and giving a high peak of errors in the GenomeScope profile. Moreover, due to the very divergent copies in A. vaga, GenomeScope indeed provides a diploid genome (instead of tetraploid).

      For A. vaga, the heterozygosity estimated by GenomeScope 2.0 on our new sequencing dataset is 2% (as shown in this paper). This percentage corresponds to the heterozygosity between k-mers but does not provide any information on the heterogeneity in heterozygosity measurements along the genome. A limitation of GenomeScope 2.0 (which the authors should mention here) is that it assumes that the entire genome follows the same theoretical k-mer distribution.

      The model of estimating genome wide heterozygosity indeed assumes a random distribution of heterozygous loci and indeed is unable to estimate divergence over a certain threshold, which is the reason why we used genome assemblies for the estimation of divergence of homoeologs. Regarding estimates in all other genomes, the assumptions are unlikely to fundamentally change the output of the analysis. GenomeScope2 is described in detail in a recent paper (Ranallo-Benavidez et al. 2019), where the assumption that heterozygosity rates are constant across the genome is explicitly mentioned.

      References

      Engelstädter, Jan. "Asexual but not clonal: evolutionary processes in automictic populations." Genetics 206.2 (2017): 993-1009.

      Flot, Jean-François, et al. "Genomic evidence for ameiotic evolution in the bdelloid rotifer Adineta vaga." Nature 500.7463 (2013): 453-457.

      Handoo, Z. A., et al. "Morphological, molecular, and differential-host characterization of Meloidogyne floridensis n. sp.(Nematoda: Meloidogynidae), a root-knot nematode parasitizing peach in Florida." Journal of nematology 36.1 (2004): 20.

      Suomalainen, Esko, Anssi Saura, and Juhani Lokki. Cytology and evolution in parthenogenesis. CRC Press, 1987.

      Ranallo-Benavidez, Timothy Rhyker, Kamil S. Jaron, and Michael C. Schatz. "GenomeScope 2.0 and Smudgeplots: Reference-free profiling of polyploid genomes." BioRxiv (2019): 747568. 

      Reviewer #3:

      Jaron and collaborators provide a large-scale comparative work on the genomic impact of asexuality in animals. By analysing 26 published genomes with a unique bioinformatic pipeline, they conclude that none of the expected features due to the transition to asexuality is replicated across a majority of the species. Their findings call into question the generality of the theoretical expectations, and suggest that the genomic impacts of asexuality may be more complicated than previously thought.

      The major strengths of this work are (i) the comparison among various modes and origins of asexuality across 18 independent transitions; and (ii) the development of a bioinformatic pipeline directly based on raw reads, which limits the biases associated with genome assembly. Moreover, I would like to acknowledge the effort made by the authors to provide on public servers detailed methods which allow the analyses to be reproduced. That being said, I also have a series of concerns, listed below:

      We thank this reviewer for the relevant comments and for providing many constructive suggestions in the points below. We will take them into account for our final version of the manuscript.

      1) Theoretical expectations

      As far as I understand, the aim of this work is to test whether 4 classical predictions associated with the transition to asexuality and 5 additional features observed in individual asexual lineages hold at a large phylogenetic scale. However, I think that these predictions are poorly presented, and so they may be hard for non-expert readers to understand. Some of them are briefly mentioned in a descriptive way in the Introduction (L56 - 61), and in a little more detail in Boxes 1 and 2. However, the evolutionary reasons why one should expect these features to occur (and under which assumptions) are not clearly stated anywhere in the Introduction (but only briefly in the Results & Discussion). I think it is important that the authors provide clear-cut quantitative expectations for each genomic feature analysed and under each asexuality origin and mode (Box 1 and 2). Also highlighting the assumptions behind these expectations will help for a better interpretation of the observed patterns.

      We will clarify the expectations for non-expert readers.

      2) Mutation accumulation & positive selection

      A subtlety which is not sufficiently emphasized to my mind is that the different modes of asexuality encompass reproduction with or without recombination (Box 2), which can lead to very different genetic outcomes. For example, it has been shown that the Muller's ratchet (the accumulation of deleterious mutations in asexual populations) can be stopped by small amounts of recombination in large-sized populations (Charlesworth et al. 1993; 10.1017/S0016672300031086). Similarly a new recessive beneficial mutation can only segregate at a heterozygous state in a clonal lineage (unless a second mutation hits the same locus); whereas in the presence of recombination, these mutations will rapidly fix in the population by the formation of homozygous mutants (Haldane's Sieve, Haldane 1927; 10.1017/S0305004100015644). Therefore, depending on whether recombination occurs or not during asexual reproduction, the expectations may be quite different; and so they could deviate from the "classical predictions". In this regard, I would like to see the authors adjust their conclusions. Moreover, it is also not very clear whether the species analysed here are 100% asexuals or if they sometimes go through transitory sexual phases, which could reset some of the genomic effects of asexuality.

      Yes, the predictions regarding the efficiency of selection are indeed influenced by cellular modes of asexuality. Adding some details or at least a good reference would certainly increase the readability of the section. We thank the reviewer for this suggestion.

      3) Transposable elements

      I found the predictions regarding the amount of TEs expected under asexuality quite ambiguous. On the one hand, TEs are expected not to spread because they cannot colonize new genomes (Hickey 1982); but on the other hand, TEs can be viewed as any deleterious mutation that will accumulate in asexual genomes due to Muller's ratchet. The argument provided by the authors to justify the expectation of low TE load in asexual lineages is that "Only asexual lineages without active TEs, or with efficient TE suppression mechanisms, would be able to persist over evolutionary timescales". But this argument should then equally be applied to any other type of deleterious mutation, and so we would not be able to see Muller's ratchet in the first place. Therefore, not observing the expected pattern for TEs in the genomic data is not so surprising as the expectation itself does not seem to be very robust. I would like the authors to better acknowledge this issue, which actually supports their general idea that the genomic consequences of asexuality are not so simple.

      Indeed, the survivorship bias should affect all genomic features. Nothing that is incompatible with the viability of the species will ever be observed in nature. Perhaps the difference between Muller’s ratchet and the dynamics of accumulation of transposable elements (TEs) is that TEs are expected to either propagate very fast or not at all (Dolgin and Charlesworth 2006), while the effects of Muller’s ratchet are expected to vary among different populations and cellular mechanisms of asexuality. We will rephrase the text to better reflect the complexity of the predicted consequences of TE dynamics.

      4) Heterozygosity

      Due to the absence of recombination, asexual populations are expected to maintain a high level of diversity at each single locus (heterozygosity), but a low number of different haplotypes. However, as presented by the authors in the Box 2, there are different modes of parthenogenesis with different outcomes regarding heterozygosity: (1) preservation at all loci; (2) reduction or loss at all loci; (3) reduction depending on the chromosomal position relative to the centromere (distal or proximal). Therefore, the authors could benefit from their genome-based dataset to explore in more detail the distribution of heterozygosity along the chromosomes, and further test whether it fits with the above predictions. If the differing quality of the genome assemblies is an issue, the authors could at least provide the variance of the heterozygosity across the genome. The mode #3 (i.e. central fusions and terminal fusions) would be particularly interesting as one would then be able to compare, within the same genome, regions with large excess vs. deficit of heterozygosity and assess their evolutive impacts.

      Moreover, the authors should put more emphasis on the fact that using a single genome per species is a limitation to test the subtle effects of asexuality on heterozygosity (and also on "mutation accumulation & positive selection"). These effects are better detected using population-based methods (i.e. with many individuals, but not necessarily many loci). For example, the FIS value of a given locus is negative when its heterozygosity is higher than expected under random mating, and positive when the reverse is true (Wright 1951; 10.1111/j.1469-1809.1949.tb02451.x).
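
      As an illustration of the sign convention described above, FIS at a single biallelic locus can be computed as 1 - H_obs / H_exp, where H_exp = 2pq is the heterozygosity expected under random mating. The Python sketch below assumes genotype counts for one locus as input; the function name and input format are hypothetical.

```python
def f_is(n_hom_ref, n_het, n_hom_alt):
    """F_IS = 1 - H_obs / H_exp for one biallelic locus, from genotype counts."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # frequency of the reference allele
    h_exp = 2 * p * (1 - p)                # expected under random mating
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F_IS is undefined")
    h_obs = n_het / n
    return 1 - h_obs / h_exp


print(f_is(0, 10, 0))    # all individuals heterozygous: negative F_IS
print(f_is(25, 50, 25))  # Hardy-Weinberg proportions: F_IS of zero
print(f_is(5, 0, 5))     # no heterozygotes: positive F_IS
```

      A strictly clonal lineage in which every individual retains the same heterozygous genotype corresponds to the first case, which is why population samples rather than single genomes are needed to detect this signal.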

      We agree with the reviewer that the analysis of the distribution of heterozygosity along the chromosomes would be very interesting. However, the necessary data is available only for the Cape honey bee, and its analysis has been published by Smith et al. 2018. Calculating the probability distribution of heterozygosities would be possible, but it would require SNP calling for each of the datasets. Such an analysis would be computationally intensive and prone to biases by the quality of the genome assemblies.

      5) Absence of sexual lineages

      A second limit of this work is the absence of sexual lineages to use as references in order to control for lineage-specific effects. I do not agree with the authors when they say that "the theoretical predictions pertaining to mutation accumulation, positive selection, gene family expansions, and gene loss are always relative to sexual species [...] and cannot be independently quantified in asexuals." I think that this is true for all the genomic features analysed, because the transition to asexuality is going to affect the genome of asexual lineages relative to their sexual ancestors. This is actually acknowledged at the end of the Conclusion by the authors.

      To give an example, the authors say that "Species with an intraspecific origin of asexuality show low heterozygosity levels (0.03% - 0.83%), while all of the asexual species with a known hybrid origin display high heterozygosity levels (1.73% - 8.5%)". Interpreting these low vs. high heterozygosity values is difficult without having sexual references, because the level of genetic diversity is also heavily influenced by the long term life history strategies of each species (e.g. Romiguier et al. 2014; 10.1038/nature13685).

      I understand that the genomes of related sexual species are not available, which precludes direct comparisons with the asexual species. However, I think that the results could be strengthened if the authors provided for each genomic feature that they tested some estimates from related sexual species. Actually, they partially do so along the Results & Discussion section for the palindromes, transposable elements and horizontal gene transfers. I think that these expectations for sexual species (and others) could be added to Table 1 to facilitate the comparisons.

      Our statement "the theoretical predictions pertaining to mutation accumulation, positive selection, gene family expansions, and gene loss are always relative to sexual species [...] and cannot be independently quantified in asexuals." specifically refers to methodology: analyses to address these predictions require orthologs between sexual and asexual species. We fully agree that in addition to methodological constraints, comparisons to sexual species are also conceptually relevant - which is in fact one of the major points of our paper. We will clarify these points.

      6) Regarding statistics, I acknowledge that the number of species analysed is relatively low (n=26), which may preclude getting any significant results if the effects are weak. However, the authors should then clearly state in the text (and not only in the reporting form) that their analyses are descriptive. Also, their position regarding this issue is not entirely clear as they still performed a statistical test for the effect of asexuality mode / origin on TE load (Figure 2 - supplement 1). Therefore, I would like to see the same statistical test performed on heterozygosity (Figure 2).

      We will unify the sections and add an appropriate test wherever suitable.

      7) As you used 31 individuals from 26 asexual species, I was wondering whether you took advantage of the multi-sample species. For example, were the kmer-based analyses congruent between individuals of the same species?

      Unfortunately, some of the 31 individuals do not have publicly available reads (some of the root-knot nematode datasets are missing), and others do not have sufficient quality (the coverage for some water flea samples is very low). Our analyses were consistent for the few cases where we have multiple datasets available.

      References

      Dolgin, Elie S., and Brian Charlesworth. "The fate of transposable elements in asexual populations." Genetics 174.2 (2006): 817-827.

      Smith, Nicholas MA, et al. "Strikingly high levels of heterozygosity despite 20 years of inbreeding in a clonal honey bee." Journal of evolutionary biology 32.2 (2019): 144-152.

    2. Reviewer #3

      Jaron and collaborators provide a large-scale comparative work on the genomic impact of asexuality in animals. By analysing 26 published genomes with a unique bioinformatic pipeline, they conclude that none of the expected features due to the transition to asexuality is replicated across a majority of the species. Their findings call into question the generality of the theoretical expectations, and suggest that the genomic impacts of asexuality may be more complicated than previously thought.

      The major strengths of this work are (i) the comparison among various modes and origins of asexuality across 18 independent transitions; and (ii) the development of a bioinformatic pipeline directly based on raw reads, which limits the biases associated with genome assembly. Moreover, I would like to acknowledge the effort made by the authors to provide on public servers detailed methods which allow the analyses to be reproduced. That being said, I also have a series of concerns, listed below:

      1) Theoretical expectations.

      As far as I understand, the aim of this work is to test whether 4 classical predictions associated with the transition to asexuality and 5 additional features observed in individual asexual lineages hold at a large phylogenetic scale. However, I think that these predictions are poorly presented, and so they may be hard for non-expert readers to understand. Some of them are briefly mentioned in a descriptive way in the Introduction (L56 - 61), and in a little more detail in Boxes 1 and 2. However, the evolutionary reasons why one should expect these features to occur (and under which assumptions) are not clearly stated anywhere in the Introduction (but only briefly in the Results & Discussion). I think it is important that the authors provide clear-cut quantitative expectations for each genomic feature analysed and under each asexuality origin and mode (Box 1 and 2). Also highlighting the assumptions behind these expectations will help for a better interpretation of the observed patterns.

      2) Mutation accumulation & positive selection.

      A subtlety which is not sufficiently emphasized to my mind is that the different modes of asexuality encompass reproduction with or without recombination (Box 2), which can lead to very different genetic outcomes. For example, it has been shown that the Muller's ratchet (the accumulation of deleterious mutations in asexual populations) can be stopped by small amounts of recombination in large-sized populations (Charlesworth et al. 1993; 10.1017/S0016672300031086). Similarly a new recessive beneficial mutation can only segregate at a heterozygous state in a clonal lineage (unless a second mutation hits the same locus); whereas in the presence of recombination, these mutations will rapidly fix in the population by the formation of homozygous mutants (Haldane's Sieve, Haldane 1927; 10.1017/S0305004100015644). Therefore, depending on whether recombination occurs or not during asexual reproduction, the expectations may be quite different; and so they could deviate from the "classical predictions". In this regard, I would like to see the authors adjust their conclusions. Moreover, it is also not very clear whether the species analysed here are 100% asexuals or if they sometimes go through transitory sexual phases, which could reset some of the genomic effects of asexuality.

      3) Transposable elements.

      I found the predictions regarding the amount of TEs expected under asexuality quite ambiguous. On the one hand, TEs are expected not to spread because they cannot colonize new genomes (Hickey 1982); but on the other hand, TEs can be viewed as any deleterious mutation that will accumulate in asexual genomes due to Muller's ratchet. The argument provided by the authors to justify the expectation of low TE load in asexual lineages is that "Only asexual lineages without active TEs, or with efficient TE suppression mechanisms, would be able to persist over evolutionary timescales". But this argument should then equally be applied to any other type of deleterious mutation, and so we would not be able to see Muller's ratchet in the first place. Therefore, not observing the expected pattern for TEs in the genomic data is not so surprising as the expectation itself does not seem to be very robust. I would like the authors to better acknowledge this issue, which actually supports their general idea that the genomic consequences of asexuality are not so simple.

      4) Heterozygosity.

      Due to the absence of recombination, asexual populations are expected to maintain a high level of diversity at each single locus (heterozygosity), but a low number of different haplotypes. However, as presented by the authors in the Box 2, there are different modes of parthenogenesis with different outcomes regarding heterozygosity: (1) preservation at all loci; (2) reduction or loss at all loci; (3) reduction depending on the chromosomal position relative to the centromere (distal or proximal). Therefore, the authors could benefit from their genome-based dataset to explore in more detail the distribution of heterozygosity along the chromosomes, and further test whether it fits with the above predictions. If the differing quality of the genome assemblies is an issue, the authors could at least provide the variance of the heterozygosity across the genome. The mode #3 (i.e. central fusions and terminal fusions) would be particularly interesting as one would then be able to compare, within the same genome, regions with large excess vs. deficit of heterozygosity and assess their evolutive impacts.

      Moreover, the authors should put more emphasis on the fact that using a single genome per species is a limitation to test the subtle effects of asexuality on heterozygosity (and also on "mutation accumulation & positive selection"). These effects are better detected using population-based methods (i.e. with many individuals, but not necessarily many loci). For example, the FIS value of a given locus is negative when its heterozygosity is higher than expected under random mating, and positive when the reverse is true (Wright 1951; 10.1111/j.1469-1809.1949.tb02451.x).

      5) Absence of sexual lineages.

      A second limit of this work is the absence of sexual lineages to use as references in order to control for lineage-specific effects. I do not agree with the authors when they say that "the theoretical predictions pertaining to mutation accumulation, positive selection, gene family expansions, and gene loss are always relative to sexual species [...] and cannot be independently quantified in asexuals." I think that this is true for all the genomic features analysed, because the transition to asexuality is going to affect the genome of asexual lineages relative to their sexual ancestors. This is actually acknowledged at the end of the Conclusion by the authors.

      To give an example, the authors say that "Species with an intraspecific origin of asexuality show low heterozygosity levels (0.03% - 0.83%), while all of the asexual species with a known hybrid origin display high heterozygosity levels (1.73% - 8.5%)". Interpreting these low vs. high heterozygosity values is difficult without having sexual references, because the level of genetic diversity is also heavily influenced by the long term life history strategies of each species (e.g. Romiguier et al. 2014; 10.1038/nature13685).

      I understand that the genomes of related sexual species are not available, which precludes direct comparisons with the asexual species. However, I think that the results could be strengthened if the authors provided for each genomic feature that they tested some estimates from related sexual species. Actually, they partially do so along the Results & Discussion section for the palindromes, transposable elements and horizontal gene transfers. I think that these expectations for sexual species (and others) could be added to Table 1 to facilitate the comparisons.

      6) Regarding statistics, I acknowledge that the number of species analysed is relatively low (n=26), which may preclude getting any significant results if the effects are weak. However, the authors should then clearly state in the text (and not only in the reporting form) that their analyses are descriptive. Also, their position regarding this issue is not entirely clear as they still performed a statistical test for the effect of asexuality mode / origin on TE load (Figure 2 - supplement 1). Therefore, I would like to see the same statistical test performed on heterozygosity (Figure 2).

      7) As you used 31 individuals from 26 asexual species, I was wondering whether you took advantage of the multi-sample species. For example, were the kmer-based analyses congruent between individuals of the same species?

    3. Reviewer #2

      This paper is interesting because it is studying, through a comparative genomic approach, how asexuality affects genome evolution in animal lineages while focusing on the same features. Such an extensive comparison can, in principle, distinguish the common consequences of asexuality, in contrast to previous studies that focused on few asexual species (or only one). It is interesting that the authors did not find a universal genomic feature of "asexual" species. This is a potentially important contribution to the field of the evolution of reproductive systems.

      However, I am concerned about limitations and potential biases in many of the specific genomic features analysed, and resultant difficulties in drawing any general conclusions from these analyses. For example, the heterozygosity analyses need to be more clearly explained and the potential limits of the methods used discussed further. The use of kmer spectra analyses as opposed to genome assemblies is understandable, but there are biases here that were not discussed. I am also concerned about the impact of low read quality and low coverage genomic data, and whether issues with genome assembly affect the conclusions. There are also issues about conclusions related to species of hybrid origin, as there are numerous "unknown" cases and cytological data are lacking for many of the studied animal groups (therefore the authors should be cautious about the evidence for reproduction mode).

      Ideally, all the genomes of the asexual animal clades studied should have been sequenced and assembled using the same method, which would make this comparative study much stronger. We realize this may not yet be practical, but the absence of such data must temper the conclusions. It is nevertheless the first article including and comparing many distinct parthenogenetic animal clades, and the main result that there is no common universal genomic feature of parthenogenesis is, with caveats, interesting.

      Major Issues and Questions:

      1) The authors choose to refer to asexuality when describing thelytokous parthenogenesis. Asexuality is a very general term that can be confusing: fission, vegetative reproduction could also be considered asexuality. I suggest using parthenogenesis throughout the manuscript for the different animal clades studied here. Moreover, in thelytokous parthenogenesis meiosis can still occur to form the gametes, it is therefore not correct to write that "gamete production via meiosis... no longer take place" (lines 57-58). Fertilization by sperm indeed does not seem to take place (except during hybridogenesis, a special form of parthenogenesis).

      2) The cellular mechanisms of asexuality in many asexual lineages are known through only a few, old cytological studies and could be inaccurate or incomplete (for example Triantaphyllou paper of 1981 of Meloidogyne nematodes or Hsu, 1956 for bdelloid rotifers). The authors should therefore mention in the introduction the lack of detailed and accurate cellular and genetic studies to describe the mode of reproduction because it may change the final conclusion.

      For example, for bdelloid rotifers the literature is scarce. However the authors refer in Supp Table 1 to two articles that did not contain any cytological data on oogenesis in bdelloid rotifers to indicate that A. vaga and A. ricciae use apomixis as reproductive mode. Welch and Meselson studied the karyotypes of bdelloid rotifers, including A. vaga, and did not conclude anything about absence or presence of chromosome homology and therefore nothing can be said about their reproduction mode. In the article of Welch and Meselson the nuclear DNA content of bdelloid species is measured but without any link with the reproduction mode. The only paper referring to apomixis in bdelloids is from Hsu (1956) but it is old and new cytological data with modern technology should be obtained.

      3) In the section on Heterozygosity, the authors compute heterozygosity from k-mer spectra analysis of reads to "avoid biases from variable genome assembly qualities" (page 16). But such k-mer analysis can be biased by the quality and coverage of sequencing reads. While such analyses are a legitimate tool for heterozygosity measurements, this argument (the bias of genome quality) is not convincing and the authors should describe the potential limits of using k-mer spectra analyses.

      4) The authors state that heterozygosity levels “should decay over time for most forms of meiotic asexuality". This is incorrect, as this is not expected with "central fusion" or with "central fusion automixis equivalent" where there is no cytokinesis at meiosis I.

      5) I do not fully agree with the authors’ statement that: "In spite of the prediction that the cellular mechanism of asexuality should affect heterozygosity, it appears to have no detectable effect on heterozygosity levels once we control for the effect of hybrid origins (Figure 2)." (page 17)

      The scaling on Figure 2 is emphasizing high values, while low values are not clearly separated. By zooming in on the smaller heterozygosity % values we may observe a bigger difference between the "asexuality mechanisms". I do not see how asexuality mechanism was controlled for, and if you look closely at intra group heterozygosity, variability is sometimes high.

      It is expected that hybrid origin leads to higher heterozygosity levels, but saying that asexuality mechanism is not important is surprising: on Figure 2 the orange (central fusion) is always higher than yellow (gamete duplication). Also, the variability found within rotifers could be an argument against a strong importance of asexuality origin on heterozygosity levels: the four bdelloid species likely share the same origin but their allelic heterozygosity levels appear to range from almost 0 to almost 6% (Fig 2 and 3; however, the heterozygosity data on Rotaria should be confirmed, see below).

      The authors’ main idea (i.e. asexuality origin is key) seems mostly true when using homoeolog heterozygosity and/or composite heterozygosity, which is not what most readers will usually think of as "heterozygosity". This should be made clear by the authors, mostly because this kind of heterozygosity does not necessarily undergo the same mechanism as the one described in Box 2 for allelic heterozygosity. If homoeolog heterozygosity is sometimes not distinguishable from allelic heterozygosity, then it would be nice to have another box showing the mechanisms and evolution pattern for such cases (like a true tetraploid, in which all copies exist).

      The heterozygosity between homoeologs is always high in this study while it appears low between alleles, but since the heterozygosity between homoeologs can only be measured when there is a hybrid origin, the only heterozygosity that can be compared between ALL the asexual groups is the one between alleles.

      Both in the results and the conclusion the authors should not overinterpret the results on heterozygosity. The variation in allelic heterozygosity could be small (although not in all asexuals studied) also due to the age of the asexual lineages. This is not mentioned here in the result/discussion section.

      6) Regarding the section on Heterozygosity structure in polyploids.

      There is inconsistency in many of the numbers. For example, A. vaga heterozygosity is estimated at 1.42% in Figure 1, but then appears to show up around 2% in Figure 2, and then becomes 2.4% on page 20. It is unclear if this is an error or the result of different methods.

      It is also unclear how homologs were distinguished from homoeologs. How are 21 bp k-mers considered homologous? In the Methods section, the authors describe extracting unique k-mer pairs differing by one SNP, so does this mean that no more than one SNP was allowed to define heterozygous homologous regions? Does this mean that homologues (and certainly homoeologs) differing by more than 5% would not be retrieved by this method? If so, then it is not surprising that A. vaga is classified as a diploid.

      The result for A. ricciae is surprising and I am still not convinced by the octoploid hypothesis. In Fig S2. there is a first peak at 71x coverage that still could be mostly contaminants. It would be helpful to check the GC distribution of k-mers in the first haploid peak of A. ricciae to check whether there are contaminants. The karyotypes of 12 chromosomes indeed do not fit the octoploid hypothesis. I am also surprised by the 5.5% divergence calculated for A. ricciae; this value should be checked when eliminating potential contaminants (if any). In general, these kinds of ambiguities will not be resolved without long-read sequencing technology to improve the genome assemblies of asexual lineages.

      7) Regarding the section on palindromes and gene conversion.

      The authors screened all the published genomes for palindromes, including small blocks, to provide a more robust unbiased view. However, the result will be unbiased and robust only if all the genomes compared were assembled using the same sequencing data (quality, coverage) and assembly program. While palindromes appear not to play a major role in the genome evolution of parthenogenetic animals, since only a few palindromes were detected among all lineages, mitotic (and meiotic) gene conversion is likely to take place in parthenogens and should indeed be studied among all the clades.

      8) Regarding the section on transposable elements.

      The authors are aware that the approach used may underestimate the TEs present in low copy numbers, therefore the comparison might underestimate the TE numbers in certain asexual groups.

      9) Regarding the section on horizontal gene transfer.

      For the HGTc analysis, annotated genes were compared to the UniRef90 database to identify non-metazoan genes and HGT candidates were confirmed if they were on a scaffold containing at least one gene of metazoan origin. While this method is indeed interesting, it is also biased by the annotation quality and the length of the scaffolds which vary strongly between studies.

      10) Regarding the use of GenomeScope2.0.

      When homologues are very divergent (as observed in bdelloid rotifers) GenomeScope probably considers these distinct haplotypes as errors, making it difficult to model the haploid genome size and giving a high peak of errors in the GenomeScope profile. Moreover, due to the very divergent copies in A. vaga, GenomeScope indeed provides a diploid genome (instead of tetraploid).

      For A. vaga, the heterozygosity estimated by GenomeScope2.0 on our new sequencing dataset is 2% (as shown in this paper). This % corresponds to the heterozygosity between k-mers but does not provide any information on the heterogeneity in heterozygosity measurements along the genome. A limitation of GenomeScope2.0 (which the authors should mention here) is that it assumes that the entire genome follows the same theoretical k-mer distribution.
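
      To make the question in point 6 concrete, the following is a minimal Python sketch of how pairing k-mers that differ by exactly one SNP behaves as divergence grows. It is not the authors' pipeline: the haplotypes are toy sequences that are assumed to be perfectly aligned with no indels, and the helper names (`make_haplotypes`, `single_snp_pair_fraction`) are hypothetical. With k = 21, a pair at Hamming distance 1 corresponds to a local divergence of at most 1/21 (about 4.8%), which is consistent with the roughly 5% ceiling raised above.

```python
def kmers(seq, k):
    """Return the k-mers of seq in positional order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def single_snp_pair_fraction(hap_a, hap_b, k=21):
    """Fraction of SNP-covering aligned k-mer pairs that differ at exactly one site.

    Assumption: hap_a and hap_b are the same length and already aligned,
    so the k-mers at the same offset form a homologous pair.
    """
    pairs = zip(kmers(hap_a, k), kmers(hap_b, k))
    covering = [(a, b) for a, b in pairs if a != b]
    if not covering:
        return 0.0
    single = sum(1 for a, b in covering if hamming(a, b) == 1)
    return single / len(covering)


def make_haplotypes(length, spacing):
    """Toy haplotype pair: an all-A sequence and a copy with a C every `spacing` bp."""
    hap_a = "A" * length
    hap_b = "".join(
        "C" if i % spacing == spacing // 2 else "A" for i in range(length)
    )
    return hap_a, hap_b


if __name__ == "__main__":
    for spacing in (50, 20, 10):  # 2%, 5% and 10% divergence
        a, b = make_haplotypes(1000, spacing)
        print(spacing, single_snp_pair_fraction(a, b, k=21))
```

      In this toy setting, every SNP-covering 21-mer pair is retrievable when SNPs are 50 bp apart (2% divergence), whereas none is retrievable when SNPs are 10 bp apart (10% divergence), because every window then contains at least two differences.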

    4. Reviewer #1

      This paper addresses the very interesting topic of genome evolution in asexual animals. While the topic and questions are of interest, and I applaud the general goal of a large-scale comparative approach to the questions, there are limitations in the data analyzed. Most importantly, as the authors raise numerous times in the paper, questions about genome evolution following transitions to asexuality inherently require lineage-specific controls, i.e. paired sexual species to compare with the asexual lineages. Yet such data are currently lacking for most of the taxa examined, leaving a major gap in the ability to draw important conclusions here. I also do not think the main positive results, such as the role of hybridization and ploidy on the retention and amount of heterozygosity, are novel or surprising.

    5. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to Version 2 of the preprint: https://www.biorxiv.org/content/10.1101/497495v2

      Summary

      This paper addresses the question of whether there are distinct genomic features in animals that reproduce asexually. The authors examine a range of features in the genomes of 26 species representing 18 independent evolutionary origins of asexuality. The reviewers were unanimous that this is an interesting question, and find that exploring it in a broad evolutionary context is the right approach. However, they raised questions about biases in specific analyses that complicated their interpretation, and the extent to which the central claims can be supported without comparison to closely related sexual species.

    1. Reviewer #2:

      In 2011 these authors showed that Drosophila DmPI31 is a binding partner of the F box protein Nutcracker, a component of an SCF ubiquitin ligase (E3) required for caspase activation during sperm differentiation in Drosophila. DmPI31 binds Nutcracker via a mechanism that is also used by mammalian FBXO7 and PI31. Subsequently, they have shown that PI31 serves as an adaptor to couple proteasomes with dynein light chains and inactivation of PI31 inhibited proteasome motility in axons and disrupted synaptic proteostasis, structure, and function. In addition, conditional loss of PI31 in spinal motor neurons (MNs) and cerebellar Purkinje cells (PCs) caused axon degeneration, neuronal loss, and progressive spinal and cerebellar neurological dysfunction.

      Here the authors show that, like Fbxo7 mutant mice, PI31 conditional KO mice have decreased testis and thymus size, and that motor neuron-specific loss of either FBXO7 or PI31 produced similar phenotypes in motor neurons. They generated a mouse that conditionally expressed FLAG-tagged PI31, which could rescue PI31 mutant mice; this transgene (under a Chat driver) also rescued the phenotype of FBXO7 mutant mice, from which they concluded that the consequences of FBXO7 mutation relate to loss of PI31 function in the cell types studied.

      FBXO7 is the substrate recognition module of a novel proteasome‐interacting E3 ubiquitin ligase. In addition to binding PI31, FBXO7 also drives PI31 ubiquitylation and thus regulates its cellular levels. That the transgene can rescue the phenotype in the Chat-expressing cells is surprising and striking. However, it would be necessary to reveal more about the underlying molecular mechanism. In the cell types rescued, is there another E3 ligase with overlapping substrate specificity? Are there mitochondrial phenotypes that are not rescued?

    2. Reviewer #1:

      This manuscript focuses on the role played by the PI31 protein in regulating presynaptic proteasome abundance and the health of motor neurons. In particular, it presents striking data from knockout and conditional KO mice showing that depletion of PI31 and Fbxo7/PARK15 (Parkinson's disease gene) yield similar phenotypes, including motor neuron defects, following their conditional depletion. Furthermore, in the absence of Fbxo7/PARK15, PI31 levels were greatly reduced. This suggested that a major role for Fbxo7 is to promote the abundance/stability of PI31. In support of this model, transgenic expression of PI31 completely rescued overall health, body weight and motor neuron morphology in Fbxo7 mutant mice. These results are impressive. However, the manuscript implies but does not show that the mechanism through which PI31 supports neuronal health is by promoting the axonal transport of proteasomes and thus suppressing the presynaptic accumulation of ubiquitinated proteins. Several key experiments to address this issue would greatly strengthen the manuscript (outlined below).

      1) Major statements are made about the importance of PI31 for axonal transport of proteasomes and presynaptic aggregate clearance. In order to establish that PI31 is indeed supporting neuronal health by promoting axonal transport of proteasomes and clearing presynaptic protein aggregates, it is necessary to show:

      -- That motor neuron presynaptic proteasome number is reduced in the PI31 and Fbxo7 KO mice and restored in the Fbxo7 mutant mice that express the PI31 transgene.

      -- That expression of the PI31 transgene in the Fbxo7 mutant mice suppresses the presynaptic accumulation of P62 aggregates.

      2) It would be helpful if the abstract defined the Parkinson's disease model (PARK15) that was investigated.

      3) Quantification of the presynaptic P62 aggregate phenotype in figure 2 would be helpful as would including a higher magnification image of the wildtype synapse with the P62 labeling.

      4) Given that the major phenotypes that are characterized are not directly related to Parkinson's disease, the upfront emphasis on Parkinson's disease might not be warranted. Although the mouse phenotypes that are reported are striking, the title in particular suggests a more direct connection to this disease than is warranted by the data.

      5) Figures 3C and 4B: Individual data points should be plotted and a statistical test would be helpful.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      There is great interest in understanding the molecular basis of FBXO7/PARK15 pathogenesis and the present, high quality story includes an impressive rescue in cells transgenically overexpressing PI31 protein. Nevertheless, as discussed in greater detail below, the two reviewers felt that more work would be needed to document the molecular basis for this phenotype rescue.

    1. Reviewer #3:

      Summary

      The manuscript presents an experiment in which participants listened to ten auditory sequences, generated with either first- or second-order statistical structure ("simple" vs "complex" SL respectively) and predicted 20 elements in each sequence during simultaneous EEG recording. Behavioural results showed that all participants performed better for simple than complex sequences and musicians performed better than non-musicians for both sequence types. A Bayesian model was developed with parameters controlling memory decay, sensory noise, model order (hierarchy) and selection noise, which were fitted to the responses of each participant. The results showed differences between musicians and non-musicians for parameters related to SL (model order, selection noise) but not parameters related to stimulus processing (sensory noise and memory decay). Specifically musicians showed evidence of higher-order prediction and lower selection noise. The EEG results linked increased amplitude at fronto-central electrodes at around 300 ms to modelled surprise for each participant, which was stronger for musicians than non-musicians. Separate analyses for models of different order produced evidence for an early modulation around 200ms for zeroth-order predictions which did not differ between musicians and non-musicians and a later modulation around 300ms for first- and second-order predictions which did differ between the two groups. These modulations were linked to the MMN and P300 respectively. The results are taken as evidence for better SL in musicians and discussed in terms of the Bayesian brain hypothesis.

      Substantive Concerns

      -- p. 4, para. 2: I believe that the evidence for musicians showing better SL is less strong than presented in the manuscript. In particular, using different stimuli and methods, both Loui et al. (2010) and Rohrmeier et al. (2011) found no difference between musicians and non-musicians in statistical learning of auditory sequences. Furthermore, with regard to reference 7 in the manuscript, although some studies have found larger ERAN amplitudes in musicians than non-musicians (Jentschke & Koelsch, 2009; Kim et al., 2011; Koelsch et al., 2007, 2002; Regnault et al., 2001) the differences are usually small and have not been replicated in all studies (e.g., Koelsch & Jentschke, 2008; Koelsch & Sammler, 2008; Miranda & Ullman, 2007; Steinbeis et al., 2006). The introduction and motivation for the experiment should be adapted to give a more detailed and balanced view of the literature and the divergence between the present results and those of Loui et al. (2010) and Rohrmeier et al. (2011) should be discussed and accounted for.

      -- I'm not sure complexity is the most appropriate term to use in distinguishing statistical regularities of different order, since different transition tables at a single given order could be described as varying in statistical complexity. Having introduced the term, why not stick to "higher-order" and "lower-order"?

      -- p 7: "Control analysis revealed that musicians and non-musicians do not benefit from an overall increase in performances during the course of the experiment." But there should be an improvement during each individual sequence, right? Is it possible to demonstrate this?

      -- I think the authors should analyse the interaction in Fig. 1B and report whether or not it is significant.

      -- I noted that while the authors report the consistency between the model and participants, they do not report the average accuracy of the model, which should be included for completeness. It would be good to report both of these analyses separately for complex and simple sequences, given the significant difference in performance between them.

      -- p. 15: clarify that the same transition matrix was used for all five sequences of a given order

      -- p. 15: what were the inclusion/exclusion criteria for the groups of musicians and non-musicians? How were participants recruited? This is important, especially given the divergence between the present findings and previous results (as noted above).

      -- p. 16: are there any consequences of the fact that participants were aware of the probabilistic nature of the sequences and the differences between the two sequence types? Again, this seems to me to be an important divergence from other SL studies which could impact on the behavioural and neural effects observed and should, therefore, be discussed.

      -- p. 16: "one participant was removed" - musician or non-musician?

      -- p. 18 why was FCz used as the reference?

      -- there are some inconsistencies in the way the model parameters are named - e.g., "late noise" in Supp. Figure 5. Please check through and use consistent terms throughout.

      -- To facilitate replication and follow-up research, I would encourage the authors to make their data and model openly available.

    2. Reviewer #2:

      The paper compares musicians' behavior and ERP responses to those of non-musicians with the following statement in the abstract:

      "these better performances could be due to an improved ability to process sensory information, as opposed to an improved ability to learn sequence statistics. Unfortunately, these very different explanations make similar predictions on the performances averaged over multiple trials. To solve this controversy, we developed a Bayesian model and recorded electroencephalography (EEG) to study trial-by-trial responses."

      The authors claim:

      "This higher performance is explained in the Bayesian model by parameters governing SL, as opposed to parameters governing sensory information processing. " This is correct - but meaningless - the experiment does not challenge sensory noise since the 3 sounds used are so distinct that sensory noise is zero in the two groups. Given that basic design - this phrasing is not only too strong, it is improper.

      My understanding is that there are two actual observations in the paper:

      1) Musicians' learning of second order markov statistics is better than that of non-musicians based on parameter fitting of a Bayesian model of their behavior in answering explicit questions regarding which sound (of 3 very distinct options) should come next.

      2) ERP measures - specifically the P300 of musicians - are more sensitive to these statistics, as evidenced by the P300 magnitude with respect to predictability/surprise of the sound based on serial statistics. These claims are interesting BUT - I am not convinced by the claim of specificity. I think the data (and previous studies) suggest that musicians do better with sound related judgments - in all respects.

      I am not convinced that the model adds information since it explains the data as well as single accuracy numbers (or did I miss something?). So I am not convinced that this trial by trial analysis adds information.

      With respect to the specific model parameters:

      Sensory noise is zero - the sounds are quite distinct. This is not an observation - this is how the experiment was designed. The authors admit that (indeed - any study that focused on sensory discrimination found an advantage in musicians) - but then state specificity, particularly in the abstract.

      Regarding rate of decay - I wonder if this is relevant to overall performance when asked only up to 2nd order serial statistics. It may be sufficient for the task. The relevance of this parameter should be clarified.

      Thus the lack of group difference in these parameters probably tells about the experiment rather than the groups.

      Similarly, musicians' ERP responses are larger. But the early difference is not addressed at all. Is the earlier response sensitive to simpler stat - but in a similar way in both populations? Can't be - since they have a different magnitude. The authors base their analysis on (MEG analysis) in their 2019 paper. I tried to do the exact comparison, and wasn't sure about the mapping to components - please clarify the exact similarity.

      Thus - overall - I am not sure that the model analysis provides new conceptual insights.

    3. Reviewer #1:

      In this work, the authors used a combination of modelling, behavioral methods and EEG to understand whether sensitivity to the statistical structure of unfolding sound sequences differs between musicians and non-musicians. Overall they demonstrate that musicians are better than non-musicians at predicting forthcoming items. Modelling suggests that this advantage arises because they estimate higher order transition probabilities than non-musicians. The analysis of EEG data recorded during task performance showed that the amplitude of the P3 correlated with item predictability. Further analyses suggested that musicians and non-musicians have similar responses to surprise in simple sequences, with divergence between the groups occurring for higher order transition probabilities.

      I have several concerns about task design, analysis and interpretation of the data which are detailed below:

      1) The EEG data are recorded whilst participants are performing the behavioral prediction task. Though probe trials occurred rarely, it is conceivable that participants were making an active judgement for each sequence item. There is therefore a concern that the measured EEG data would reflect this aspect (active task performance) rather than automatic SL. This makes conclusions about "neural statistical learning" (e.g. as in the title) difficult to make.

      2) In the results section the authors consider various differences between the musician and non-musician groups that could lead to differences in performance. One aspect that does not seem to be considered is that of attention, or task engagement. Is it possible that the musician participants were simply more engaged/less bored by the task? The EEG data (figure 3) are consistent with this interpretation showing overall substantially larger responses in the musicians relative to the non-musicians.

      3) Relatedly, is it possible that the results in Figure 3C are at least partly related to the overall amplitude differences between groups? Higher SNR in the musician group may lead to higher beta values. One way around this is to normalize the data (e.g. based on the P1 response) before computing the correlations.

      4) Figure 4: can you show the ERP data on which the beta values are based?

      5) Figure 4: the authors seek to conclude that the two groups have similar responses to surprise in simple statistical contexts (K=0) with divergence occurring for more complex statistical structure. However, they do not provide statistics to support this claim. It is not enough to show no significant difference between groups for K=0, but significant differences for K=1, 2 : you need to demonstrate an interaction.

      6) More broadly, though, I do not understand the theoretical implications of this finding: why would the brain response to K=0 occur earlier than to K=2? Shouldn't the prediction be formed already before sound onset (especially given the relatively slow sequence rate)?

      7) Discussion: "Our results shed light on the musical training induced plasticity". This statement confuses correlation with causation. The authors discuss the reservation later in the discussion but it should be removed altogether.
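
      The interaction requested in point 5 could be tested in several ways; one option is sketched below as a label-permutation test on the difference of differences. This is an illustration under stated assumptions, not the authors' analysis: each participant is assumed to contribute one regression weight per model order, the function names are hypothetical, and a mixed-design ANOVA or mixed model would be an equally reasonable choice.

```python
import random
from statistics import mean


def interaction_statistic(group_a, group_b):
    """Between-group difference in the within-participant change across model orders.

    Each participant is a (beta_low_order, beta_high_order) tuple,
    for example the weights obtained with K=0 and K=2.
    """
    change_a = [high - low for low, high in group_a]
    change_b = [high - low for low, high in group_b]
    return mean(change_a) - mean(change_b)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the group x model-order interaction.

    Group labels are shuffled across participants while each participant's
    pair of values is kept together.
    """
    rng = random.Random(seed)
    observed = abs(interaction_statistic(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        permuted = abs(interaction_statistic(pooled[:n_a], pooled[n_a:]))
        if permuted >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from the manuscript.
    musicians = [(0.10, 0.90), (0.20, 1.10), (0.15, 1.00), (0.05, 0.95)]
    non_musicians = [(0.10, 0.30), (0.20, 0.35), (0.15, 0.20), (0.05, 0.25)]
    print(permutation_p_value(musicians, non_musicians))
```

      A non-significant group difference at K=0 together with a significant one at K=2 would not establish this interaction; the test above addresses it directly by comparing the change across model orders between groups.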

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This work constitutes an innovative and timely combination of modelling, behaviour and EEG to understand potential differences in SL abilities between musicians and non-musicians. However, as detailed below, we have many concerns regarding the modelling, experimental design and interpretation of the results.

      Our major concerns are summarized here (and further elaborated in the individual reviews below):

      1) Modelling: please report the accuracy of the model and whether this differs between groups.

      2) You should analyse the interaction in Fig. 1B and report whether or not it is significant.

      3) Relatedly, there appears to be an inconsistency between the behavioural results and the modelling. In the behavioural data you report main effects of musicianship and of sequence complexity. Modelling of this data suggests that whilst the K for musicians is higher than for non-musicians, it is substantially above 1 for both. If anything this should predict larger differences between groups at larger K than at smaller K, which is different from what is seen behaviourally. A similar inconsistency is present between the behavioural results and the results in figure 4 (see below). This requires careful consideration.

      4) Can you do more to convince the reader that the model is performing well? Is the fit good, how does it vary across participants? Does rate of memory decay affect performance at all? Can you show good versus poor performers within the same group - do parameters also vary there?

      5) It is important that you address the issues related to participants being aware of the stimulus construction. Are there any consequences of the fact that participants were aware of the probabilistic nature of the sequences and the differences between the two sequence types? This seems to be an important divergence from other SL studies which could impact on the behavioural and neural effects observed and should, therefore, be discussed.

      6) The EEG data are recorded whilst participants are performing the behavioural prediction task. Though probe trials occurred rarely, it is conceivable that participants were making an active judgement for each sequence item. There is therefore a concern that the measured EEG data would reflect this aspect (active task performance) rather than automatic SL. This makes conclusions about "neural statistical learning" (e.g. as in the title) difficult to make.

      7) In the results section the authors consider various differences between the musician and non-musician groups that could lead to differences in performance. One aspect that does not seem to be considered is that of attention, or task engagement. Is it possible that the musician participants were simply more engaged/less bored by the task? The EEG data (figure 3) are consistent with this interpretation showing overall substantially larger responses in the musicians relative to the non-musicians.

      8) In general, we think the model has been constructed with due care and attention and we like the separation of parameters related to statistical learning (model order and selection noise) and more general aspects of perception and cognition (sensory noise and memory decay). We think the difficulties arise in the relationship between the model and the experiment. Specifically, the sensory noise model parameter reveals very little in the analysis of this data because the sounds were so readily distinguishable, which appears to have been a deliberate choice in the experimental design, somewhat confusingly. The present stimulus set is therefore not suitable for distinguishing differences in sensory processing vs. SL between groups. We suggest that the authors could simply remove this parameter from the analysis and the paper would be clearer as a result. This would involve re-modelling and you will also have to reshape the way the experiment is motivated.

      9) We have some questions about how the EEG data are analysed. In particular, the large amplitude difference between groups should be quantified, discussed and interpreted. We would also like to see stronger justification and discussion of why these differences are not affecting the main conclusions. We note that the authors provide R2 results in supp materials but we feel that a better approach may involve normalizing the responses before modelling. Higher SNR in the musician group may lead to stronger correlations. One way around this is to normalize the data (e.g. based on the P1 response) before computing the correlations.

      10) You should perform the appropriate statistical analysis to support the claims associated with Figure 4. You seek to conclude that the two groups have similar responses to surprise in simple statistical contexts (K=0) with divergence occurring for more complex statistical structure. However, you do not provide statistics to support this claim. It is not enough to show no significant difference between groups for K=0, but significant differences for K=1, 2. You need to demonstrate an interaction between group and model order. Additionally, it was not quite clear how modelling was performed here. We understand that you take surprise values from the model fitted to each participant but with the order fixed at 0, 1 or 2. This may mean that the other parameters might no longer be optimal in the context of the new fixed K values, depending on how different these were from the fitted values for each participant, which might plausibly differ for the musicians and non-musicians. To address this, can you supplement the existing analysis with an analysis in which the K parameters are fixed at 0, 1 and 2, and the other parameters are re-optimised in the context of these fixed parameter values? Please also provide information about how well each individual's data were fit, and whether there was a significant difference between musicians and non-musicians. In general, we think the authors should present the result in figure 4 more cautiously and also flesh out the interpretation in more detail in relation to the literature along with a consideration of other potential interpretations. A small related point is that the term hierarchy is strongly related to this interpretation and we would prefer a more neutral term such as 'model order'.

      11) The paper would benefit from a careful discussion of exactly what information, on top of that revealed with behaviour, is added by EEG and the significance of this in the context of the existing literature on expectation related ERP components.
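
      The group-by-model-order interaction requested in point 10 could be tested in several ways. The following is only a minimal sketch, not the authors' analysis: it assumes each participant contributes one score per fixed model order (K=0, 1, 2), reduces the interaction to the per-participant change from K=0 to K=2, and compares that change between groups with a label-permutation test. The function name, the data layout and the choice of contrast are illustrative assumptions.

```python
import random
from statistics import mean


def interaction_permutation_test(group_a, group_b, n_perm=5000, seed=0):
    """Permutation test for a group x model-order interaction.

    group_a, group_b: lists of per-participant tuples (score_K0, score_K1,
    score_K2), e.g. hypothetical EEG-surprise correlations at each fixed order.
    The interaction contrast is the per-participant change from K=0 to K=2;
    the statistic is the between-group difference in the mean change.
    """
    change_a = [p[-1] - p[0] for p in group_a]
    change_b = [p[-1] - p[0] for p in group_b]
    observed = mean(change_a) - mean(change_b)

    pooled = change_a + change_b
    n_a = len(change_a)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling the pooled changes reassigns group labels at random.
        rng.shuffle(pooled)
        stat = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(stat) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value
```

      A small p-value here would indicate that the change across model orders differs between groups, which is the claim that separate per-order group comparisons cannot establish on their own. A mixed-design ANOVA or a linear mixed model would be a more conventional alternative.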

    1. Reviewer #3:

      In this manuscript, Ramachandran and colleagues describe how cholecystokinin-related NLP-12 neuropeptide signalling in C. elegans can regulate two different behavioural programmes, area-restricted search (ARS) and basal locomotion, by conditionally engaging different specific receptors that are expressed in different neuronal targets. They thoroughly characterise the CKR-1 receptor, which had not been described previously, and place its function in context with that of the previously known NLP-12 receptor CKR-2. The manuscript gives new insight into an interesting and likely conserved mechanism by which neuromodulatory systems enable adaptive behaviour by coordinating the action of neural circuits even when they are not directly connected. The conclusions drawn appear solid and are justified by the data presented, and the experimental approaches and results are well documented.

      The main problem with the work is a certain lack of clarity regarding the separation of the roles of the CKR-1 and CKR-2 receptors on basal locomotion/body bending and head bending/reorientations. Overexpression of NLP-12 places animals in a chronic ARS state, as described in a previous publication. Is the NLP-12 overexpression model representative of the increased reorientation in area restricted search, or of control of undulations in basal locomotion, or both? If it is primarily representative of area restricted search, this would mean that CKR-2, similarly to CKR-1, mediates the chronic ARS state induced by NLP-12 overexpression, because in fig. 1B and C its mutation causes a reduction in the phenotype, and deletion of both ckr-1 and ckr-2 causes a stronger reduction.

      Also, it is unconvincing that SMD neurons do not express ckr-2 (see S3D); no comparison of ckr-1 and ckr-2 expression levels in SMD is provided and in fact the CeNGEN data of single cell RNAseq of C. elegans neurons shows similar expression of both receptors in SMDD (accessible at cengen.shinyapps.io/SCeNGEA). On the other hand, loss of ckr-2 on its own does not cause a significant reduction in ARS (fig 3A). To clarify this, the authors could measure the reorientation rate in the nlp-12OE ckr-2 mutant strain.

      Given that ckr-1 overexpression as shown in figs 4-6 increases both body bending amplitude and ARS-like high reorientation rate, the authors offer the interesting possibility that SMD may also affect basal locomotion. I would suggest an experiment that clarifies whether SMD also controls body bending in basal locomotion, using the single-worm tracking assay shown in fig 2A with the SMD-specific ckr-1 rescue strains in a ckr-1 mutant background (as used in figure 7). They could also measure body bending in the existing data on the SMD::Chrimson optogenetics.

    2. Reviewer #2:

      Ramachandran et al. report the discovery of a C. elegans GPCR - CKR-1 - that mediates some of the effects of the cholecystokinin-like neuropeptide NLP-12 on posture and foraging behavior. The discovery of this receptor permits further study of this neuropeptide signaling system, which is conserved from worms to vertebrates. Although CKR-1 is expressed in many neurons, the authors show that its function in SMD head-motorneurons is especially important for control of posture and foraging. The manuscript's strengths include: (1) rigorous characterization of receptor-ligand interactions in vitro, using a cell-based assay for GPCR activation, and in vivo, using genetic analysis, (2) compelling data in support of a model in which NLP-12 regulates SMD neurons to control foraging, (3) high-resolution analysis of C. elegans posture during foraging, which illustrates the complexity and richness of this behavior, and (4) the circuit model, i.e. a role for SMDs, is tested using a number of independent methods and clearly indicated.

      The manuscript does have some weaknesses. In addition to the specific technical points listed below, the manuscript discusses neuropeptides derived from a single source, the DVA pre-motor neuron, acting on distinct targets via distinct receptors in a conditional manner. This interesting model is suggested by the title and the abstract and comes up plainly in the introduction and discussion. However, the model is not clearly supported by the data, which primarily focus on the characterization of CKR-1 as a relevant receptor for NLP-12 peptides. Another weakness in the manuscript arises from the authors' switching between various assays for posture during locomotion, which makes it difficult for the reader to compare data between figures. Rich kymography data are relegated to supplementary figures, and data from only a subset of relevant genotypes are shown as kymographs. The manuscript would be strengthened by more uniform analysis of posture and foraging. Finally, while the data clearly show that effects of NLP-12 on posture and foraging require SMD neurons, the manuscript does not investigate how NLP-12 affects SMD activity. The manuscript would be strengthened by experiments showing a functional connection between DVA and SMD neurons, e.g. functional imaging of SMDs during optogenetic manipulation of DVAs.

      Specific comments:

      1) One premise of the work is that DVA neurons are the sole source in vivo of NLP-12 peptides. A recent study (Tao et al. 2019, Dev. Cell) shows that there is an alternate source of NLP-12, the PVD nociceptors. The authors should address the possibility that their assays also detect a contribution of PVD neurons to posture/foraging.

      2) The text associated with Figure 1B-C is tentative with respect to assigning redundant functions to CKR-1 and CKR-2. Why? The data are clear; these receptors function redundantly.

      3) The very nice in vitro analysis of NLP-12 receptors should include negative controls. Ideally, the authors would use a scrambled neuropeptide or a related neuropeptide to demonstrate specificity of the interactions between NLP-12 and CKR-1/2.

      4) The different 'bending angles' used in Figures 1 and 2 make it difficult to compare data between figures. Also, the schematics used to explain the bending angles have small fonts and are hard to read.

      5) Figure 3E shows the results of a nice experiment in which optogenetic activation of NLP-12-expressing cells - presumably DVA - causes reorientations. The authors assert that this effect requires CKR-1 but not CKR-2. The data, however, suggest that CKR-2 might have an effect. The variance of the data does not allow the authors to reject a null hypothesis, but they err in then assuming that this means that CKR-2 plays no role in the phenomenon. This experiment should be repeated to determine whether there is indeed a specific or privileged role for CKR-1 in mediating NLP-12-dependent reorientations.

      6) Also, Figure 3E should show raw data - don't show proportional changes - and all Figure 3 should be scatter plots allowing the reader to assess the variance of the data.

      7) The authors show that effects of receptor overexpression are suppressed by loss of NLP-12 peptides. Is there precedent for this kind of genetic interaction in the literature?

      8) Also, the authors assert that suppression of effects of CKR-1 overexpression by loss of NLP-12 shows that NLP-12 peptides are the sole ligands for this receptor (page 9, line 17). It is not clear why the authors reach this conclusion.

      9) There are some very nice data that are assigned to supplementary figures but might be better placed in main figures. Fig. S3A-B shows data that are integral to the authors' model and could be presented in a main figure. Also, the localization of NLP-12::Venus in DVA axons near SMD processes would be appropriate to show in a main figure. It would be ideal to mark SMDs with a red fluor so that NLP-12::Venus colocalization with SMD processes could be assessed.

      10) The kymography data are nice but incomplete. The authors should show kymographs from strains of all relevant genotypes. This would include: (1) ckr-1(oe); nlp-12, (2) nlp-12, ckr-1, and ckr-2 single mutants, and (3) ckr-1; ckr-2 double mutants.

      11) Page 12, last paragraph indicates that 'low levels' of expression rescue ckr-1 phenotype - how has the expression level been determined? I guess that the authors refer to the amount of DNA used for transgenesis, not a direct measure of transgene expression - this should be reworded.

      12) The manuscript would be strengthened by experiments that measured the effect of DVA activation on SMD physiology and what contribution NLP-12 signaling makes to any functional connection between these neurons. One potential impact of this work is that it establishes a nice paradigm for new molecular genetic analyses of neuropeptide signaling. Direct observation of the effects of NLP-12 peptides on SMD neuron physiology would further strengthen the authors' conclusions and suggest mechanisms by which CKR-1 regulates cell physiology.

      13) Minor comment: Fig S1C is a little confusing with respect to how the ligand is indicated - it implies that there exists a ligand-binding site at the amino terminus of the receptors.
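
      The statistical concern in point 5 (a non-significant difference is not evidence of no effect) can be made concrete by reporting an interval estimate of the effect rather than only a test result. The sketch below is a generic percentile bootstrap for the difference in group means, not a reanalysis of the authors' data. The function name and the inputs (per-animal reorientation measures for a control and a mutant strain) are hypothetical.

```python
import random
from statistics import mean


def bootstrap_diff_ci(control, mutant, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(mutant) - mean(control)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        c = [rng.choice(control) for _ in control]
        m = [rng.choice(mutant) for _ in mutant]
        diffs.append(mean(m) - mean(c))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

      If the resulting interval includes zero but also extends to a substantial reduction, the data are compatible both with no role for CKR-2 and with a sizeable one, so the experiment cannot by itself support the claim that CKR-2 is dispensable.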

    3. Reviewer #1:

      In this manuscript Ramachandran et al. provide a C. elegans behavioral genetics study focused on the worm cholecystokinin-like neuropeptide-receptor system. They show that nlp-12 neuropeptides released from the DVA neuron fulfill a dual role in controlling body posture as well as head-bending mediated area restricted search (ARS). Previous work showed that DVA controls body posture via nlp-12 signaling to ckr-2 receptor in ventral cord motor neurons. Moreover, nlp-12 signaling was implicated in ARS, but the exact circuit mechanisms and targets of nlp-12 remained elusive. The present work shows in a straightforward way that ckr-1 in SMD head motor neurons is the missing link. In worms, ARS is composed of quite complex body movements including high angle turns during the worm's forward crawling state. Nlp-12 and ckr-1 mutants show reduced head bending during ARS, while overexpression leads to a stark ectopic ARS-like behavior. The authors convincingly show that SMDs are the site of action for ckr-1 and implicated in ARS. They show both requirement and sufficiency of SMDs for ARS-like behaviors. The regulation of ARS vs. dispersive behaviors has been extensively studied at the levels of sensory and interneurons in the worm, but how the switch is implemented at motor circuits was largely unknown. Conceptually, this is one of only a few studies investigating the selective control of head versus body movements and provides some interesting insights into the underlying mechanisms; therefore, the study is definitely important and timely. However, it is still unclear how upper sensory circuits transmit the switch between ARS and dispersal to the DVA-SMD circuit. Moreover, the present study does not investigate the signaling pathway of ckr-1 in SMDs and its role in controlling neuronal activity, e.g. via Ca++ imaging. As a sole behavioral genetics study, however, I find the manuscript quite complete. The experiments logically build upon each other and the paper is well written. My only major critique is that parts of the behavioral analyses are described in insufficient detail, so that it is unclear to the expert how and what exact movements were quantified. This should be addressed by providing more detailed figure captions, methods sections, more supplemental figures and movies.

      1) The authors should exclude (or separate) reversal states and post-reversal turns in their analyses when measuring head bending, body bending and turn events, but it is unclear if they did so.

      2) Fig 1C and methods: it is unclear what defines a singular bending event as marked on the y-axis. Did the authors measure the maximum angle during each half-oscillation? If yes, this should be explained, including how the maxima were calculated. Or do the histograms represent all values from all recording frames? In the latter case, the y-axis labelling is misleading, and I suggest using "fraction of frames".

      3) Fig 1C: these are averaged histograms of n=10-12 worms, but what is the average number of events per worm and in total?

      4) Fig 1B-C, 2A etc.: performing the measurements as depicted in the upper panels is not trivial, and I have the impression that the authors used their software packages in a black-box manner. What are the exact image processing steps to implement these measurements, i.e. how exactly were the vertex and sides of the angles positioned? The authors should provide a time series of individual examples along with movies demonstrating how accurately the pipeline performs during complex ARS postures.

      5) Fig 2B: the angles and body segments describing the head and head-bending angles should be unambiguously defined. The cartoon in 2B looks like they just measured nose movements.

      6) Fig 3B: reorientation events are not sufficiently defined here. During ARS, worms frequently switch between forward and backward movement, perform post-reversal turns and exhibit continuously curved trajectories. From a trajectory like the red one in 3A, it is again not trivial to identify and discretize individual turning events with a start and an end and to distinguish them from reversals and post-reversal turns.

      -- The procedure needs to be explained in greater detail with justification of parameter choice.

      -- How did the authors validate that the procedure performed well, especially during the complex ARS behaviors?

      -- Again, example trajectories and movies should be shown.

      7) All histogram panels lack statistics, e.g. KS test or appropriate alternatives.
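
      For the histogram comparisons raised in point 7, the two-sample Kolmogorov-Smirnov statistic is simply the largest vertical gap between the two empirical cumulative distributions. The sketch below illustrates that computation in plain Python. The function name and inputs are illustrative and it does not reproduce any analysis from the manuscript.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

      In practice a library routine such as `scipy.stats.ks_2samp` also returns a p-value. Note that bending angles from consecutive frames of the same animal are not independent, so a test on pooled frames may overstate significance; comparing per-animal summaries is a more conservative option.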

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      The reviewers find your work very interesting and acknowledge its importance in understanding the role of cholecystokinin signaling in differentially controlling aspects of locomotion behavior in C. elegans. In its current form, it represents a near complete and well done behavioral genetics study that could improve further with addressing some of the comments below and also harmonizing the behavior metrics that were used for quantifications. The work could be brought to another level though if the authors performed new lines of experiments that give further mechanistic insights, e.g. via physiological methods, into how ckr-1 signaling controls SMD activity.

    1. Reviewer #2:

      In this paper, the authors mainly tested peripheral blood mononuclear cell (PBMC) samples from pediatric cancer patients and healthy individuals by CyTOF, and analyzed the phenotypes of NK cells, T cells and monocytes. Related phenotypes have already been reported by others. Mechanistic research is lacking, and many of the conclusions are not yet supported by the presented data.

      Specific concerns:

      1) The authors collected pediatric cancer samples including hepatoblastoma, neuroblastoma, Wilms tumor, lymphoma, and others. These tumor types are quite different. Is it appropriate to analyze them together? Lymphoma is a disease of the blood system, unlike the other tumor types. Systemic immunity in these patients must have changed.

      2) No statistical analysis was performed in Fig. 2D and E. The conclusion that "Classical monocytes are enriched in pediatric cancer patients" is not supported.

      3) Figure 3a is different from the conventional diagram. It was a surprise to see that it showed CD56-dim CD16- and CD56-CD16+ NK cells.

      4) Figure 4 lacks statistical analysis.

      5) Figure 7 lacks correlation analysis. The conclusion that "Pediatric cancer associated immune perturbations vary by age" is not supported. In addition, the presented correlation diagram is insufficient to prove the above conclusion and title.

    2. Reviewer #1:

      The immune status of pediatric cancer patients may differ from that of adult cancer patients and healthy children. Unraveling the distinct immunological features of pediatric cancers may provide novel therapeutic strategies. Dr. Murali Krishna and colleagues analyzed the composition and phenotype of peripheral immune cells in both pediatric cancer patients and age-matched healthy individuals, and they found some interesting alterations in NK cells, monocytes, and T cell subsets. In general, this descriptive study can be potentially interesting for clinicians, immunologists and cancer researchers. However, several major points remain to be addressed.

      1) The incidence of hematologic tumors is relatively high in children. It is shown in supplemental table 2 that pediatric patients bearing solid tumor and hematologic malignancies were all included in this study. If solid tumors and lymphoma were analyzed separately, in comparison to healthy individuals, will the major conclusions remain the same?

      2) The type, stage, and therapeutic regimens of cancer may affect the landscape of peripheral immune cells. It is not clear whether any of these factors influence the major conclusion. What were the standards to include healthy pediatric individuals as controls in this study?

      3) The authors focused on immune cell-related differences between healthy and tumor-bearing children. To reveal typical immunological features of pediatric cancer patients, it is recommended to perform similar analyses with samples from adult cancer patients, particularly those bearing the same type of cancers.

      4) The authors claimed that the frequency and cytotoxicity of peripheral NK cells were reduced in young pediatric cancer patients, compared with healthy controls, but these parameters returned to normal in older pediatric cancer patients (>8yrs). Can they separately compare young and old patients with age-matched controls?

      5) The authors believe that diminished killing of tumor cells by NK cells from pediatric cancer patients was due to decreased cytotoxic capacity, rather than inefficient recognition or degranulation. More experimental evidence is needed to substantiate this conclusion. These NK cells were significantly shifted to an immunosuppressive/tolerant pattern (high in PD-1 and NKG2A, but low in perforin and Granzyme-B), while long-term (14 days) stimulation with IL-2 can improve their cytotoxicity. Can short-term IL-2 treatment achieve similar effects (e.g. increased cytotoxicity, elevated expression of lytic molecules and CD57)? Since the frequency and cytotoxicity of NK cells in older pediatric cancer patients (>8yrs) were actually similar to those in normal children, do serum IL-2 levels increase in older cancer patients?

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      Summary:

      Dr Taylor and colleagues aimed to emphasize NK cell-related defects in pediatric cancer patients, in comparison to healthy children. This study was potentially interesting, although it was based on descriptive analyses, lacking mechanistic exploration. In addition, this study included a mixed cohort of pediatric patients bearing tumors of different types, stages, and perhaps distinct therapeutic regimens. Some conclusions were not strongly supported by current experimental evidence. It remains unknown whether similar differences can be found between adult cancer patients and age-matched healthy individuals. To address all these above points, a large amount of further work will be necessary.

    1. This manuscript is in revision at eLife (August 17, 2020)

    2. Reviewer #2:

      Kroll et al. presented a strategy to achieve biallelic knockout effects in the founder (F0) generation of zebrafish, by targeting three different loci within the same target gene, with injection of Cas9 RNP mixtures. They showed that in addition to targeting single genes, this method could be successfully used to create double knockouts of the slc24a5 and tbx5a gene pair, or the tyr and ta gene pair, in F0 embryos. Strikingly, they also demonstrated direct generation of triple gene knockouts of mitfa, mpv17 and slc45a2 in F0 larvae, which fully recapitulated the pigmentation defects of the crystal mutant. Furthermore, they provide evidence of the feasibility of their method in dissecting complex and multi-parameter behavioural traits in the biallelic F0 knockouts of the trpa1b, csnk1db and scn1lab genes. Interestingly, they established a rapid sequencing-free method to evaluate the activity of Cas9 RNP by using headloop PCR, facilitating the selection of target sites. Finally, the authors proposed a three-step protocol for F0 knockout screens in zebrafish. The strategy described here is quite impressive, and represents an evident improvement on the method published by Wu et al. (Developmental Cell, 2018), which was based on the administration of four Cas9/gRNA RNPs. Nevertheless, the manuscript could be further clarified and improved in the following aspects.

      1) What are the essential differences in methodology of this method compared with that reported by Wu et al. in 2018 (Developmental Cell)? Or why and how the target sites could be reduced to three from four?

      2) Several genes were tested in both studies, such as slc24a5, tyr, tbx16, and tbx5a. Did you use or compare the same target sites in these genes as reported by Wu et al.?

      3) Is the dosage/amount of Cas9 or RNP used in this study different or comparable with Wu et al.? Does it account for the improvement of the method described in the study?

      4) The authors propose to design the three target sites in distinct exons within each gene. Is this really important and/or necessary to achieve highly efficient biallelic knockouts? Any evidence?

      5) According to the MATERIALS AND METHODS section, the synthetic gRNA was made of two components, i.e., crRNA and tracrRNA. Synthesis of gRNA as a single molecule by in vitro transcription is usually more popular and economical. Is it really necessary to use crRNA and tracrRNA to achieve highly efficient biallelic knockouts? Any evidence?

      6) Could headloop PCR be used for the quantification of mutagenesis efficiency (indel-producing mutation rate) of Cas9/gRNA? How sensitive is this method? Could small indels (such as 1-bp insertion or deletion) be detected by the headloop PCR?

      7) In addition to indels, deletions between two double strand breaks induced by two gRNAs are also important for the generation of biallelic knockouts of the target gene. The authors showed the analysis of mutations in each site (such as in Fig. 2A), is it possible to quantify the distribution and contribution of all the different deletions?

      8) Fig. 1C and 1D: The authors compared the effects of the injection of 1, 2, 3, and 4 loci. How were the 1, 2, and 3 loci selected from the four target sites? Will each of the four loci give the same or different phenotypic ratio if tested individually? Will different combinations of 2 loci or 3 loci give the same or different phenotypic ratio? Or which combination of 2 loci or 3 loci will give the highest mutagenic effect? For example, in Fig. 1C, the 3-loci showed comparable effect with 4-loci, while the 2-loci is less effective; is it possible to find other 2-loci combinations which could show higher mutagenic efficiency than the current 2-loci, such that the effect of the new 2-loci combination is as good as the 3-loci or 4-loci combination? Conversely, in Fig. 1D, the 2-loci already showed the highest mutagenic effect, is it because of this particular 2-loci combination, or any 2-loci combination will show the same efficiency?

      9) Figure 6: The phenotypes of scn1lab F0 knockouts are more severe than those of scn1lab-/- mutant. Any explanation?

    3. Reviewer #1:

      Kroll and colleagues describe an efficient strategy to reliably generate F0 zebrafish embryos with (multiple) genes knocked out using CRISPR/Cas9 RNPs. In their most dramatic and broadly applicable proof-of-principle experiment, authors demonstrate successful recapitulation of the triple mutant crystal phenotype in 9/10 F0 embryos. As the authors point out, their methodology is extremely likely to be adapted for candidate genes for traits which display a range of phenotypes among wild type embryos or larvae.

      The manuscript points out a rather obvious but somehow underreported feature of NHEJ-based mutagenesis: assuming random size of indels, when 100% of DNA is mutated, fewer than 50% (0.67 x 0.67) of cells in an embryo will contain frameshift mutations in both alleles. Thus, successful recapitulation of a mutant phenotype in an F0 embryo relies on mutagenesis of an essential part of the protein (not always as straightforward as it seems), utilization of other repair pathways such as MMEJ (not always reliable), or fortuitous help from largely unknown factors which skew the distribution of indel sizes (multiple guide RNAs would need to be tested without guarantee of success). Simultaneously designing several guide RNAs against the gene and co-injecting them, as the authors propose, seems to be an excellent and straightforward strategy.
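
      The arithmetic behind this point can be written out explicitly. The sketch below is a simplified illustration under stated assumptions, not the authors' model. It assumes every targeted locus is mutated on every allele, that an indel causes a frameshift with probability 2/3 (its length is not a multiple of three), and that loci and alleles behave independently. The function name is hypothetical.

```python
from fractions import Fraction


def biallelic_frameshift_prob(n_loci=1, p_frameshift=Fraction(2, 3)):
    """Probability that both alleles carry at least one frameshift.

    Assumes every targeted locus is mutated on every allele, loci are
    independent, and each indel is a frameshift with probability 2/3.
    """
    # An allele escapes only if none of its mutated loci is a frameshift.
    per_allele = 1 - (1 - p_frameshift) ** n_loci
    return per_allele ** 2


for k in (1, 2, 3, 4):
    print(k, float(biallelic_frameshift_prob(k)))
```

      With one locus this gives 4/9, or about 44% of cells, matching the reviewer's "fewer than 50%". Under the same idealized assumptions, three loci give 676/729, or about 93%, which illustrates why co-injecting several guide RNAs is attractive. Real efficiencies will differ because mutagenesis is rarely complete and indel sizes are not uniformly distributed.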

      My most significant criticism is that although new to zebrafish, the described strategies - use multiple guide RNAs and headloop PCR - have been successfully deployed in other systems. Adapting these strategies to the zebrafish model system offers tremendous value, but the distinction between development of new methods and adoption of existing methodologies must be considered.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 4 of the manuscript.

      Summary:

      The authors describe a new efficient strategy to reliably generate F0 zebrafish embryos with (multiple) genes knocked out using CRISPR/Cas9 RNPs. They showed that in addition to targeting single genes, this method could be successfully used to create double knockouts of the slc24a5 and tbx5a gene pair, or the tyr and ta gene pair, in F0 embryos. Strikingly, they also demonstrated direct generation of triple gene knockouts of mitfa, mpv17 and slc45a2 in F0 larvae, which fully recapitulated the pigmentation defects of the crystal mutant. Their methodology is extremely likely to be adapted for candidate genes for traits which display a range of phenotypes among wild type embryos or larvae.

      This is a new tool for the zebrafish community. Despite the presented data on several loci, it is not clear whether and how this method is better compared to a series of prior related F0 approaches. This question is the crux of this methods manuscript.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on September 9 2020, follows.

      Summary

      The role of extracellular vesicles (EVs) such as exosomes as factors potentiating metastasis by solid tumors has attracted considerable recent interest. The article by Ghoroghi et al. is a very complete and thorough study of the role of two small GTPases, RalA and RalB, in EV release and breast carcinoma progression. The main advance of this manuscript is to describe a signaling role for the GTPases RalA and RalB in regulating phospholipase D (PLD) and multivesicular body function to regulate exosome biogenesis. They also show that RalA and RalB regulate expression of MCAM/CD146 on EVs, and that reduced levels of CD146 on EVs affect the efficiency of lung cancer metastasis. Finally, they show that RalA, RalB and CD146 levels are indicators of poor prognosis for breast cancer patients. Overall these are interesting observations with clinical relevance. The study is extremely carefully performed, in general, with appropriate controls and conclusions. There are a limited number of weaknesses that need to be addressed.

      Essential Revisions

      1) Figure 3C shows that RALA and RALB have functions where they do not always act in series with each other. They may for exosome excretion, but not necessarily for proliferation. For Supp. Fig. 4C, this experiment needs to be longer than 72 hours for a proliferation assay to be truly convincing. While interesting that proliferation results differ between mice and growth on plastic, a 10 day growth experiment would lead to greater conviction that this is a sustained difference. Figure 3D better agrees with Supp. Fig 4C, where loss of RALB gives greater proliferation over control, albeit not as great as RALA loss. Was Fig. 3D performed at 12 days? The variable time points make it difficult to assess the role of RALA/B on growth.

      2) Figure 4 carefully describes why RALA/B deficient EVs fail to prime regions for metastasis. As the EVs are not directed to those locations, the contents fail to increase permeability of the regions. However, statistically significant does not mean biologically meaningful; specifically, looking at Fig 4f and 4g, knockdown of RALA/B had little effect on EV internalization. One question that is not answered is whether RALA/B are acting in series or parallel. Concurrent depletion of both RALA and RALB in the types of experiment performed in Figure 4 would help answer this question. If depletion of both isoforms is greater than either alone, then it implies they act in parallel.

      3) How are RALA/B activated in the context of 4T1 cells? The use of the RAL inhibitor in Figure 1B is most effective in cell lines without RAS mutations. Panc1 and MD231 cells have KRAS mutations, yet RALGEF inhibitors are least effective. At the concentrations used, all RALGEF activity should be inhibited. Are RAL expression levels similar between cell lines?

      4) The logic for targeting PLD1 seems rushed. There are several RAL effectors, and while the data on PLD1 are compelling, it is unclear why only this effector was selected. Were other effectors tested? The Hyenne paper, which is cited throughout, provided stronger logic for connecting these pathways. The statement in the discussion, "Our work further identifies PLD as the most likely effector acting downstream of RAL to control exosome section" is incorrect as stated. This work identified PLD1 as A downstream effector important for exosome secretion. To state that it is the most likely effector is an overstatement as no other RAL effectors were tested.

      5) The use of either PLD inhibitor at 10 uM will have significant inhibition of both PLD1 and PLD2 regardless which inhibitor is being tested, as well as additional off-target effects. Given both inhibitors have nanomolar IC50 values for their specific PLD isoform and low micromolar IC50 values for the related isoform, these experiments are inconclusive as presented. This section of the text should be reworked to account for the lack of isoform selectivity at the concentrations of inhibitor used. a. siRNA or shRNA could be used to validate the PLD1/2 inhibitor data.

      6) In all cases when using inhibitors, there are no western blots or additional validation showing target inhibition in any cell line. In this case, the use of such high concentrations of inhibitor likely resulted in inhibition of the target proteins. However, such high concentrations often result in off-target effects. Determining the lowest efficacious dose to inhibit target function should have been performed and used throughout the experiments.

      7) Would loss of MCAM expression, via direct knockdown, result in EVs that are unable to permeabilize HUVEC cells, as in Fig 4A/B? Alternatively, does loss of MCAM expression (or blocking via anti-CD146) result in decreased metastasis, similar to RALA/B knockdown? While Fig. 5g shows a decrease in EV localization to lungs on treatment with Anti-CD146, the rate of metastatic lesion formation was not assessed when CD146 was blocked.

      8) In a previous study (Hyenne et al, 2015) the authors already demonstrated a role for RalA and RalB in controlling MVB and EV secretion in the 4T1 breast cancer model, in a process evolutionarily conserved through nematodes. They also wrote a follow-up review article describing considerable evidence in the literature for RalA controlling PLD and ARF6 function to control EV secretion. This diminishes the novelty and interest of the first part of the study, which could be reduced. The more novel and thought-provoking parts of the study are in the definition of the relationship between the RAL proteins and CD146, and the identification of unique properties of the exosomes produced in cells with manipulated RAL proteins (for instance, in regard to influencing exosome permeability). These could be better emphasized.

      9) Almost the entire mechanistic model is based on the work of a single breast cancer cell line, 4T1, which has been described by some as triple negative. This is a significant weakness, as there may be features of that line that make it uncharacteristic of breast cancer in general. At least some of the key conclusions should be functionally confirmed in an additional cell model.

      10) In addition, breast cancer cells fall into multiple different subtypes, which have different metastatic propensities and gene expression patterns. Is the functional relationship between the expression of RALA/B and CD146 in exosomes observed in just triple negative cells, or in other subtypes?

      11) In Fig 5h, K-M analysis of TCGA shows a weak relationship between MCAM expression and survival. Were the tumors analyzed segregated by tumor subtype? Were data corrected for tumor stage? This is important, as the MCAM staining pattern may reflect a propensity for MCAM expression in tumors at a late stage, or subtypes with a poorer prognosis.

      12) The authors provide observations suggesting that RalA and B control primarily the EV secretion pathway involving Multivesicular bodies, hence leading to exosome secretion. This is mainly demonstrated by observation of a decrease in MVBs upon knock-down of RalA/B, demonstrated by thorough electron microscopy analyses. This is correlative, rather than truly demonstrative, but the best one can do so far. In most experiments, the authors use EVs isolated by a relatively crude method, ultracentrifugation, that co-isolates non-specific components, and they do not analyse larger EVs that can be recovered at lower centrifugation speed, thus an effect of RalA/B on these non-exosomal EVs cannot be excluded. The EVs are only characterized by their number (NTA counting), which is not very precise, and not consistent with guidelines of ISEV (MISEV 2018, J Extracell Vesicles 2018, 7: 1535750).

      Maybe the authors can argue that they did perform more complete analyses of their 4T1 EVs in a previous article? Did they use the EV-TRACK website to verify their experimental EV isolation and characterization setup? Of note, the authors also perform quantitative proteomic and RNomic analyses, which give a better characterization of EVs. The protein composition and its change upon RalA/B KD could have been used to try to confirm (or not) the MVB origin of the EVs controlled by RalA or RalB, but it is not crucial for the message. Alternatively, to demonstrate that the CD146-bearing EVs that carry the prometastatic function are bona fide exosomes, the authors could have shown CD146 localization in the cells, with or without RalA/B depletion, and shown whether a drastic change in the ratio of localization in MVBs vs the PM occurs, but this would be an additional study, not necessary for the current paper.

      13) The authors insist that RalA/B control exosome secretion, and discuss the bases and limitations of their demonstration properly. The summarizing schemes (fig2f and fig5i) of their model show the release of RalA/B-dependent pro-metastatic exosomes, and RalA/B-independent exosomes which are not pro-metastatic. However, EVs that are released in the absence of RalA/B could instead be formed at the plasma membrane, and correspond to ectosomes. Nothing in this study demonstrates the origin of RalA/B-independent EVs, thus PM-derived EVs should be represented in the scheme.

    1. Reviewer #3:

      In this manuscript, Chakravarti and colleagues analyzed the functions of several p53 isoforms in the Drosophila germline. They created novel isoform-specific alleles by CRISPR/Cas9 to untangle the functions of p53A and p53B isoforms. They made use of a Phid-GFP reporter line to follow p53 transcriptional activity. The role of p53 in the development of Drosophila germline has been published several times before with a focus on the silencing of retro-transposons (TEs) and meiotic DNA breaks response (Lu, 2010; Wylie, 2014; Wylie, 2016). Despite this published literature, the authors created novel and very valuable tools, which allowed them to make several novel and interesting observations. My main criticism is that most of these observations remain unexplained and the manuscript feels descriptive as it stands. However, this manuscript has great potential if it could follow up some of these novel observations. Some examples are the following:

      1) In Figure 5C, the authors made the interesting observation that hid-GFP was stronger in region 1 of p53A-B+ than in the wild type p53A+B+. This activity of p53 cannot be explained by meiotic DSBs as previously published, since meiotic DSBs only occur later in region 2. This observation remains unexplained and is not explored further.

      One possibility is that it could relate to transposable element (TE) activity in this region. TEs can create DSBs (thus non-meiotic) and p53 has been published to silence TEs in Drosophila (Wylie, 2014; Wylie, 2016). It is also particularly interesting that the silencing of TEs is known to be weakened in this specific region of the germarium even in the wild type condition (Dufourt J, NAR, 2013; Theron E, NAR, 2018). Could p53A play a role in silencing TEs in this region when Piwi is downregulated? This would bring novel insights on when and where TEs are silenced in germ cells.

      A transcriptomic analysis of p53A-B+ germ cells could show whether TEs are upregulated in these hid-GFP++ cells. It is probably out of the scope of this manuscript. Another possibility would be to perform FISH for TEs known to be expressed in p53 mutants, such as TAHRE (Wylie, 2016). In addition, do the authors detect DSBs in region 1 in p53A-B+?

      2) In Figures 7 and 8, the authors analyzed the role of p53 in "persistent" meiotic DSBs. I am not convinced that these DSBs are only persistent meiotic DSBs. As discussed by the authors themselves (page 13), the origin of these DSBs could be TE mobilization. I think it is a very important caveat for their conclusions. Another non-exclusive possibility for DSBs appearing in endoreplicating nurse cells is incomplete replication and associated DNA deletions during repair, as shown in Yarosh and Spradling (GD, 2014).

      To distinguish between these possibilities and strengthen their conclusions, the authors should perform the same experiments in the absence of meiotic DSBs, such as in a meiW68 mutant background (meiW68, p53AB double mutant). meiW68, okra, p53 mutants may be hard to generate but shRNAs against meiW68 are publicly available and effective, while they may also exist for okra or other spindle genes, and could make this combination easier to generate.

      3) The authors showed that p53A and p53B levels are developmentally regulated (Figure 6G): does overexpression of one or both of the isoforms have any phenotype?

      4) I agree with the authors that karyosome defects are part of an array of phenotypes induced by the activation of DNA damage checkpoints. However, I would not equate it with the activation of a pachytene checkpoint and conclude that p53 is part of that checkpoint.

      5) In Figure 7D, in p53A+B-, there seems to be a lot of DNA damage in follicular cells. Is this reproducible?

    2. Reviewer #2:

      The Drosophila genome encodes multiple p53 isoforms. P53 is an important factor in maintaining genome integrity and having multiple isoforms in flies raises an interesting evolutionary concept because humans have a gene family of p53 members. In this paper, the expression and function of the isoforms are compared in the germ line. There are two significant findings based on investigating these two isoforms. First, the apoptotic response depends on the A form, and both have roles in the response to meiotic DSBs. These results represent a significant and important extension of previous work from another group that showed p53 suppresses transposon activity.

      With one important exception, the data are solid and support the conclusions. The data regarding the apoptotic response is based on TUNEL and a hid-GFP reporter. This data shows that irradiation induces a response in the mitotic region but not later regions. Conversely, there is a milder induction in the meiotic region (region 2a). Both could be in response to DSBs. But it is amazing that there is no HID induction following IR in these meiotic regions. Thus, there is a satisfying correlation between the apoptosis and HID responses to IR, and both are diminished in the meiotic region.

      The most significant concern with this paper is the conclusion that the p53 isoforms respond to meiotic DNA breaks. Indeed, this is the title of the section starting at the end of pg 7, but there are no experiments which lead to this conclusion. Similarly, the sentence "To determine whether p53A or p53B isoforms responds to meiotic DNA breaks" (pg 8), is followed by an experiment which does not do that (it compares HID expression in different p53 genotypes). The data in the paper are correlations between p53 expression and where DSBs occur in the germarium. Two experiments are needed. First, and most important, hid-GFP expression needs to be analyzed in a mei-W68 mutant. In addition, the germarium should be stained for both HID and gH2AV, the latter being the antibody the authors use in later Figures. It would also be satisfying to see the genotypes in Figure 7 performed in a mei-W68 mutant background, to determine if the persistent DNA damage in the p53 mutants depends on meiotic breaks.

    3. Reviewer #1:

      In this manuscript Chakravarti et al build on the previous work from the Calvi lab characterizing specific roles for the p53A isoform. In their 2015 paper Zhang et al showed, using isoform specific loss of function mutants, that p53A is primarily responsible for mediating the apoptotic response to ionizing radiation in the soma and that p53B is very lowly expressed in the cell types studied. They speculated that p53B might function in germline specific roles, such as meiotic checkpoints and DNA repair, identified in mammalian p53 studies.

      Here Chakravarti et al have further characterized the functions of the p53A and B isoforms in Drosophila. In the ovary, p53A mediates the apoptotic response to IR and is also required for meiotic checkpoint activation. p53B is both necessary and sufficient for repair of meiotic breaks in nurse cells but not oocytes. p53B is required for expression of a hid-GFP reporter in region 2a-2b cells, which may be related to a loss of p53B detection in p53A/B nuclear bodies at that stage.

      There are no substantive concerns with this manuscript.

      Minor concerns: CRISPR/Cas9 was used to create isoform-specific mutants for both p53A and p53B. RT-PCR was used to show the mutant alleles are isoform specific and that neither disrupts the expression of the other's endogenous protein. The RT-PCR assay can only assess the expression of isoforms, not their function as the authors state.

      The authors noted that, even in the absence of IR, there was low level hid-GFP expression in late region 1/early region 2, the point when meiotic DSBs are induced by Mei-W68. Quantitation of hid-GFP expression in the various p53(A+/-,B+/-) mutant backgrounds showed that hid-GFP expression in the absence of IR requires p53 activity and that both isoforms are capable of activating hid-GFP expression. The authors suggest that the increased and earlier expression of hid-GFP seen in the p53A-/p53B+ mutant is due to precocious hyperactivation by p53B unrelated to meiotic breaks which have yet to occur. The authors then seem to contradict themselves saying that p53 reporter construct expression is dependent on Mei-W68, and both isoforms respond to DSBs. Since p53B is capable of precocious activation of at least one p53 target in the absence of p53A expression it is not clear that meiotic breaks themselves directly regulate p53B activity. From the data presented it seems plausible that p53A responding to DSBs might attenuate p53B activity. Quantitation of p53A and p53B levels across oogenesis shows a transient reduction in p53B levels in regions 2a-2b which coincides with the timing of meiotic breaks. Again, it is unclear whether this is a direct response of p53B to meiotic breaks. The authors suggest this change in p53B concentration in the p53A/B body might be due to transient relocalization from the p53A/B body to the nucleoplasm and back but that variation in fluorescence intensity makes it impossible to accurately assess levels in the nucleoplasm to confirm this. While p53B is undetectable in region 2a-2b cells, its presence is required there for expression of hid-GFP, thus translocation from the p53A/B body to fulfill this function is plausible.

      The figures are well done and appropriate to the message, however, in the fluorescent images the high background in the mCh channel makes it difficult to see the true signal and it is often completely lost in the merged images. Perhaps use of a greyscale panel would be more informative.

      In 2019 Park et al, using Gal4/UAS transgenes in a p53 null background, concluded that both p53A and p53B mediated the apoptotic response to IR in the Drosophila ovary. I feel the authors adequately addressed this issue in stating that their current results using loss of function, isoform-specific alleles at the endogenous locus better reflect the true physiological response. Thus, I feel their conclusions on the role of p53 in the ovary have more merit.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.

      This manuscript is in revision at eLife.

      Summary:

      The authors have generated new and useful p53 reagents, which they have employed in four functional assays: apoptosis (TUNEL after 40 Gy irradiation; Figures 2-3), transcriptional induction (monitored by hid-GFP; Figures 4-5), double stranded DNA breaks (DSB) (monitored by gammaH2AV; Figures 7-8) and activation of the pachytene checkpoint (monitored by synaptonemal complex protein C(3)G; Figure 8F-K).

      The main findings are: 1) the apoptotic response to ionizing radiation (IR) depends on p53A; 2) expression of hid-GFP in region 2a-2b germ cells requires p53B; 3) DSBs occur at higher rates in both the p53A and the p53B mutants; and 4) p53B can mediate repair of meiotic breaks in nurse cells but not in oocytes.

      Despite the generation of high-quality, new reagents, this paper is currently fairly descriptive. Of 8 figures, two show the expression pattern of the tagged p53 isoforms in various parts of the germarium (Figs 1 and 6). Some of the observations based on functional assays remain unexplained and need further experiments, including points 1 and 2 below.

      1) The authors conclude that the p53 isoforms respond to meiotic DNA breaks, but there are no experiments which lead to this conclusion. If the authors want to conclude this, they need (a) to analyze hid-GFP expression in a mei-W68 mutant and (b) to stain the germarium for both HID and gammaH2AV. The authors should also examine meiotic breaks in p53A+B+, p53A-B-, p53A-B+ and p53A+B- in a background that is also mei-W68 mutant.

      2) The authors are missing a more detailed analysis of the interesting observation that hid-GFP is stronger in region 1 of p53A-B+ than in the wild type p53A+B+. This observation cannot be explained by meiotic DSBs (which occur in region 2), but the authors do not provide a mechanism. Is this due to transposable elements? The authors need to supply new data to provide a mechanistic understanding of this observation.

      3) The authors are encouraged to provide better data to support the conclusion that the DNA damage phenotypes of p53 and okra mutants are comparable. The images in Figs. 7, 8B and B' are not sufficient to assess this. The authors could quantify the number of gammaH2AV foci or intensity (rather than measure the number of positive cells). Related to this, it is surprising that p53 mutants lack the DV defects seen in okra mutants, particularly since defects in DSB repair should cause nondisjunction. Okra mutants are sterile. The authors should comment upon the fertility of p53 mutants.

      4) Some experiments have only 2 biological replicates (Figs 4 and 8K). Figs 7 and 8 have "2-3 replicates". The authors need to state specifically for each experiment how many replicates were scored. Ideally, they should have at least 3 replicates for each experiment or explain why that is not necessary.

    1. Reviewer #3:

      The manuscript studies a theoretical model within the framework of reaction-diffusion equations coupled to signalling gradients to possibly explain the emergence of whisker barrels in the cortex.

      1) The model considered by the authors is identical to the one studied by Karbowski and Ermentrout (2004). The only new features are the extension of the original 1D model to 2D and the addition of an extra term to represent competition in axonal branching.

      2) The authors consider 2 guiding fields. What are their explicit spatial profiles? Since these fields essentially guide the emergent pattern, their profiles, in relation to the geometry of the 2D domain, are crucial. A different profile would certainly lead to a different pattern. I feel that it is not enough to say '...linear signalling gradients aligned with the anterior-posterior and medial-lateral axes....' since the domain is 2D and of non-rectangular shape.

      3) The justification for the introduction of the extra term for competition amongst axons (eqn (3)) is missing. Why that form? What is the reasoning for introducing axonal competition? What essential features of the resultant patterns are missed out if this term is absent? Or has a different form? In the discussion section, the authors mention, without any justification, that the conservation of branch density in each projection is a key requirement for the emergence of barrel patterns. This is totally unclear.

      4) Related to the above point, the authors mention that the axonal branch density is bounded by their dynamics. I presume that the integrations on the RHS of eqn (4) are spatial integrals over the domain. Then how come a spatial index survives in the LHS of this equation? How did the authors arrive at this equation? Is there a continuous-time version of this equation (like a conservation law), i.e., one that does not make a reference to the discrete time-stepping dynamics?

      5) A typical mathematical modelling study should explore the space of relevant parameters to demonstrate the possible range of behaviours that the model can exhibit. This is usually presented as a phase-diagram. The authors do not explore the parameter space (or the possible spatial profiles of the guiding fields) in their study.

      6) Throughout, the authors emphasize the spatial-locality of their mathematical model and conclude 'Hence the simulations demonstrate how a self-organizing system...'. A mathematical model with spatial-locality alone does not imply self-organized dynamics. With a sufficiently large number of spatio-temporal fields (N=42), and the concomitant parameters, and non-autonomous guiding fields, it is possible to reproduce any desired pattern. As such, it is crucial in the mathematical modelling of living systems to delineate the essential requirements from the incidental.

    2. Reviewer #2:

      This is an interesting paper that, with a few assumptions, shows that an old model for areal formation in cortex is sufficient to quantitatively reproduce the patterns of barrels observed in mouse S1. It would appear from the model that the key is the parameters gamma_ij, which are presumably (hypothesized to be) assigned at the level of the thalamus. I have a few questions about the paper:

      1) Does the same model work with respect to projections from the brain stem (barrelettes) to the thalamus (barreloids)? This would be a good way to check the ideas. Related to this, is it true that the barrelettes (barreloids) precede the development of the barreloids (barrels)? That would seem to be necessary. Or perhaps, starting with a double gradient in the thalamus and cortex and a prepattern in the barrelettes, would the correct patterns emerge simultaneously?

      2) There seems to be a strong prediction in this concerning the development of the patterns over time. Panel C indicates that early on there are large distortions in the shape of the barrels, particularly in the D and E rows. Is this known to occur?

      3) It seems to me that without the chi term, possible connections plus axons are conserved, which is reasonable. But with the necessary competition, there seems to be a flaw in the model if they have to renormalize at each point. If axons make connections, should they not be lost from the pool forever (this is the -dc_i/dt term in the model)? For example, since the gradient has no-flux boundaries in the original K&E model, there is conservation of the total number of connections and axons of a given type (\int (a_i + c_i) dx = constant). This principle seems to make sense to me. However, the competition term chi_i seems to disrupt this. Is there a way to introduce the axonal competition that prevents the unrealistic (or biologically implausible, at least) renormalization at each step? I'd be more comfortable with the model if there were a more physiological way to renormalize. For example, I don't know if the authors considered something like an additional flux of the form: \chi a_i \nabla \frac{1}{N-1} \sum_{j \ne i} a_j

      This makes the axons of type i move away from type j while at the same time enforcing conservation without recourse to some sort of postnormalization.
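The conservation argument behind this suggestion can be sketched as follows. This is an illustrative derivation only: it assumes the proposed flux enters the equation for a_i as a divergence term and that the same no-flux boundary condition applies to it on the domain boundary; the symbols \Omega, \mathbf{n} and \bar{a}_{-i} are notation introduced here, not taken from the manuscript.

```latex
% Assumption: the proposed flux enters the a_i equation as a divergence term.
% \bar{a}_{-i} is shorthand (introduced here) for the mean density of all other axon types.
\frac{\partial a_i}{\partial t} = \dots + \nabla \cdot \left( \chi \, a_i \, \nabla \bar{a}_{-i} \right),
\qquad
\bar{a}_{-i} = \frac{1}{N-1} \sum_{j \ne i} a_j .

% Integrate over the domain \Omega and apply the divergence theorem.
% With no flux through the boundary \partial\Omega (outward normal \mathbf{n}),
% the competition term contributes nothing to the total:
\int_{\Omega} \nabla \cdot \left( \chi \, a_i \, \nabla \bar{a}_{-i} \right) dx
  = \oint_{\partial\Omega} \chi \, a_i \, \nabla \bar{a}_{-i} \cdot \mathbf{n} \; dS
  = 0 .

% Hence, if the remaining terms already conserve the total (as stated for the
% original K&E model), the total is still conserved with competition included:
\frac{d}{dt} \int_{\Omega} \left( a_i + c_i \right) dx = 0 .
```

Written this way, the term transports axons of type i down the gradient of the other types' mean density, so it redistributes a_i within the domain without creating or destroying it.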

    3. Reviewer #1:

      This compact paper proposes a self-organization model for formation of whisker barrels. The key idea is that reaction-diffusion dynamics can lead to the observed topology, in the absence of pre-defined centers for the barrels.

      The model is well presented and the motivation of mathematical choices is mostly clear. It may be worth expanding on the motivation for competition for axonal branching (equations 3 and 4).

      It is a little unclear how the misexpression experiment (Shimogori and Grove 2005) in Panel E was done. The simulation approach and outcome for this section is described very tersely.

      The authors also mention another easily modeled experiment, in which capybara brains lack barrels because they are big. It should be a simple matter to do this run.

      Overall I feel this study presents an attractive and compact model for the formation of whisker barrels, which has good biological motivation, and does a good job of reducing assumptions and molecular guidance cues.

    4. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 10, 2020, follows.

      Summary

      This compact paper proposes a self-organization model for formation of whisker barrels. The key idea is that reaction-diffusion dynamics can lead to the observed topology, in the absence of pre-defined centers for the barrels.

      Essential Revisions

      1) How do the authors obtain 41 pairs of gamma values (line 102)? Are these parameters or were they inferred from experiments? This must be better motivated.

      2) The competition term chi_i requires renormalization, which seems biologically implausible. The authors may wish to try a form such as \chi a_i \nabla \frac{1}{N-1} \sum_{j \ne i} a_j which does not need renormalization. Several other points about this competition term are unclear as mentioned in the reviewer comments.

      3) There should be more exploration of the model: some parameter exploration and sensitivity analysis, and some more predictions.

    1. Reviewer #3:

      General assessment

      The authors present a cool new idea: using a large parabolic reflector in combination with a macroscopic lens array and rapidly modulated LED array to enable fast image multiplexing between spatially separated samples. I believe that there may be interesting applications that would benefit from this capability, although the authors have not clearly demonstrated one. The paper is short, and light on discussion, details, and data.

      Major comments

      1) The manuscript does not discuss several standard, key topics for any new microscope paper: "objective" numerical aperture, image resolution, optical aberration (other than distortion, which is discussed), and camera sensor size.

      2) Why was an array of low-performance singlet lenses used? With that selection, the image quality cannot be good. Can the system not be paired with an array of objectives or higher performance multielement lenses?

      3) Fluorescence imaging is not discussed or demonstrated but would obviously increase the impact of the microscope. At least some discussion would be helpful.

      4) Actual HTS applications are almost always implemented in microtiter plates (e.g. a 96-well plate) to reduce reagent costs and enable automated pipetting, etc. I do not believe anyone would implement HTS in thousands of petri dishes. The paper would be strengthened substantially by a demonstration of simultaneous recording from all (or a large subset) of the wells in a 96-well plate. It's not clear whether this is possible due to the blind spot in the center of the parabolic mirror's field of view that is blocked by the camera.

      5) One of the primary motivations for this approach is given in the first paragraph as: "wide-field imaging systems [which capture multiple samples in one frame] have poor light collection efficiency and resolution compared to systems that image a single sample at a given time point." With an f = 100 mm singlet lens, the light collection efficiency of the demonstrated microscope is also low (estimated NA = 0.12) and the resolution is unimpressive with the high-aberration lens and 1x magnification. They demonstrated only trans-illumination applications (e.g. phase contrast), where light collection efficiency is not important. I believe a fancy photography lens mounted directly on a many-megapixel camera set to image all or part of a microtiter plate could likely outperform their system in throughput and simplicity, at least for the demonstrated applications of cardiomyocytes and C. elegans.
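The NA estimate in point 5 can be reproduced with a short calculation. This is a minimal sketch, not the reviewer's own working: the 25 mm clear aperture and the 550 nm wavelength are assumed values chosen for illustration (the review states only f = 100 mm and NA of about 0.12), and the Rayleigh criterion gives a diffraction-limited bound that ignores the aberrations the reviewer mentions.

```python
import math


def numerical_aperture(aperture_diameter_mm, focal_length_mm, refractive_index=1.0):
    """NA = n * sin(theta), where theta is the half-angle of the light cone
    collected from a sample placed at the focal plane of the lens."""
    half_angle = math.atan(aperture_diameter_mm / (2.0 * focal_length_mm))
    return refractive_index * math.sin(half_angle)


def rayleigh_resolution_um(na, wavelength_nm=550.0):
    """Diffraction-limited lateral resolution (Rayleigh criterion), in micrometres."""
    return 0.61 * (wavelength_nm / 1000.0) / na


# Assumed values: 25 mm clear aperture (hypothetical), f = 100 mm (from the review).
na = numerical_aperture(aperture_diameter_mm=25.0, focal_length_mm=100.0)
resolution = rayleigh_resolution_um(na)

print(f"NA = {na:.3f}")
print(f"Rayleigh resolution = {resolution:.2f} um")
```

Under these assumptions the NA comes out close to the reviewer's estimate of 0.12; a smaller clear aperture would lower it further.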

    2. Reviewer #2:

      Astronomers have spent centuries learning how to image the night sky with limited sensor hardware. Ashraf et al present an ingenious adaptation of a technology developed for telescopes (parabolic reflectors) for imaging biological samples. In principle, the approach seems like it could be incredibly useful across a wide range of applications where multiple samples must be imaged in tandem. By placing multiple samples under a single parabolic reflector, multiplexing of samples and imaging hardware can be accomplished without sample-handling robots or moving cameras. The authors highlight two applications: cardiac cells in culture and free-moving nematodes.

      The authors explain the theory behind their technique in a clear and convincing way. However, the biggest challenge in most imaging projects is making the theory work in practice. In its current form, the manuscript falls far short of demonstrating the practical usefulness of parabolic mirrors for imaging biological samples. The authors include only a small amount of image data; for the nematode work, this consists of eight images collected from two plate regions. Data of this scope cannot provide readers or reviewers with sufficient evidence with which to evaluate the quality of the technique.

      1) Are the images shown typical, or are they the best possible images that can be collected from the device? The authors do not provide any quantitative evaluation of the quality of their images, in absolute terms or relative to existing methods, with which to understand the practical performance of parabolic mirrors. The authors should estimate the spatial resolution and dynamic range that can be obtained in practice with the devices, and evaluate how such image quality metrics vary across the entire field of view. Does performance degrade towards the edge of the mirror? Does performance degrade over time, as devices become de-calibrated with use?

      2) The manuscript is additionally weakened by the absence of a non-trivial measurement made with the device. Pilot experiments are included, demonstrating that images can be collected. However, no evidence is provided to show that these images can be used to compare samples and draw biological conclusions from them. A more convincing proof-of-principle would involve the measurement of some non-trivial biological difference between samples measured with the device, either confirming previous work or discovering something new.

      3) The authors highlight the comparative simplicity of their method: it eliminates the need for motorized samples or cameras. However, this simplicity must come at some cost: for example, a substantially increased use of space, an increase in the delicate calibration required, or a higher equipment price. If a 0.25 meter mirror is required to measure four C. elegans plates, how large a mirror would be required to measure 16 plates, the number that can typically be measured using a flatbed scanner? The authors could also expand greatly on other practical issues: for example, is a dedicated imaging table required to align mirrors and samples? Readers would benefit from a clearer evaluation of the practical trade-offs in deploying parabolic mirrors in a laboratory setting relative to other imaging approaches.
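A rough first-order answer to the mirror-size question in point 3 can be sketched as below. It rests on a simplifying assumption made here, not by the authors or the reviewer: the number of plates that fit under the mirror scales with the mirror's aperture area, so the diameter scales with the square root of the plate count. The central blind spot and any loss of image quality towards the mirror edge are ignored, so the real requirement is likely larger.

```python
import math


def scaled_mirror_diameter(reference_diameter_m, reference_plates, target_plates):
    """Scale a mirror diameter assuming usable area is proportional to plate count,
    i.e. diameter grows with the square root of the number of plates."""
    return reference_diameter_m * math.sqrt(target_plates / reference_plates)


# Reference point from the review: a 0.25 m mirror images four plates.
diameter_for_16 = scaled_mirror_diameter(0.25, reference_plates=4, target_plates=16)
print(f"Estimated mirror diameter for 16 plates: {diameter_for_16:.2f} m")
```

Under this area-scaling assumption, quadrupling the plate count doubles the required diameter.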

    3. Reviewer #1:

      Ashraf and colleagues describe an approach to perform high throughput screening imaging without moving parts. The setup is original and offers experimentalists the flexibility to record quasi-simultaneously stacks of images of multiple samples at the full field of resolution of the camera. The optical aberrations inherent to the use of a parabolic mirror are mostly overcome by collimating light from the objective lens. The images require two-step post-processing to account for the image stretching on the detector and for the variation in magnification due to the variation of the distance between the mirror and the image. Two applications illustrate the potential of the solid-state HTS.

      In my opinion, the following points need to be clarified:

      1) How homogeneous is the field of illumination with a single LED? Especially for a large field of illumination, a non-homogeneous illumination would compromise the quantifications.

      2) The accuracy of this ssHTS is related to the robustness at keeping the distance F2 constant between samples. In other words, how sensitive is the image acquisition to the potential variation in the F2 distance between samples as well as within a single large field of view?

      3) The magnification Mc must be explained.

      4) Is the post-processing compensation applied only in the y-direction?

      Assuming that such publication aims to disseminate the use of an ssHTS setup to a wide scientific community, I find the description of the setup as well as the applied image post-processing rather succinct, even with the 3D printing and source codes information.

    4. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 18, 2020, follows.

      Summary

      The reviewers all recognised the originality of your solution to perform high throughput imaging without moving parts. They do have some serious reservations, primarily regarding the evaluation of the quality and utility of the technique, and, in addition to the other points raised, consider it essential that you address the following:

      1) The standard topics for any new microscope paper: "objective" numerical aperture, image resolution, optical aberration, and camera sensor size, together with the specific aspects related to this technique, including dependence on homogeneous illumination, and sensitivity to maintenance of F2 distance.

      2) A substantial expansion of the scope of the data presented, to provide readers with sufficient evidence with which to evaluate the quality of the technique, including proof of principle with a 96-well plate assay.

      3) A direct quantitative comparison with existing HTS imaging solutions.

    1. Reviewer #3:

      Unlike other ionotropic glutamate receptors, GluD2 is not gated by glutamate. No specific or high-affinity chemical modulators that induce channel activity exist for this receptor--as such, its role as a functional channel has been questioned. To address this challenge, the authors have utilized a previously characterized photoswitchable tethered ligand (PTL) called MAGu to target a very non-specific blocker (pentamidine) to a new ion channel target (the GluD2 receptor). This approach (using this exact PTL) has been used to target knock-in cysteine mutants of the GABAA receptor in mouse brain slices and in vivo in an awake, behaving mouse. Based on this precedent, it is not unreasonable to believe that this tool could similarly be used for the GluD2 receptor (which would be a significant advance in the field for understanding the physiological role of this protein in disease), although the authors only characterized MAGu response against GluD2 in heterologous cell culture within this manuscript. Because the GluD2 receptor is not ligand-activated in the traditional sense, the authors have exploited a previously characterized constitutively open point mutant (L654T) as a background to test different photoactivatable GluD2 cysteine mutants and have nicely demonstrated a reversible current block response in the presence of purple (380 nm = "cis-" = channel "on") and green (535 nm = "trans-" = channel "off") light. The authors have numerous publications and experience in the photopharmacology of ion channels, and the characterization data here look solid.

      That said, there are a few questions that should potentially be addressed:

      1) How does MAGu work on the cysteine-engineered receptor that would presumably be used for future in vivo studies? Because the GluD2-I677C point mutant (lacking the L654T background) does not show current, the authors use the known effect of mGlu1 receptor agonism as a readout of GluD2-I677C activity in response to light and only see a 23% decrease in mGlu1 current - is this very small effect physiologically significant or to be expected? It seems like MAGu might be a useful tool to modulate GluD2 in Lurcher mice (which harbor the L654T mutation), but it is hard to know what the probe efficacy and usefulness is for evaluating the physiology of the WT GluD2 receptor in the absence of a way to measure a direct functional effect on the channel. How else might this be addressed?

      2) PTLs have been shown to generate a high local concentration of ligand to accelerate pharmacological response (and in this case, provide some level of specificity for a very non-specific, greasy cation), but it is hard to rationalize "absolute" pharmacological specificity claimed by the authors (line 35, 211). At the mid-micromolar concentrations required to elicit response, it seems unlikely that MAGu will not react with any other extracellular cysteines present in cells. Further, the guanidinium group by itself will certainly not direct the maleimide reactivity towards GluD2 over any other cation channel or electronegative protein surface. The language of this claim should be modified in the absence of other types of specificity assays.

      Minor Comments:

      1) Provide description of the step-by-step protocol for Fig. 2C (or label "washout" of pentamidine)

      2) Why does normalized current plateau at 80% for 535 nm (Fig. 4B)?

      3) There is a current biorxiv paper reporting the GluD2 structure. https://www.biorxiv.org/content/10.1101/2020.01.10.902072v2.full.pdf If this is published during the course of this review, it would be interesting for the authors to comment on how this compares to their homology model and if it makes sense with respect to their mutagenesis experiments.

    2. Reviewer #2:

      The present manuscript investigates the development of a photo-activatable pore blocker to block the glutamate receptor delta receptor (GluD) ion channel as a potential tool to study this receptor in vitro and in vivo. GluD shares structural homology to other members of the family and plays key roles in synapse formation and signaling. However, in contrast to other members of the family, it does not have a clear ionotropic function - complicating defining how it contributes to synaptic function in vivo. Many labs have studied GluD and have provided key insights into its function and role. Still, the availability of a new tool to study and clarify its function has high potential.

      The manuscript lays out quite well, with some minor quibbles (see below), the issues. Proper controls are carried out to define the specificity of the action of the photo-switchable MAGu and how it can alter membrane currents and how it might work. The potential for a photo-switchable pore blocker to study the role of the ion channel in GluD is extremely high. I do have some concerns about signal-to-noise, since the pore block by trans-MAGu is only a fraction of total presumed current through GluD. In addition, how to introduce a specific cysteine in vivo will not be trivial. Still, overall this is an important manuscript that introduces an interesting strategy to study and further clarify GluD in the brain.

      1) Abstract/Introduction. It would be helpful to define early and explicitly what the photoswitchable functional strategy is - that it is working via a pore block mechanism. In the Abstract, for example, instead of calling it '...a photoswitchable ligand.' how about just '...a photoswitchable pore blocker.' Once I realized the general functional strategy (at the beginning of description of results, where it was explicitly stated), everything became clearer. The functional strategy, that you are generating a photoswitchable pore blocker, should also be explicitly stated in the Introduction, where right now it is touched on but not explicitly stated.

      2) Figure 2C. The extent of block for photoswitching is being quantified relative to that for pentamidine, which is reasonable. However, what concentration of pentamidine was used for the experiments? Where does it sit on the concentration-block curve for pentamidine? Presumably, if the block were complete, the leak current should go to zero and hence the efficacy of the photoswitching blocker would be less (e.g., Figure 4B). Please clarify.

      3) Figure 4A. Would be nice to see difference currents and perhaps to contrast to what is shown in Figure 2A. This would clarify the 'voltage-independence' of action for those unfamiliar.

      4) Figure 4D. Not clear how the 'ion channel' or red/green pore was generated? Is this from the structure or from some modeling? Please add details. This is an interesting figure but it is also somewhat speculative, I think, but needs more details to understand its basis. One question is what is driving the positioning of the trans MAGu? Is it being fixed? And what is driving the change in the coloration - presumed pore blocking by trans MAGu?

      Minor Comments:

      1) Figure 1. Minor point. Technically, there is no transmembrane segment 2 (TM2) in iGluRs. M2 is a pore loop, like the P loop in K+ channels, and enters and exits on the same side of the membrane - and does not span the membrane (and hence not a transmembrane segment). Simple solution would be to just rename TM2 to M2 leaving TM1, TM3, and TM4 as is and just noting somewhere in Figure legend that M2 is a non-membrane spanning pore loop.

      2) Figure 2D. Minor point. Although I understand the intent of the figure, it is very hard to discern what is being shown. Might be helpful to remove the 'red' subunit?

    3. Reviewer #1:

      The study by Lemoine and colleagues demonstrates a novel chemogenetic tool to probe ion channel function of GluD2 in HEK cells. By introducing cysteine mutations and engineering a photoswitchable ligand, ionic current carried by constitutively-open GluD2 mutant channels was reversibly decreased by light. Further, GluD2 current produced by activation of mGluRs was partially reduced by light. This tool has the potential to be very powerful to advance the understanding of GluD2 channel function in neurons.

      Major:

      1) The introduction and abstract are rather general and antiquated, to the disservice of the readers. It may be time to move away from the notion that the ion channel function of GluD is debated. The authors have published many elegant studies demonstrating ion channel function. By appearances of the literature, the interpretation of these studies is not contested. In addition to pharmacology, ion channel function of GluD has been demonstrated using selective genetic strategies (e.g. Ady et al., 2013; Benamer et al., 2018; Gantz et al., 2020). To this end, lines 28-29, 51, 55, & 73-75 should be changed. It does not seem fitting to state "direct evidence for ionotropic activity of GluD in neuronal setting [sic] is lacking" provided the studies referenced above. Broadly, the readers would benefit from restructuring of the introduction and abstract to state the specific issue addressed by the present study (i.e. the lack of specific antagonists/pore blockers to study GluD without affecting other iGluRs) and highlight the potential application of the ligand.

      2) This photoswitchable ligand MAGu has great potential to probe GluD channel function in neurons, although the present study stops short of demonstrating its utility in neurons. Lines 211-212 state that the WT receptor is insensitive to MAGu, but it is not clear where those data are presented. It would be beneficial to show the magnitude of the DHPG-induced current in WT GluD2-expressing cells before and after addition of MAGu to address the possibility that MAGu affects the current irrespective of trans- or cis- conformation. It is also not clear how MAGu will be selective for site-specific conjugation when introduced in a neuronal setting. Is it expected MAGu will react with any available cysteine? It would be helpful to discuss possible limitations going forward towards use in neurons.

      3) The data show convincingly that 380 nm light unblocks MAGu-induced GluD2 block by darkness or 535 nm light. But it is not clear how trans-MAGu affected leak current from GluD2 Lurcher mutant channels. In Figure 2C I677C, there is still substantial leak in 535 nm. The quantification in Figure 2C (% photoswitching) shows the % of I-Blockphoto over I-Blockpenta, but from the arrows in the right-hand trace, it would appear I-Blockphoto is actually the current unblocked. It would be helpful to quantify the amount of leak current blocked by trans-MAGu. Additional discussion of the structural basis for incomplete block may also be helpful.

      Minor:

      1) Recommendation to include model system in the title ("in expression systems" or "in HEK cells", vel sim)

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      This manuscript was assessed by three reviewers. After the completion of their reviews, the editor and reviewers discussed the paper and arrived at the following consensus review. For transparency, the individual reviews are also presented.

      Summary:

      Unlike other ionotropic glutamate receptors, GluD2 is not gated by glutamate. No specific or high-affinity chemical modulators that induce channel activity exist for this receptor. To address this challenge, the authors used a previously characterized photoswitchable tethered ligand (PTL) called MAGu to target a very non-specific blocker (pentamidine) to a new ion channel target (the GluD2 receptor). This approach (using this exact PTL) has been used to target knock-in cysteine mutants of the GABAA receptor in mouse brain slices and in vivo in an awake, behaving mouse. Based on this precedent, it is not unreasonable to believe that this tool could similarly be used for the GluD2 receptor, which would be a significant advance in the field for understanding the physiological role of this protein in disease.

      The original reviews, below, reflect the reviewers’ initial enthusiasm for the potential of the approach to study GluD2 channels. In the discussion, all reviewers agreed that the issue of signal-to-noise is critical and that additional experiments are essential to demonstrate that the MAGu response will be sufficient for physiological studies in vivo.

    1. Reviewer #2:

      I very much like the general idea of this paper, but my opinion is that this is not an idea that can/should be applied to these data. As elaborated below, the ABIDE data are from numerous sites with different scanners, imaging acquisition sequences and parameters, sample ascertainment, etc. The methods used in the current paper rely on there not being such heterogeneity; and its presence can either render true ASD-related deviance invisible, or create an illusion of ASD-related deviance where there is none. Such heterogeneity is, of course, problematic for more conventional approaches; but is far more problematic for the methods proposed here.

      Major Issues and Questions:

      1) The authors are critical of case-control models but do not present an alternative to dealing with the heterogeneity in the data. Indeed, linear models are inadequate to deal with the heterogeneity in the ABIDE data given the lack of overlap in the data for different sites. But the normative approach presented here seems to not deal with the problem at all, potentially transforming what would be taken out by a nuisance variable into an alteration in ASD-related deviance.

      2) The sparsity of the data beyond childhood is extremely problematic for this approach. The approach of taking data in one-year bins requires large amounts of data within each bin to make the means and standard deviations reliable. By the teenage years, this is clearly not the case. The authors limit age bins to having at least 3 control points; this is clearly wildly insufficient, and would be even if there were no issues with site heterogeneity. Conventional linear models are to be preferred to normative models under these conditions.

      3) The comparison of results from a case-control model versus a normative model seems misleading. A case-control model approach requires a specification of the age at which the comparison is made. This is not provided, leading one to suspect that the age data were not centered, but were absolute, and thus the differences were essentially projecting backwards to birth. (This is, I believe, a common mistake.) The model specification is also completely lacking. Moreover, a case-control approach does not preclude the possibility of centering the data at different ages (as in e.g. Khundrakpam et al. (2017)). Between this and the problems with heterogeneity for the normative models, it is unclear how to interpret these results.

      4) The idea that individuals that are more than 2 stddevs away from the mean of the controls are outliers and should be eliminated from the analysis seems mistaken. If all individuals with ASD are substantially far from the mean of the controls, they are clearly not to be treated as outliers.

      5) The impact statement claims that "normative modelling has the potential to isolate specific highly deviant subsets of individuals with ASD, which will have implications for understanding the underlying mechanisms and bring clinical impact closer"; there is no indication that that is the case. The normative model has identified primarily children, and has identified nothing in particular about those children. Case-control models have done the same.

      6) It appears to this reviewer that this paper outlines an approach which could be worthwhile in a data set without massive heterogeneity, but within the context of the ABIDE data actually seems harmful.
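
      The concern in point 2 about small age bins can be illustrated with a short simulation. This is a minimal sketch, not an analysis of the ABIDE data: it assumes control values drawn from a standard normal population (true SD = 1), and the function name `sd_estimate_spread` and the bin sizes of 3 and 30 are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def sd_estimate_spread(n_controls, n_repeats=5000):
    """Spread of the sample SD across repeated draws of n_controls values
    from a standard normal population (true SD = 1)."""
    samples = rng.standard_normal((n_repeats, n_controls))
    sd_estimates = samples.std(axis=1, ddof=1)  # sample SD of each simulated bin
    return sd_estimates.std()                   # how much those estimates vary

spread_small = sd_estimate_spread(3)    # a bin with only 3 controls
spread_large = sd_estimate_spread(30)   # a bin with 30 controls
print(f"n=3:  spread of SD estimates = {spread_small:.2f}")
print(f"n=30: spread of SD estimates = {spread_large:.2f}")
```

      Because any deviation score is divided by the bin's SD, an unstable SD estimate translates directly into unstable deviation scores for every subject in that bin.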

    2. Reviewer #1:

      This paper describes the impact of outliers in normative cortical thickness (CT) measurements when examining those suffering from autism spectrum disorder (ASD). The authors used the ABIDE sample and binned subjects by age, and assessed outliers as a function of a "w-score" which they estimated across CT parcellations across the entire cortex. They then demonstrate that cortical thickness differences that can be ascribed to ASD can essentially be attributed to a small number of outliers within the sample. They also demonstrate that this w-score may be sensitive to clinical variables as well.

      Overall, it is unclear to me what the exact goal of the work is: To describe the anatomy of ASD better? To subtype? Or is there another "take-home" message of this paper? I would imagine that the case-control differences in most neurodevelopmental disorders with high heterogeneity and high variability would demonstrate a similar kind of trend. And thus, at the end of the day, I am not sure how much this technique advanced our understanding of ASD.

      Issues and Questions:

      1) It is unclear from the methods how the authors deal with motion and image quality. Recent work by Pardoe and Bedford demonstrate the importance of dealing with this issue, particularly in the context of the ABIDE sample. This would likely have a significant impact on any of the results. It's unclear if the use of the Euler index at the extremes of the distribution of the dataset being used is sufficient. How did the authors come up with their Euler number cut-off?

      2) The W-score could use a much better explanation. It is not clear to me as to what it is and how this should be interpreted. The lack of information regarding the number of age-bins used also makes interpreting these findings confusing in my mind.

      3) The authors report that, "The median number of brain regions per subject with a significant p-value was 1 (out of 308), indicating that the w-score provides a robust measure of atypicality." I guess this could be true, but given the variation in normative ageing and development, I suspect this would also be true of a large number of TD children. That being the case, would it be worth doing a permutation test to determine the threshold of how many "atypical" areas one could expect by chance?

      4) The authors note "Unfortunately, despite a significant female subgroup, the age-wise binning greatly reduced the number of bins with enough data-points in the female group." I understand that this could indeed be a problem. However, I think it would be good for the authors to provide more details. Potentially a histogram to demonstrate the issue. My feeling is that with sex difference with respect to ASD, the more information that could be provided the better. Overall, it is unclear to me as to how useful a sex-specific analysis may be in this particular context given the sample sizes available in ABIDE.

      5) Results, page 8: "Because we also had computed w-scores from our normative age-modelling approach, we identified specific 'statistical outlier' patients for each individual region with w-scores > 2 standard deviations from typical norms and excluded them from the case-control analysis."

      I'm not sure I agree with the premise of this statement. First, it is hard to know without seeing all of the data, but based on Fig 1, it seems that there are ASD individuals that fall on both sides of this distribution. So if there are effect sizes that can be gleaned, this would be in spite of the variability. Second, it would be paramount to determine how many people are outliers-by-region. This, in and of itself, would be useful information. If a significant proportion of individuals can be identified as outliers, this suggests that variability is the norm rather than an exception. I'm skeptical as to whether you get interesting information from removing these individuals from analyses.

      6) Result, page 9: "While the normative modelling approach can be sensitive to different pathology." I don't think you're capturing anything interesting about pathology with this method, especially as it pertains to CT values.

      7) Result, page 9-10: I'm still confused by this notion of atypicality. Presumably this suggests that 5-10% of all ASDs are more than 2SDs from a normative distribution. But is this at both tails of the distribution? There are significant interpretational issues with this. Thus, it is incumbent on the authors to do a better job of describing these distributions.

      8) Part of the rationale of this paper is that using the w-score is far more robust than using simple CT values. I'm sure that residualized CT values could have been used for any of these analyses. If that were to be done how would this change the results?
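
      To make points 2, 5 and 7 concrete, the sketch below shows one way a bin-wise deviation score could be computed and reported separately for each tail. It is only an assumed reading of the w-score (deviation from the control mean of a one-year age bin, in units of the control SD of that bin); the manuscript's actual definition may differ. The function name `w_scores`, the `min_controls` parameter, and the toy values are hypothetical.

```python
import numpy as np

def w_scores(ages, values, is_control, min_controls=3):
    """Deviation of each subject from the control mean of their one-year
    age bin, in units of that bin's control SD (assumed definition)."""
    ages = np.asarray(ages, dtype=float)
    values = np.asarray(values, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)
    bins = np.floor(ages).astype(int)        # one-year age bins
    scores = np.full(values.shape, np.nan)   # NaN where no norm is available
    for b in np.unique(bins):
        in_bin = bins == b
        controls = values[in_bin & is_control]
        if controls.size < min_controls:
            continue                         # too few controls to define a norm
        sd = controls.std(ddof=1)
        if sd == 0:
            continue                         # degenerate bin, no usable spread
        scores[in_bin] = (values[in_bin] - controls.mean()) / sd
    return scores

# Toy data (made up for illustration): thickness in mm for one region.
ages       = [8.2, 8.5, 8.9, 8.4, 9.1, 9.7]
thickness  = [2.9, 3.0, 3.1, 3.4, 2.8, 2.7]
is_control = [True, True, True, False, True, False]

w = w_scores(ages, thickness, is_control)
lower_tail = w < -2   # subjects more than 2 SD below the bin norm
upper_tail = w > 2    # subjects more than 2 SD above the bin norm
print(w, lower_tail.sum(), upper_tail.sum())
```

      Reporting the two tails separately, together with the number of bins that had to be dropped for lack of controls, would address much of the ambiguity raised above.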

      Minor comments and suggestions on presentation:

      1) While this paper has some merits, I found it hard to read. There is not a clear delineation between the methods and the results, and some methodological considerations are written into the results section and vice-versa.

      2) In the introduction, the authors use the word "deviance" to describe what appears more to me like age-related variation and heterogeneity in ASD. Deviance may be too strong a term and easily mis-interpretable. I would suggest replacing it with something a bit more like variation. Also, the work at the institution of the main author (for example by Baron-Cohen and authors) really champions the use of terms like "neurotypical" rather than "normally developing". I think, in general, the authors may want to take their cues from this type of language.

      3) This passage in the Introduction is in need of references. The work by Hong (in Boris Bernhardt's group), Bedford (in Mallar Chakravarty's group), Schuetze (in Signe Bray's group), and Meng-Chuan Lai all come to mind.

      "Even within mesoscopic levels of analysis such as examining brain endophenotypes, heterogeneity is the rule rather than the exception (Ecker, 2017). At the level of structural brain variation, neuroimaging studies have identified various neuroanatomical features that might help identify individuals with autism or reveal elements of a common underlying biology (Ecker, 2017). However, the vast neuroimaging literature is also considerably inconsistent, with reports of hypo- or hyper-connectivity, cortical thinning versus increased grey or white matter, brain overgrowth, arrested growth, etc., leaving stunted progress towards understanding mechanisms driving cortical pathophysiology in ASD."

      4) I found the Discussion missed the mark. It was mostly written as a rehash of the results, with no real biological interpretation. There is not a sufficient examination of the relationship of these findings to other important papers (Khundrakpam, Bedford, Hong, Ecker, Hyde, Lange, etc.).

      5) Figure 3 - The colour bars should be labelled.

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to Version 4 of the preprint: https://www.biorxiv.org/content/10.1101/252593v4

      Summary:

      This paper uses data from the Autism Brain Imaging Data Exchange (ABIDE) to model the relationship between cortical thickness in different brain regions and patients with autism spectrum disorders (ASD) compared to neurotypical controls. The reviewers appreciated the goals and approach of this paper, but, as described below, had questions about the suitability of the data for this analysis, the ways in which the data were processed, the way in which the results were interpreted, and the significance of these findings for understanding autism spectrum disorders.

    1. Reviewer #3:

      The methods used by the authors seem like potentially really useful tools for research on neural activity related to sequences of stimuli. We were excited to see that a new toolbox might be available for these sorts of problems, which are widespread. The authors touch on a number of interesting scenarios and raise relevant issues related to cross-validation and inference of statistical significance. However, given (1) the paucity of code that they've posted, and its specificity to specific exact data and (2) the large literature on latent variable models combined with surrogate data for significance testing, I would hesitate to call TDLM a "framework". Moreover, in trying to present it in this generic way, the authors have made it more difficult to understand exactly what they are doing.

      Overall: This paper presents a novel approach for detecting sequential patterns in neural data; however, it needs more context. What's the contribution overall? How and why is this analysis technique better than, say, Bayesian template matching? Why is it so difficult to understand the details of the method?

      Major Concerns:

      The first and most important problem with this paper is that it is intended (it appears) to be a more detailed and enhanced retelling of the authors' 2019 Cell paper. If this is the case, then it's important that it also be clearer and easier to read and understand than that one was. The authors should follow the normal tradition in computational papers:

      Present a clear and thorough explanation of one use of the method (i.e., MEG observations with discrete stimuli), then present the next approach (i.e., sequences?) with all the details necessary to understand it.

      The authors should start each section with a mathematical explanation of the X's - the equation(s) that describes how they are derived from specific data. Much of the discussion of cross validation actually refers to this mapping.

      Equation 5 also needs a clearer explanation - it would be better to write it as a sum of matrices (because that is clearer) than with the strange "vec" notation. And TAUTO, TF and TR should be described properly - TAUTO is "the identity matrix", TF and TR are "shift matrices, with ones on the first upper and lower off diagonals".
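
      The plain description requested here can be written out directly. The following is a minimal sketch, assuming a linear chain of four states and using NumPy; the names `T_auto`, `T_f` and `T_r`, the choice of which off-diagonal counts as "forward", and the weights are illustrative assumptions and are not taken from the authors' code.

```python
import numpy as np

n_states = 4  # assumed number of states in a linear chain A -> B -> C -> D

# Autocorrelation (self-transition) matrix: the identity.
T_auto = np.eye(n_states)

# Forward shift matrix: ones on the first upper off-diagonal (state i -> i+1).
T_f = np.eye(n_states, k=1)

# Reverse shift matrix: ones on the first lower off-diagonal (state i+1 -> i).
T_r = np.eye(n_states, k=-1)

# A transition matrix written as a weighted sum of matrices; the weights
# are arbitrary placeholders, not estimates from any data.
weights = {"auto": 0.5, "forward": 0.3, "reverse": 0.2}
T = (weights["auto"] * T_auto
     + weights["forward"] * T_f
     + weights["reverse"] * T_r)
print(T)
```

      Written this way, the reverse matrix is simply the transpose of the forward one, and each weight can be read as the contribution of one hypothesised transition pattern.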

      The cross validation schemes need a clear description. Preferably using something like a LaTeX "algorithm" box so that they are precisely explained.

      Recognizing the need to balance readability for a general reader and interest, perhaps the details could be given for the first few problems, and then for subsequent results, the detail could go into a Methods section. Alternatively, the methods section could be done away with (though some things, such as the MEG data acquisition methods are reasonably in the methods).

      Usually, we think about latent variable model problems from a generative perspective. The approach taken in this paper seems to be similar to a Kalman filter with a multinomial observation (which would be equivalent to the logistic regression?), but it's unclear. Making the connection to the extensive literature on dynamical latent variable models would be helpful.

      Minor concerns:

      1) Many of the figures, and some of the text are from the 2019 Cell paper. The methods text is copied verbatim without citation.

      2) The TDLM model is presented without context or comparison to other computational approaches employed to identify sequences. Is it also used in the 2016 Kurth-Nelson paper? How does it compare, e.g., to Bayesian template matching (in the case of hippocampal data)?

      3) Cite literature from recent systems neuroscience using hidden Markov models and related discrete state space approaches on neural activity.

      4) How does this method deal with a long sequence for which the intra-sequences have variance in their delta t's? Or data where the observations have some temporal lag relative to each other?

      5) In the "sequences of sequences" section, the authors talk about combining states into meta states. But in the example they give, it appears they just use their vanilla approach. This whole section belongs in a different place than a "supplemental note". The data need proper attribution, an IACUC/ethics statement, etc.

      6) While code can be useful, it is not archival in the same way equations are. Supplementary Note 1 should be in the Methods, and needs to be rewritten in such a way that it explains the steps (i.e., in an algorithm box) rather than just using code. Moreover, when the data generated via this code is used in the text, this section in the methods can be mentioned/linked.

    2. Reviewer #2:

      This paper addresses the important overall issue of how to detect and quantify sequential structure in neural activity. Such sequences have been studied in the rodent hippocampus for decades, but it has recently become possible to detect them in human MEG (and perhaps even fMRI) data, generating much current excitement and promise in bringing together these fields.

      In this paper, the authors examine and develop in more detail the method previously published in their groundbreaking MEG paper (Liu et al. 2019). The authors demonstrate that by aiming their method at the level of decoded neural data (rather than the sensor-level data) it can be applied to a wide range of data types and settings, such as rodent ephys data, stimulating cross-fertilization. This generality is a strength and distinguishes this work from the typically ad hoc (study-specific) methods that are the norm; this paper could be a first step towards a more domain-general sequence detection method. A further strength is that the general linear modeling framework lends itself well to regressing out potential confounds such as autocorrelations, as the authors show.

      However, our enthusiasm for the paper is limited by several overall issues:

      1) It seems a major claim is that the current method is somehow superior to other methods (e.g. from the abstract: "designed to take care of confounds" implying that other methods do not do this, and "maximize sequence detection ability" implying that other methods are less effective at detection). But there is very little actual comparison with other methods made to substantiate this claim, particularly for sequences of more than two states which have been extensively used in the rodent replay literature (see Tingley and Peyrache, Proc Royal Soc B 2020 for a recent review of the rodent methods; different shuffling procedures are applied to identify sequenceness, see e.g. Farooq et al. Neuron 2019 and Foster, Ann Rev Neurosci 2017). The authors should compare their method to some others in order to support these claims, or at a minimum discuss how their method relates to/improves upon the state of the art.

      2) The scope or generality of the proposed method should be made more explicit in a number of ways. First, it seems the major example is from MEG data with a small number of discrete states; how does the method handle continuous variables and larger state spaces? (The rodent ephys example could potentially address this but not enough detail was provided to understand what was done; see specific comments below.) Second, it appears this method describes sequenceness for a large chunk of data, but cannot tell whether an individual event (such as a hippocampal sharp wave-ripple and associated spiking) forms a sequence or not. Third, there is some inconsistency in the terminology regarding scope: are the authors aiming to detect any kind of temporal structure in neural activity (first sentence of "Overview of TDLM" section) which would include oscillations, or only sequences? These are not fatal issues but should be clearly delineated.

      3) The inference part of the work is potentially very valuable because this is an area that has been well studied in GLM/multiple regression type problems. However, the authors limit themselves to asking "first-order" sequence questions (i.e. whether observed sequenceness is different from random) when key questions -- including whether or not there is evidence of replay -- are actually "second-order" questions because they require a comparison of sequenceness across two conditions (e.g. pre-task and post-task; I'm borrowing this terminology from van der Meer et al. Proc Royal Soc B 2020). The authors should address how to make this kind of comparison using their method.

      Minor Comments:

      1) Some discussion of grounding the question of what is considered a sequence should be included. What may look like a confound to a modeler may or may not be impacting downstream readout neurons; without access to a neural readout it is not a priori clear what our statistical methods "should" be detecting.

      2) The abstract emphasizes hippocampal replay, but no actual analysis of this is done. I don't think performing such analysis is necessary (although it would be a good way to compare this method to others) but the two should be more aligned.

      3) In the "Statistical Inference" section, the authors stated "Permuting time destroys the temporal smoothness of neural data, creating an artificially narrow null distribution...". Did the authors try shift shuffles, which shifts the time dimension of each row rather than randomly permuting it, hence breaking the relationship between variables but keeping their autocorrelation?

      4) In the "Regularization" section, it is hard to tell how L1 outperforms L2 in terms of detecting sequenceness without benchmarking them with ground truth. Are the authors doing this by quantifying decoding performance on withheld task data? Van der Meer et al. Hippocampus 2017 examine this issue for hippocampal place cell data.

      5) As a rodent ephys person I was excited about the application to hippocampal place cell data, but I couldn't understand Figure 5d and the associated supplementary description. In order for me to evaluate this component of the ms, substantially more explanation is needed on how the data is preprocessed and arranged, and what the analysis pipeline looks like. For instance, is the left plot in Fig. 5d an average of all pairwise sequences (each decoded location with its neighbors)? And the right plot is the timescale at which this sequence repeats? If so, the repeat frequency should be at rat theta frequency or a little faster (because of phase precession) so I would expect 9 or 10 Hz max, and was surprised to see what looks like 12 Hz. In the Supplementary note, I found the discussion about running direction distracting; wouldn't it be simpler and easier to understand to analyze only one direction to start? Also, please clarify if the sequence algorithm was run on the raw decoded probabilities, or on the maximum a posteriori (MAP) locations. What happens if there are no spikes in a given time bin (likely to happen with a small 10 ms window) and were putative interneurons excluded (they should be)? Finally, the authors should note that theta sequences can arise from independent spiking of phase precessing neurons (Chadwick et al. eLife 2015) which seems exactly the kind of issue that the multiple regression framework of TDLM should be able to elucidate; what covariates could be added into the model to test Chadwick et al's claim?
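
      The shift shuffle raised in minor comment 3 can be illustrated with a short sketch. This is only a toy example under stated assumptions: the data are synthetic, the function names `shift_shuffle` and `lag1_autocorr` are hypothetical, and variables are stored as columns here (the comment describes them as rows). It contrasts a circular shift of each time course, which keeps autocorrelation nearly intact, with a full permutation of time, which removes it.

```python
import numpy as np

def shift_shuffle(X, rng):
    """Circularly shift each column (one state time course) by its own random offset."""
    n_time, n_states = X.shape
    shifts = rng.integers(1, n_time, size=n_states)
    return np.column_stack([np.roll(X[:, s], shifts[s]) for s in range(n_states)])

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(1)
# Synthetic smooth (autocorrelated) time courses: a moving average of white noise.
raw = rng.normal(size=(1000, 3))
kernel = np.ones(10) / 10
X = np.column_stack([np.convolve(raw[:, s], kernel, mode="same") for s in range(3)])

shifted = shift_shuffle(X, rng)           # breaks cross-variable timing only
permuted = X[rng.permutation(len(X))]     # full time permutation, for contrast

ac_original = lag1_autocorr(X[:, 0])
ac_shifted = lag1_autocorr(shifted[:, 0])
ac_permuted = lag1_autocorr(permuted[:, 0])
```

      In this toy setting the shifted series should retain essentially the same lag-1 autocorrelation as the original, while the permuted series should have autocorrelation near zero; whether this yields a better-calibrated null for the manuscript's data would need to be checked by the authors.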

    3. Reviewer #1:

      This paper describes temporal delayed linear modelling (TDLM), a method for detecting sequential replay during awake rest periods in human neuroimaging data. The method involves first training a classifier to decode states from labeled data, then building linear models that quantify the extent to which one state predicts the next expected state at particular lags, and finally assessing reliability by running the analysis with permuted labels.
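
      The second step summarised above (linear models that quantify how well one state predicts the next at a given lag) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function name `sequenceness`, the planted sequence, and all parameter values are assumptions, and the classifier-training and permutation steps are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_time, true_lag = 4, 2000, 5
# Synthetic "decoded" state time courses: noise plus planted 0 -> 1 -> 2 -> 3 sequences.
X = rng.normal(size=(n_time, n_states))
onsets = rng.choice(n_time - n_states * true_lag, size=100, replace=False)
for s in range(n_states):
    X[onsets + s * true_lag, s] += 3.0

# Hypothesised forward transitions: state i is followed by state i + 1.
T_fwd = np.eye(n_states, k=1)

def sequenceness(X, T, lag):
    """Forward-minus-backward evidence at one lag, from a multiple regression."""
    design = np.column_stack([X[:-lag], np.ones(len(X) - lag)])
    beta, *_ = np.linalg.lstsq(design, X[lag:], rcond=None)
    beta = beta[:-1]  # drop intercept; beta[i, j]: state i at t -> state j at t + lag
    return np.sum(beta * T) - np.sum(beta * T.T)

lags = np.arange(1, 11)
scores = np.array([sequenceness(X, T_fwd, lag) for lag in lags])
best_lag = int(lags[np.argmax(scores)])
```

      Regressing all states jointly at each lag is what allows shared confounds to be partialled out in this kind of framework; a permutation of the state labels in `T_fwd` would supply the null distribution in a fuller version.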

      This method has already been fruitfully used in prior empirical papers by the authors, and this paper serves to present the details of the method and code such that others may make use of it. Based on existing findings, the method seems extremely promising, with potential for widespread interest and adoption in the human neuroimaging community. The paper would benefit, however, from more discussion of the scope of the applicability of the method and its relationship to methods already available in the rodent and (to a lesser extent) human literature.

      1) TDLM is presented as a general tool for detecting replay, with special utility for noninvasive human neuroimaging modalities. The method is tested mainly on MEG data, with one additional demonstration in rodent electrophysiology. Should researchers expect to be able to apply the method directly to EEG or fMRI data? If not, what considerations or modifications would be involved?

      2) How does the method relate to the state of the art methods for detecting replay in electrophysiology data? What precludes using those methods in MEG data or other noninvasive modalities? And conversely, do the authors believe animal replay researchers would benefit from adopting the proposed method?

      3) It would be useful for the authors to comment on the applicability of the method to sleep data, especially as rodent replay decoding methods are routinely used during both awake rest and sleep.

      4) How does the method relate to the Wittkuhn & Schuck fMRI replay detection method? What might be the advantages and disadvantages of each?

      5) The authors make the point that spatial correlation as well as anti-correlation between state patterns reduces the ability to detect sequences. The x axis for Fig 3c begins at zero, demonstrating that lower positive correlation is better than higher positive correlation. Given the common practice of building one classifier to decode multiple states (as opposed to a separate classifier for each state), it would be very useful to provide a demonstration that the relationship in Fig 3c flips (more correlation is better for sequenceness) when spatial correlations are in the negative range.

      6) In the Results, the authors specify using a single time point for spatial patterns, which would seem to be a potentially very noisy estimate. In the Methods, they explain that the data were downsampled from 600 to 100 Hz to improve SNR. It seems likely that downsampling or some other method of increasing SNR will be important for the use of single time point estimates. It would be useful for the authors to comment on this and provide recommendations in the Results section.

      7) While the demonstration that the method works for detecting theta sequences in navigating rodents is very useful, the paper is missing the more basic demonstration that it works for simple replay during awake rest in rodents. This would be important to include to the extent that the authors believe the method will be of use in comparing replay between species.

      8) The authors explain that they "had one condition where we measured resting activity before the subjects saw any stimuli. Therefore, by definition these stimuli could not replay, but we can use the classifiers from these stimuli (measured later) to test the false positive performance of statistical tests on replay." My understanding of the rodent preplay literature is that you might indeed expect meaningful "replay" prior to stimulus exposure, as existing sequential dynamics may be co-opted to represent subsequent stimulus sequences. It may therefore be tricky to assume no sequenceness prior to stimulus exposure.

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 3 of the manuscript.

      Summary:

      The reviewers all felt that the work is extremely valuable: a domain-general replay detection method would be of wide interest and utility. However, as it stands, the paper is lacking context and comparisons to existing methods. Most importantly, the paper would have a larger impact if comparisons with standard replay methods were included. The paper would also benefit from additional detail in the description of the methods and data.

    1. Reviewer #3:

      This manuscript describes measurements of neuronal activity in mice performing a discrimination task, and a new model that links these data to psychophysical performance. The key element of the new model is that sensory neurons are subject to gain modulations that evolve during each trial. They show that the model can produce pure sensory integration, Weber-Fechner performance, or intermediate states that nicely replicate the behavioral observations. This is an interesting and valuable contribution.

      My only significant comment relates to the discussion, which should do more to make sure the reader understands how very different the sensory representation is in this study compared with the great majority of earlier related work in the primate:

      First, choice related signals are not systematically related to stimulus preferences (no Choice Probability). This is mentioned, but only very briefly.

      Second, there appears to be no relationship between stimulus preference (visual field in this case) and noise correlation. Unfortunately, this emerges from the model fits, not an analysis of data. But it is an important difference with profound implications for how the coding of information is organized. It really needs a discussion. It should also be supported by an analysis of correlations in the data. I know some people argue that 2 photon measures make this difficult, but if that's true then surely they can’t be used to support a model in which correlations are a key component.

    2. Reviewer #2:

      In this manuscript, the authors present an in-depth analysis of the properties of sensory responses in several visual areas during performance of an evidence-accumulation task for head-fixed running mice (developed and studied by the authors previously), and of how these properties can illuminate aspects of the performance of mice and rats during pulsatile evidence accumulation, with a focus on the effect of "overall stimulus strength" on discriminability (Weber-Fechner scaling).

      The manuscript is very dense and presents many findings, but the most salient ones are a description of how the variability in the large Ca++ transients evoked by the behaviourally-relevant visual stimuli (towers) is related to several low-level behavioural variables (speed, view) and also variables relevant for the task (future choice, running count of accumulated evidence), and a framework based on multiplicative top-down feedback that seeks to explain some aspects of this variability and ultimately the psychophysical performance in the accumulating-towers task. The first topic is framed in the context of the literature on choice-probability, and the second in the context of "Weber-Fechner" scaling, which in the current task would imply constant performance for given ratios of Left/Right counts as their total number is varied.

      Overall, the demonstration of how trial to trial variability is informative about various relevant variables is important and convincing, and the model with multiplicative feedback is elegant, novel, naturally motivated by the neural data, and an interesting addition to a topic with a long history.

      Main Comments

      1) Non-integrable variability. In addition to 'sensory noise' (independent variability in the magnitude of each pulse), it is critical in the model to include a source of variability whose impact does not decay through temporal averaging (to recover Weber-Fechner asymptotically for large N). This is achieved in the model by positing trial-to-trial variability (but not within-trial) in the dot product of the feedforward (w) and feedback (u) directions. But the way this is done seems to me problematic:

      The authors model variability in wu as LogNormal (pp42 middle). First, the justification for this choice is incorrect as far as I can tell. The authors write: "We model m_R with a lognormal distribution, which is the limiting case of a product of many positive random variables". But neither is the dot product of w and u a product (it's a sum of many products), nor are the elements of this sum positive variables (the vector u has near zero mean and both positive and negative elements allowing different neurons to have opposite preferences on choice - see e.g., fifth line from the end in pp15 where it is stated that u_i<0 for some cells), nor would it have a LogNormal distribution even if the elements of the sum were indeed positive. Without further assumptions, the dot product wu will have a normal distribution with mean and variance dependent on the (chosen) statistics of u and w.

      Two conditions seem to be necessary for uw: it should have a mean positive but close to zero (if it's too large a(t) will explode), and it should have enough variability to make non-integrable noise have an impact in practice. For a normal distribution, this would imply that for approximately half of the trials, wu would need to be negative, meaning a decaying accumulator and effectively no feedback. This does not seem like a sensible strategy that the brain would use.

      The authors should clarify how this LogNormality is justified and whether it is a critical modelling choice (as an aside, although LogNormality in u*w allows non-negativity, low mean and large variability, the fact that it has very long tails sometimes leads to instability in the values of a(t)).

      2) Related to this point, it would be helpful to have more clarity on exactly what is being assumed about the feedback vector u. The neural data suggests u has close to zero mean (across neurons). At the same time, it is posited that u varies across trials (3rd paragraph in pp18: "accumulator feedback is noisy") and that this variability is significant and important (previous comment). However, it would seem like neurons keep their choice preference across trials, meaning the trial to trial variability in each element of u has to be smaller than the mean. The authors only describe variability in uw (LogNormal), but, in addition to the issues just mentioned about this choice, what implications does this have for the variability in u? The logic of the approach would be greatly strengthened if the authors made assumptions about the statistics of u consistent with the neural data, and then derived the statistics of uw.

      3) Overall, it seems like there is an intrinsically hard problem to be solved here, which is not acknowledged: how to obtain large variability in the effective gain of a feedback loop while at the same time keeping the gain "sufficiently restricted", i.e., neither too large and positive (runaway excitation) nor negative (counts are forgotten). While the authors avoid worrying about model parameters by fitting their values from data (with the caveats discussed above), their case would become much stronger if they studied the phenomenology of the model itself, exposing clearly the computational challenges faced and whether robust solutions to these problems exist.
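
      The distributional point in main comment 1 can be illustrated numerically. The sketch below is a toy simulation under assumed statistics (the ranges for w and u, the number of neurons, and the helper `skewness` are all hypothetical and are not taken from the manuscript): a dot product of positive feedforward weights with signed, noisy feedback weights is a sum of many terms, so it comes out roughly symmetric and is negative on a sizeable fraction of trials, unlike a lognormal variable, which is strictly positive and right-skewed.

```python
import numpy as np

rng = np.random.default_rng(2)
n_neurons, n_trials = 200, 5000

# Assumed statistics, for illustration only: fixed positive feedforward weights w,
# and feedback weights u with a small positive mean plus signed trial-to-trial noise.
w = rng.uniform(0.5, 1.5, size=n_neurons)
u = 0.002 + rng.uniform(-0.1, 0.1, size=(n_trials, n_neurons))

gain = u @ w  # one dot product w.u per trial

def skewness(x):
    """Sample skewness (third standardised moment)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

frac_negative = float(np.mean(gain < 0))
gain_skew = skewness(gain)

# A lognormal sample of the same size, for contrast.
lognormal = rng.lognormal(mean=0.0, sigma=1.0, size=n_trials)
lognormal_skew = skewness(lognormal)
```

      Under these assumptions the simulated gain is negative on a substantial minority of trials and has near-zero skew, which is the tension the comment describes between a small positive mean, large variability, and non-negativity.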

    3. Reviewer #1:

      This study investigates the responses of neurons in the parietal cortex of mice (recorded via two-photon Ca imaging) performing a virtual navigation task, and then relates their activity to the animal's psychophysical performance. It is essentially two studies rolled into one. The analysis of neurophysiological activity in the first part shows that visually driven responses in the recorded "cue cells" are strongly modulated by the eventual choice and/or by the integrated quantity that defines that choice (the difference in left vs right stimulus counts), as well as by other task variables, such as running speed. The model comparison study of the second part shows that, in the context of a sensory-motor circuit for performing the task, this type of feedback may account for subtle but robust psychophysical effects observed in the mice from this study and in rats from previous studies from the lab. Notably, the feedback explains intriguing deviations in choice accuracy from the Weber-Fechner law.

      Both parts are interesting and carefully executed, although both are pretty dense; there are a ton of important technical details at each step. I wonder if this isn't too much for a single study. Had I not been reading it as a reviewer, I probably would have stopped after Fig. 4 or just skimmed the rest. After that, the motivation, methods, and analyses shift markedly. I'm not pushing hard on this issue, but I think the authors should ponder it.

      Other comments:

      1) It wasn't clear to me how the time of a particular cue onset was defined. In a real environment the cues would appear small (from afar) and get progressively bigger as the animal advances (at least if they are 3D objects, as depicted in Fig 1). What would be the cue onset in that case, and does the virtual environment work in the same way? This is probably not a serious issue, but it comes across as a bit at odds with the supposed "pulsatile" nature of the sensory stream, and would seem somewhat different from the auditory case with clicks.

      A related question concerns multiple references to cue timing made in the Intro, as if such timing were very precise. This seems strange given that all time points depend on the running speed of the mice, which is probably variable. So, how exactly is cue position converted to cue time, and why is there an assumption of very low variability? Some of this detail may be in previous reports, but it would be important to make at least a brief, explicit clarification early on.

      2) "positively and negatively choice-modulated cells exhibited gradually increasing effect sizes vs. place/time in the trial (Fig. 4e)" I found Fig. 4e confusing. Some curves are monotonic and some are not, and I'm not sure what is the point of showing the shades (which cover everything). If the key point is to contrast SSA and feedback models/effects, then it would be better to plot their corresponding effects directly, on the same graph, or to show predictions versus actual data in each case, in two graphs.

      3) Fig 6 and the accompanying section of the manuscript investigate a variety of models with different architectures (feedback vs purely feedforward) and noise sources. Here, if I understood correctly, the actual cue-driven responses are substituted with variables that are affected by different types of noise. It is this part that I found a bit disconnected from the rest, and somewhat confusing.

      Here, there's a jump from the actual cells to model responses. I think this needs an earlier and more explicit introduction. It is clear what the objective of the modeling effort is; what's unclear are the elements that initially go into it. This is partly because the section jumps off with a discussion about accumulator noise, but the modeling involves many more assumptions (i.e., simplifications about the inputs to the accumulators).

      What I wondered here was, what happened to all the variance that was carefully peeled away from the cue driven responses in the earlier part of the manuscript? Were the dependencies on running speed, viewing angle, contra versus ipsi sensitivity, etc still in play, or were the modeled cue-driven responses considering just the sensory noise from the impulse responses? I apologize if I missed this. I guess the broader question is how exactly the noise sources in the model relate to all the dependencies of the cue cells exposed in the earlier analyses.

      Overall, my general impression is that this section requires more unpacking (perhaps it should become an independent report?).

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      This manuscript carefully studies the properties of sensory responses in several visual areas during performance of a task in which head-fixed mice run along a virtual corridor and must turn toward the side that has more visual cues (small towers) along the wall. The results provide insight into the mechanisms whereby sensory evidence is accumulated and weighted to generate a choice, and into the sources of variability that limit the observed behavioral performance. All reviewers thought the work was generally interesting, carefully done, and novel.

      However, the reviewers' impression was that the manuscript as it stands is very dense. In fact, it is largely two studies with different methods and approaches rolled into one. The first one (physiology) is still dense but less speculative and with interesting, solid results, and the revisions suggested by the reviewers should be relatively straightforward to address. In contrast, the modeling effort is no doubt connected to the physiology, but it really addresses a separate issue. The general feeling was that this material is probably better suited for a separate, subsequent article, for two reasons. First, because it will require substantial further work (see details below), and second, because it adds a fairly complex chapter to an already intricate analysis of the neurophysiological data.

      We suggest that the authors revise the neurophys analyses along the lines suggested below (largely addressing clarity and completeness), leaving out the modeling study for a later report.

    1. Reviewer #2:

      Arg5,6, a polyprotein, is cleaved to produce two proteins, Arg5 and Arg6. The authors report that production of these two proteins is mediated by a mitochondrial protease that is known for its function in N-terminal cleavage.

      The in vitro analysis is interesting, but the possibility of a contaminating activity cannot be ruled out. This needs to be tested by additional experiments, preferably by more data in intact cells.

    2. Reviewer #1:

      This study investigates the biogenesis of Arg5,6 in the yeast S. cerevisiae. Arg5,6 is a polyprotein that was previously established to be proteolytically processed into two proteins (Arg5 and Arg6) that are part of a complex with Arg2. The primary advance reported in the current study is to assign this processing to MPP, a mitochondrial protease known primarily for removing the N-terminal signal peptides from mitochondrial precursors. Additional work showed that the cleavage occurs at an internal sequence that resembles a mitochondrial targeting sequence (MTS), which presumably explains why it is recognised by MPP. This MTS-like internal processing signal is ineffective for directing translocation on its own. Some species contain this polyprotein organisation of Arg5,6, whereas other species encode the two proteins as separate open reading frames. S. cerevisiae Arg5,6 can be replaced effectively by two separately encoded products.

      Specific points:

      1) The authors use purified MPP to show that in vitro synthesized Arg5,6 precursor can be processed to the correct sized products. At that point, the authors "conclude that Arg5,6 is imported into the mitochondrial matrix and processed twice by MPP". This is plausible, but is premature based on the data, which show that MPP is able to process Arg5,6. However, the conclusion that MPP actually does process Arg5,6 in vivo is not documented, and the alternative that something else does this job is not formally excluded. This caveat should be acknowledged unless the authors are able to show necessity of MPP, not just sufficiency.

      2) The experiment showing cleavage with purified MPP (Fig. 1E and S1A) would be strengthened with control experiments using a catalytically inactive mutant of MPP, and an Arg5,6 substrate with a mutated site for cleavage. The first control would rigorously exclude any contaminants, and the second would help verify the site of cleavage.

      3) The conclusion that MPP processes Arg5,6 at the correct site in their in vitro experiments is based on size by SDS-PAGE. The resolution is not sufficient to draw this conclusion, which should be adjusted to say that processing occurs at approximately the correct site (unless the authors perform additional analysis to document the precise cleavage site). Mutagenesis of the putative site (point 2 above) would also be helpful in establishing the site more precisely.

      4) The smaller products seen in Fig. 1E would seem to suggest that MPP exhibits a degree of promiscuity in vitro that is not seen in vivo. This should be noted in the text.

      5) The authors observe that Arg6(1-343) cannot replace Arg6(1-502). They conclude that residues 344-502 are needed for enzyme activity, but this could be for many reasons. For example, Arg6(1-343) might not associate with Arg5. It is premature to imply that catalytic activity is impaired without making such measurements. The conclusion should be adjusted.

      6) It is worth testing whether Arg5(344-862) produced by in vitro translation can be processed by purified MPP. This would help distinguish between some intrinsic problem with access versus a more nuanced issue relating to how import is mediated by the iMTS-L versus a bona fide MTS (e.g., with only the latter recruiting MPP as speculated by the authors).

    3. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      There were some technical concerns regarding the confidence with which the authors draw conclusions about whether the MPP is indeed the responsible protease. It is likely that the authors will be able to address these concerns with relatively straightforward additional experiments.

      We feel that the notion of a polyprotein being processed into multiple functional products by cellular proteases is very well established. Providing an additional example relevant for some species, but not others, is a modest advance in our opinion, as emerged during discussion among the referees and editors. This problem is further compounded by a very similar concept for another mitochondrial protein reported by a subset of these authors recently.

    1. Reviewer #3:

      This is an interesting paper that looks for neural markers of "team flow" experiences compared to individual flow or social interaction using EEG measured during a musical social app game. The approach and analyses are sophisticated, with the main findings being that in a combined beta-low gamma frequency range there was higher power in regions of left temporal cortex for team flow than the other conditions; that other brain regions responded to individual flow or social interaction; that directed analyses found greater information from these other brain regions to the left temporal cortex; and that the left temporal cortices of players engaged in team flow synchronized.

      However, these findings are difficult to interpret as they depend on the behavioural manipulation of the experiment that is purported to separate team flow, individual flow and social interaction, and I don't think these are clearly separated behaviourally. There were 3 conditions. In SyncA, players each tapped on a screen to control one stream of the music. In ScrA the music was scrambled and in Occl the game was as in SyncA, but the players were separated by a barrier. SyncA is supposed to measure team flow, ScrA individual flow but not team flow, and Occl is supposed to reduce social interaction. However, when one examines the ratings that players gave for team flow, individual flow and social interaction, they do not line up exactly with this theoretical manipulation. Specifically (Fig 1), individual flow ratings are higher in SyncA and Occl than ScrA, so SyncA and Occl don't differ in individual flow. Social interaction ratings are higher in SyncA than ScrA, and ScrA is higher than Occl, so Occl disrupts social interaction, but so does ScrA. And Team flow is disrupted by both ScrA and Occl. In other words, there is no clean mapping of the 3 experimental manipulations to the three ratings scales. Also very problematic is that for the rating questions, the three scales of individual flow, team flow and social interaction were not independent (Figure S2). Individual flow was taken as the average of questions 1-6, the social interaction as questions 7-9 and the team flow as questions 1-9! This makes it hard to interpret the findings because team flow is conceptually taken here as the combination of individual flow and social interaction, making the arguments appear circular.
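
      The non-independence of the three rating scales can be made concrete with a toy simulation. The sketch below uses entirely hypothetical ratings (nine mutually independent items with equal variance, 500 simulated participants) and only the scoring rule described above: questions 1-6 for individual flow, 7-9 for social interaction, and 1-9 for team flow.

```python
import numpy as np

rng = np.random.default_rng(3)
n_participants = 500

# Hypothetical ratings: nine mutually independent questionnaire items per participant.
items = rng.normal(size=(n_participants, 9))

individual_flow = items[:, 0:6].mean(axis=1)     # questions 1-6
social_interaction = items[:, 6:9].mean(axis=1)  # questions 7-9
team_flow = items.mean(axis=1)                   # questions 1-9

r_team_individual = float(np.corrcoef(team_flow, individual_flow)[0, 1])
r_team_social = float(np.corrcoef(team_flow, social_interaction)[0, 1])
r_individual_social = float(np.corrcoef(individual_flow, social_interaction)[0, 1])
```

      Even though the items share no underlying structure in this simulation, the composite score is expected to correlate at about sqrt(6/9) ≈ 0.82 with the individual flow score and about sqrt(3/9) ≈ 0.58 with the social interaction score purely by construction, while the two component scores remain uncorrelated.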

      The "depth of flow state" is a potentially interesting measure, consisting of the mean auditory evoked response (although I note that it is not clear how it was calculated: if it is the average of P1, N1, P2 and N2 or the power in theta) to unexpected task irrelevant beeps. Essentially it measures how distractible the person is from the task. So theoretically, it is not clear exactly how this relates to the complex concept of team flow. People were found to be most distractible for ScrA, not surprisingly, as the scrambled game is probably less fun and engaging, but across subjects, only SyncA was correlated with the individual flow index. Why? I also assume there was no correlation with team flow. Why not? So this is an interesting measure, but conceptually I'm not sure what it tells us about team flow.

      For the analysis of beta-gamma, power at the electrode level at left temporal regions was higher for SyncA - but it was also higher for ScrA than for Occl (Fig 3), so what does that mean? From Fig 1e, team flow ratings were actually lower for ScrA than Occl (although maybe not significantly, but this is in the opposite direction). Also, this difference became exaggerated with high gamma, so why was this not analyzed? And how is this interpreted within the team flow concept?

      For the cluster analysis, some clusters were found with higher beta-gamma power for SyncA, other clusters for ScrA and yet other clusters where power was lower for Occl. However, given that, as I describe above, it is not clear exactly how these conditions relate to the concepts of individual flow, team flow and social interaction, I don't think the authors can say, as they do, that clusters where power is highest for SyncA represent team flow. Clusters where power is lowest for Occl were said to represent social interaction, but this cannot be concluded because Occl also had high ratings for individual flow (Fig 1), so these clusters could reflect high individual flow, low social interaction, or both. Clusters where power is highest for ScrA are interpreted as "flow suppression", but it is not clear why, or whether this refers to individual flow or team flow, as both are suppressed behaviourally (Fig 1).

      The directed connectivity analyses are interesting, but again difficult to interpret in terms of the individual flow, group flow, social interaction model. The regions need to be named more descriptively than GP1, etc. At the very least a table in the main text saying what these regions are would be helpful.

      For the analyses of inter-brain effects, why did the authors go to a new measure, information, rather than using a directed measure as in the previous analysis?

      I am also concerned about the very large number of statistical tests done here - probably experiment-wise error rate control is necessary. The more significant tests will survive this in any case.
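
      One common way to control the experiment-wise (family-wise) error rate is Holm's step-down procedure. The sketch below is only an illustration of that idea: the p-values are hypothetical, and the choice of Holm's method (rather than, say, a permutation-based correction) is an assumption, not something the manuscript or this review prescribes.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null is rejected under Holm's method."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail as well
    return reject


# Hypothetical p-values from five tests (illustrative only).
p_values = [0.0004, 0.030, 0.012, 0.045, 0.20]
decisions = holm_bonferroni(p_values)
print(decisions)
```

      In this toy example four of the five tests fall below 0.05 uncorrected, but only the two smallest p-values survive the correction, which matches the point that the more significant tests will survive in any case.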

      I am also questioning the very detailed brain regions used in the source analysis. It would be difficult I think for EEG to be able to independently separate signals coming from nearby regions so precisely.

      It also seems problematic that many participants were eliminated because they did not prefer to play the game in an interpersonal way over a solo or occlusion setup. Thus it seems that a very selected type of participant was used and I'm not sure if this can generalize. Also, some of the participants were friends, and this may have also influenced how they responded. At least some discussion of these issues is necessary.

    2. Reviewer #2:

      In the present manuscript, the authors introduce a novel task to measure 'team flow'. They test if alignment of brain activity is indicative of a shared experience, similar to mutual understanding (see e.g. work by Stolk et al. TiCS). They utilize a hyperscanning procedure where EEG recordings were obtained for two participants, while they were engaged in a task that requires cooperation.

      While the approach is interesting and the topic timely, all the results rest on a methodological assumption which has not been accounted for.

      Both participants are presented with the same visual and auditory stimuli, which, when presented simultaneously, elicit the very same evoked response. When applying spectral analysis techniques to these simultaneously evoked responses, one can easily observe 'synchronization', which however, is completely driven by the simultaneous presentation of the external stimuli. This problem is aggravated when rhythmic visual stimuli are presented.
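
      This concern can be illustrated with a toy simulation. The sketch below makes strong simplifying assumptions: each "participant" is modelled as independent Gaussian noise plus a response to a common 2 Hz rhythmic stimulus sampled at 100 Hz, and zero-lag Pearson correlation stands in for the spectral synchrony measures used in the manuscript. All parameter values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pair(n_samples, stimulus_gain, rng):
    """Two non-interacting signals: independent noise plus a shared stimulus response."""
    # 2 Hz rhythmic stimulus sampled at 100 Hz (hypothetical values).
    stimulus = [math.sin(2 * math.pi * 2 * t / 100) for t in range(n_samples)]
    a = [stimulus_gain * s + rng.gauss(0, 1) for s in stimulus]
    b = [stimulus_gain * s + rng.gauss(0, 1) for s in stimulus]
    return a, b


rng = random.Random(0)

# Same stimulus delivered to both participants, no interaction between them.
a, b = simulate_pair(2000, 1.0, rng)
r_shared = pearson(a, b)

# No stimulus-driven response: only independent noise remains.
a0, b0 = simulate_pair(2000, 0.0, rng)
r_none = pearson(a0, b0)

print(round(r_shared, 3), round(r_none, 3))
```

      The two simulated signals never interact, yet they are clearly correlated whenever both receive the same stimulus. A control analysis of this kind (for example, pairing participants from different sessions who received identical stimuli) would help separate stimulus-driven from interaction-driven synchrony.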

      In addition, several statistical comparisons do not explicitly test the interactions, which are implied by the authors (this problem has been discussed in detail here: https://www.nature.com/articles/nn.2886)

      In addition, several queries apply:

      1) The Flow index needs to be defined earlier in the manuscript (at least prior to Figure 1)

      2) a. Per Fig. 2c: The authors state 'As expected, the mean AEP response was significantly higher in the Inter-ScrA condition more than the other two conditions.' - Why was this expected? This statement is not trivial, why should the violation introduce a stronger response?

      b. Furthermore, it is difficult to reconcile it with the next statement 'Thus, this weaker AEP for the task-irrelevant stimulus in the Inter-SyncA and Occl-SyncA conditions provides neural evidence that the brain has reached a distinct selective-attentional state marking the flow experience.'

      • This is a far stretch from the ERP data

      3) Fig. 2d - The authors need to test for differences in interactions and they cannot claim differences when one test is significant and the other is not. See e.g. https://garstats.wordpress.com/2017/03/01/comp2dcorr/

      This again pertains to Figure 4c

      4) Testing different frequency bands independently is again not valid, since power values across bands are strongly correlated; see e.g. work by Donoghue and Voytek (2020) biorxiv or Haller et al. (2018) biorxiv. Fig 3c makes it even more likely that some of the effects are broadband and not band-limited 'oscillations'.

      5) All the differences localize to auditory areas, which makes one very suspicious that we are looking at evoked and therefore synchronized activity, and not alignment of endogenous oscillations; see e.g. a recent commentary (https://doi.org/10.1080/23273798.2020.1758335). The current paradigm would show synchrony (mistaken as team flow) whenever simultaneous spurious 'entrainment' (simultaneous evoked activity) is present in both participants; this confound needs to be accounted for since it affects subsequent metrics of phase synchrony.

      6) Statistics in Fig. 4b, these tests and ROIs are not independent, a data-driven cluster approach could be utilized instead (see Maris and Oostenveld 2007).

      7) Bar plots are deprecated, see Weissgerber et al PLOS Biol 2015.

      8) Analysis for Figure 5a needs a depiction on what is actually analyzed. The hierarchical clustering approach is introduced with clear rationale and explanation.
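
      Query 3 can be illustrated numerically. The sketch below uses Fisher's z-test for two correlations and hypothetical values (r = 0.5 and r = 0.3, each with n = 20). It assumes the two correlations come from independent samples, which would not hold for correlations computed on the same participants; in that case a test for dependent correlations or a bootstrap, as discussed in the linked post, would be needed instead.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def compare_independent_correlations(r1, n1, r2, n2):
    """Two-sided Fisher z-test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p


# Hypothetical correlations (illustrative only, not values from the manuscript).
z, p = compare_independent_correlations(0.5, 20, 0.3, 20)
print(round(z, 2), round(p, 2))
```

      With n = 20, r = 0.5 is individually significant at the 0.05 level and r = 0.3 is not, yet the direct test of their difference is far from significant. The difference between "significant" and "not significant" is therefore not itself evidence of a difference.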

      Overall, this is an interesting approach. It is a methodological challenge to record EEG data from two interacting participants, but given that this is a relatively young field, some methodological prerequisites need to be established first. Critically, the authors need to present convincing evidence that we are not just facing the results of simultaneously evoked auditory and visual evoked responses.

    3. Reviewer #1:

      In this EEG study, the authors aimed to identify neural correlates of the subjective feeling of "team flow", i.e., a particular feeling of ease, task-related attention and control while doing a task together with someone else. This is a clearly interesting question and with a recent surge of hyperscanning research a timely study. The authors seem to have carefully selected pairs of participants who have similarly good performance in the game and similar music taste to be able to induce feelings of flow in their participants. Unfortunately, there seem to be quite serious problems in their statistical analyses which should be corrected first before the work can be assessed.

      1) Participants:

      a. The methods state that there are 15 participants, of which five were paired twice (p.13). In the Statistical analysis section, the authors state that "the unit of analysis" was participation, i.e., n = 20 (p. 17). This means apparently that five participants took part twice but were considered as independent measures in the statistical analyses. However, these are obviously dependent measures (or, repeated measures). The authors should include 20 (independent) participants in their analyses or need to take into account that five of the recorded 20 participants are identical.

      b. The supplementary material explains in detail the selection of participants. Based on the selection criteria, 38 participants were identified (suppl mat p. 3), but it is not explained what happened to the 23 participants who are not part of the current manuscript. (Also, only the supplementary materials state that preferably friends were selected as pairs and that only those were selected (and called "prosocial") who considered doing the task together more pleasurable than doing the task alone. This should be mentioned in the main text, and it seems to bias the subjective evaluation of the conditions presented in Fig 1?)

      2) Statistical analyses:

      Several of the analyses compare the neural data in the three different conditions with one-way ANOVAs. As these are dependent measures from the same participants, this should be analyzed with repeated measures ANOVA. Also, I didn't quite understand the statistics presented on p.8 (on information flow, with two-way ANOVAs with the impressive df of 26 and 494) and on p.9 (F(26,10133) = ... ?), but again the different measures within one subject seem to be considered as independent measures?

      3) At several points of the analyses, it seemed like the analyses were biased. For instance, for the AEP analyses (which I generally considered a nice way to establish an "objective" measure of flow) only those channels were considered which in each resting trial robustly showed an AEP (p.14/15). Does that mean that different channels were considered for each trial and condition? I would suggest selecting the same set of central electrodes and then using these for all AEP analyses. Another case is the clustering analyses, in which the number of clusters was selected such that condition differences were significant. Maybe I misunderstood this point, but I guess the clustering should be done first and, in a second (and independent) step, the condition differences can be assessed.
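
      The difference between the two ANOVA designs raised in point 2) can be sketched as follows. The data are hypothetical (four participants, three conditions) and chosen so that participants differ strongly in overall level while the condition effect is small but consistent. The function computes only F statistics and degrees of freedom, not p-values.

```python
def anova_f_statistics(data):
    """Compare independent-groups and repeated-measures F for a subjects x conditions table."""
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)

    # Independent-groups ANOVA: between-subject variability stays in the error term.
    df_ind = (k - 1, n * k - k)
    f_ind = (ss_cond / df_ind[0]) / ((ss_total - ss_cond) / df_ind[1])

    # Repeated-measures ANOVA: between-subject variability is removed from the error term.
    df_rm = (k - 1, (k - 1) * (n - 1))
    f_rm = (ss_cond / df_rm[0]) / ((ss_total - ss_cond - ss_subj) / df_rm[1])

    return {"independent": (f_ind, df_ind), "repeated": (f_rm, df_rm)}


# Hypothetical measurements: rows are participants, columns are three conditions.
data = [
    [10, 11, 12],
    [20, 21, 23],
    [30, 32, 33],
    [40, 41, 42],
]
result = anova_f_statistics(data)
print(result)
```

      The two analyses give very different F values and error degrees of freedom for the same numbers, which is why the choice of design matters for every condition comparison in the manuscript. The same logic applies to the five participants who took part twice: their two sessions are not independent observations.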

    4. Preprint Review

      This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

      Summary:

      Your manuscript reports on a sophisticated experimental study in human participants. The study looks for neural markers of "team flow" experiences compared to individual flow or social interaction using EEG measured during a musical social app game. While the approach and analyses are sophisticated, all reviewers individually raised a series of substantial concerns with respect to EEG and statistical analysis. The editors and reviewers hence are unable to share the conclusions the authors would like to draw.

    1. Reviewer 2

      This work investigates the role of autophagy in the migration of neuroblasts in the forebrain of adult mice. The authors provide evidence that activity of the autophagic pathway is related to the ratio of the migratory / stationary phase. They also provide evidence that activity of the autophagolysosomal pathway is related to the ATP/ADP levels and that autophagy targets the focal adhesion molecule paxillin. Autophagy is emerging as a central pathway in the regulation of neuronal development and the manuscript adds interesting and new evidence to this concept. Overall I consider this an important and well designed study.

      Major Comments

      1) In order to support the central notion that ATP/ADP levels control the autophagolysosomal turnover of paxillin, which in turn regulates migration, the authors should investigate the localization and expression of paxillin in their experiments using the analysis that was applied in Figure 3.

      2) The data presented in Figures 1-4 are sound and directly related to the central point of the paper, i.e., that ATP/ADP levels control the autophagolysosomal turnover of paxillin. The data presented in Figure 5 are circumstantial: they show only that different pathways that have been linked to migration processes (not specifically migration of neuroblasts in the adult RMS) modulate expression of autophagy-related proteins. These data do not contribute to the key message of the paper and I would suggest removing them entirely.

      3) I am convinced that the paper carries an important and novel message, but the order in which the results are presented seems not ideal to me. I believe that the order: analysis of autophagolysosomal activity in relation to migratory phases, analysis of metabolism with a focus on ATP/AMP ratios, and finally analysis of paxillin as a potential target of autophagy would be more stringent and convincing.

      4) Data presentation: in many instances the authors provide sample images of only one experimental condition e.g. Fig 2 M-O, R, W. While this may provide an impression of how the data was collected I think that it would be more convincing if the authors provided example images of all experimental conditions to illustrate differences. In addition the figure legend for Figure 2 M-O does not clarify which/ whether the sample images are from WT or mutant mice.

      5) The authors write that "...An Atg5 deficiency led to the accumulation of neuroblasts in the RMS close to the SVZ (582.7 ± 72.5 cell/mm2 in WT mice vs. 846.7 ± 72.7 cell/mm2 in cKO mice; p<0.05, n = 4 animals per group), with an accompanying decrease in the density of neuroblasts in the rostral RMS (RMS of the OB) and the OB". The graphs in Figure 2P do not support the statement that the density of neuroblasts in the rostral RMS and the OB is lower in Atg5 KO conditions. Please correct the statement and provide an explanation of how the numbers of neuroblasts can be stable if a higher number is observed in the more caudal portions of the RMS.

      6) The authors use CRISPR/Cas9 to knock out Atg12. If possible, I would like to ask the authors to confirm the loss of Atg12 protein.

      7) The authors use the RFP GFP LC3 reporter, which allows estimation of autophagic flux in vivo. In their analyses of autophagolysosomal activity (Figure 1I) they only estimate the RFP punctae. The assessment of changes in autophagolysosomal activity would be stronger and more convincing if the authors calculated the GFP+RFP/RFP punctae ratio.

    2. Reviewer 1

      In the last few years the role of ATP metabolism at the synapse and in neuron development has become a topic of growing interest for the field of neuroscience. Autophagy is a prominent cellular process that has important roles in axon degeneration, cell death and nervous system disease. Importantly, the role of autophagy in neuron development has been less heavily studied than in disease contexts.

      The authors provide extensive quantitative datasets clearly indicating that genetic or pharmacological inhibition of autophagy results in reduced migration from the SVZ to the olfactory bulb. Indeed, a strength of the study is evidence showing that genetic effects on multiple autophagy components and AMPK (which activates autophagy) affect neuron migration. Another high point is extensive, quantitative time-lapse analysis of neuron migration in acute slice. This ex vivo approach is informative and makes for a compelling case regarding the role of autophagy in neuron migration from the SVZ to the olfactory bulb. While some in vivo data is provided more evidence on this front would further strengthen the study.

      While this is an interesting, high-quality study, the authors do not introduce an important body of prior literature on autophagy in neuron migration (Peng et al, 2012 JBC; Petri et al, 2017 EMBO; Gstrein et al, Nat Neuro 2017; Li et al, 2019 Cereb Cortex). As a result, it is unclear to the reviewer whether this is a significant step forward for the field, or a further valuable study solidifying the role of autophagy in neuron migration. At this point, the reviewer leans towards the latter viewpoint. Below are further details on this issue and several suggestions the reviewer hopes will improve what is a very nice piece of science.

      Major Comments

      1) The Gstrein paper is a very important piece of prior work, but is buried in the Discussion. This needs to be brought up in the Introduction and noted appropriately. The manuscript also does not cite two other important papers showing that changes in autophagy can affect neuron migration in the olfactory bulb in adults in vivo (Petri et al, 2017 EMBO), and that molecular perturbations that affect autophagy impact neuronal migration in the cerebral cortex in vivo (Peng et al, 2012 JBC). Further recent work has shown that altered autophagy accompanies impaired neuronal migration in vivo in the cerebral cortex (Li et al, 2019 Cereb Cortex) and in vitro in a neuronal cell line (Li et al, 2019 Front Endocrinol). Placing the existing study's contribution more carefully and thoroughly within the context of this prior body of work on autophagy in neuron migration at the outset of the paper is critical.

      The attempted selling point of conflicting roles for autophagy in cell migration based on other cell-based studies and non-neuronal tissues is not particularly helpful and distracts from a major issue: There are already multiple studies indicating that autophagy affects neuron migration in vivo, and it is unclear how this work represents a major advance.

      2) The introduction does not comment on the role of ATP in neuron development and function; this has been an area of intense study in recent years. This type of background would be helpful for framing the context of the findings here.

      3) In vivo data indicating that autophagy influences neuron migration from SVZ to olfactory bulb is very important. Perhaps the reviewer is mistaken, but it seems like only Figure 2P shows quantitation of in vivo data. This indicates that loss of Atg5 results in increased cell numbers in the RMS. This is an extremely important point. Hence, more evidence that other pharmacological or genetic manipulations of autophagy change cell density/migration in vivo would be valuable.

      The authors state: "with an accompanying decrease in the density of neuroblasts in the rostral RMS (RMS of the OB) and the OB". This is not supported by necessary statistical analysis, and looks likely to be insignificant. Statistics should be run here and commentary adjusted accordingly.

      4) In Figure 4B and C, quantitation of the ATP biosensor Perceval is shown. The authors claim a 20 fold change during migration. However, the Perceval ratio goes from 1.01 to 0.99 during one migration step and then 1.01 to 0.97 in the second step (Fig 4B). How is this a 20x change? To the contrary, this seems like quite a modest decrease in ATP ratio.

      How does this ratio change in a positive control where ATP production is reduced by impairing glycolysis or mitochondrial function?

      How does the role of ATP production by glycolysis versus mitochondrial stores influence migration?

      Presentation of data as a % change in Figure 4C is not ideal and gives the impression of artificially exaggerated effect sizes. Statistics are also notably absent from Figure 4C which makes a critical, quantitative point about migration and ATP consumption.

      5) The study emphasizes the point that AMPK senses changes in ATP levels and also activates autophagy. Both of these concepts are well known. Thus, pharmacologically blocking a known activator of autophagy like AMPK and showing effects on cell migration further supports the idea that autophagy is required for cell migration. This does not, however, establish that changes in ATP levels affect autophagy and, therefore, migration. Is there a way to directly manipulate ATP levels and then look for impacts on both autophagy and migration? Is there a way to alter AMPK activation by ATP changes?

      6) In Figure 5D, treatments that increase and reduce the speed of migration show the same effects on autophagy as assessed by LC3II levels. How do the authors explain this? Wouldn't one expect opposing outcomes? Is this correct?

      7) Links between paxillin and cell migration are correlative, and not particularly convincing. Does reducing or increasing paxillin function affect migration? Does specifically triggering heightened degradation/turnover of paxillin affect migration?
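
      The arithmetic behind comment 4 is easy to check. The sketch below takes the Perceval ratio values quoted in that comment (1.01 to 0.99, and 1.01 to 0.97) and expresses them as a fold change and a percent change. The function names are illustrative, and how the authors arrived at their own figure is not known from the review.

```python
def fold_change(start, end):
    """Ratio of the starting value to the final value."""
    return start / end


def percent_change(start, end):
    """Signed change relative to the starting value, in percent."""
    return (end - start) / start * 100.0


# Perceval ratio values as quoted in the review for the two migration steps.
step1 = percent_change(1.01, 0.99)
step2 = percent_change(1.01, 0.97)
fold1 = fold_change(1.01, 0.99)
fold2 = fold_change(1.01, 0.97)

print(f"step 1: {step1:.2f}% ({fold1:.3f}-fold)")
print(f"step 2: {step2:.2f}% ({fold2:.3f}-fold)")
```

      Both steps correspond to a decrease of only a few percent in the raw ratio, which is why a 20-fold claim needs to be explained or reported on the raw scale with appropriate statistics.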

    3. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on March 30, 2020, follows.

      Summary

      In this work you uncover the complex interplay between autophagy and energy consumption to regulate the pace and periodicity of the migratory and stationary phases in a prototypic model of migration in adult brain (the SVZ-OB). Both reviewers considered your work important as you provided evidence that 1) activity of the autophagic pathway is related to the ratio of the migratory / stationary phase, 2) activity of the autophagolysosomal pathway is related to the ATP/ADP levels, and 3) autophagy targets paxillin, a focal adhesion protein that is the direct target of LC3II.

      Essential Revisions

      The original reviews are attached below. Most of the major points can be addressed with minimal new experiments, but may require reanalysis of data or samples you already have. Based on these reviews, it is essential to address the following key points:

      1. Deepen the analysis of paxillin localization and expression
      2. Confirm the impact of Atg5 on the density of neuroblasts in the RMS (increased?) and the OB (unchanged?)
      3. Confirm the loss of Atg12 protein
      4. Quantify autophagolysosomal activity (Figure 1I) by analyzing GFP+RFP/RFP punctae ratio

      In addition, while the science is strong, claims regarding the advance over previous studies should be toned down, since there is existing literature showing roles of autophagy in neuronal migration. The paper needs rewriting to accurately place your work in the context of prior research in the field. Specific recommendations include: (i) rewrite the Introduction and Discussion to cite the literature appropriately; (ii) the second referee suggests that your Results could be presented in a different way in order to make better use of your data.

    1. Reviewer #2

      In their manuscript titled "SOX21 modulates SOX2-initiated differentiation of epithelial cells in the extrapulmonary airways", Eenjes et al. describe a new role for Sox21 in the developing and adult airway epithelium. Building on prior work, they observe a unique distribution of Sox21 expression in the developing airways, and through careful and elegant immunostaining and over-expression studies they suggest that Sox21 is downstream of a key airway development transcription factor, Sox2. They suggest opposing roles for these genes based on analysis of Sox2 and Sox21 heterozygous knock-out mice, whereby Sox21 prevents and Sox2 promotes differentiation of immature airway progenitors into basal cells. This raises interesting questions regarding the role of Sox21 in the biology of airway progenitors and the balance of cell types in the airways. In adult mice no obvious differences are detected in Sox21+/- mice in terms of regenerative capacity. In vitro assays of adult mouse basal cell differentiation (ALI culture) failed to identify obvious differences in basal cell number or differentiation into specific lineages in Sox21+/- cells. The authors report an overall similar pattern of SOX21 expression in a human fetal lung organoid platform. Finally, in human ALI cultures the authors identify SOX21 expression as most abundant in para-basal cells, suggesting a similar possible role for this gene in differentiating basal cells in humans. Overall, the molecular mechanisms that control airway stem cell maintenance and differentiation are of great interest to the field.

      Major concerns:

      The development data are clear based on the studies presented. The immunostaining and figures are elegant. It does not appear that Sox21 is necessary for airway development. A key question raised by this work is how important, and what precisely, is the role of Sox21? For example, overexpression of Sox21 in the developing and adult airways might have been more instructive in answering these key questions than overexpression under the control of the SPC promoter.

      The adult mouse data and human data seem to overall suggest that SOX2 and SOX21 are upregulated in differentiating cells in vitro and are expressed at lower levels in basal cells, but no real mechanism or phenotype related to SOX21 function is observed in the SOX21 het mouse cells.

      The attempt to assess human fetal airways is admirable. However, sections of fetal lungs would be more relevant. Can the authors address how similar fetal lung cultures are to in vivo airways, and at what developmental time point?

      Overall the relevance of SOX21 expression in the adult airways is unclear.

    2. Reviewer #1

      This is an interesting but also a bit confusing manuscript by Evelien Eenjes and colleagues.

      I was wondering whether the seemingly conflicting effect of Sox21 on basal cell differentiation could be due to an increase or decrease in Sox9 positive basal cells. These atypical basal cells have been described in a number of manuscripts.

      For example - https://www.ncbi.nlm.nih.gov/pubmed/26869074 - it appears from the figures that Sox21 may increase Sox9 expression. Can the authors look at the presence of Sox9 positive basal cells in the trachea of the different mutants?

      It would be important to quantify the % of club cells in the corn oil and naphthalene treated tracheas of the different mutants as these are the main cells that should be replaced upon naphthalene injury. I am not sure that people normally look at the number of ciliated cells reappearing after naphthalene injury as ciliated cells are not thought to be killed by naphthalene.

      The authors should also look at Sox9 positive basal cells in the tracheas of the corn oil and naphthalene treated ctrl and mutant mice.

      Are the intermediate para-basal cells in the human airway differentiation experiment club cells? Why do the authors think these cells are on their way to become ciliated cells? Doesn't Sox21 inhibit ciliated cell differentiation?

    3. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on June 1, 2020, follows.

      Summary

      This is a comprehensive analysis of Sox21 in lung development, homeostasis, and injury, using a wide array of methods including mouse and human cell-based ALI cultures and organoids. The authors describe a new role for Sox21 downstream of the key airway development transcription factor Sox2. They suggest opposing roles for these genes based on analysis of Sox2 and Sox21 heterozygous knock-out mice, whereby Sox21 prevents and Sox2 promotes differentiation of immature airway progenitors into basal cells. This raises interesting questions regarding the role of Sox21 in the biology of airway progenitors and the balance of cell types in the airways. In adult mice no obvious differences are detected in Sox21+/- mice in terms of regenerative capacity. In vitro assays of adult mouse basal cell differentiation (ALI culture) failed to identify obvious differences in basal cell number or differentiation into specific lineages in Sox21+/- cells. In human ALI cultures the authors identify a similar possible role for Sox21 in differentiating basal cells in humans. Overall, the molecular mechanisms that control airway stem cell maintenance and differentiation are of great interest to the field. However, in their current form the studies remain primarily descriptive and several data inconsistencies remain, which makes it difficult to draw impactful conclusions at the current stage. The potential suitability for eLife would significantly increase with additional mechanistic experiments and data.

      Essential Revisions

      In particular, the following major points require additional experimental data (details are outlined below):

      1) The functional Sox21 role in the adult lung is not clear.

      Mouse studies would benefit from inclusion of overexpression of Sox21 in the airways

      In human airways and the models used (organoids and ALI cultures), gain and loss of function studies are not performed; these would also be helpful to further determine the phenotype of the intermediate para-basal cells in the human airway.

      2) Evidence for genetic interaction between Sox21 and Sox2 in the adult lung needs to be expanded.

      3) Presence of Sox9 positive basal cells as additional atypical basal cells needs to be further investigated.

      4) Quantification of the % of club cells in the corn oil and naphthalene treated tracheas of the different mutants is needed, as these are the main cells that should be replaced upon naphthalene injury.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 24, 2020, follows.

      Summary

      The current and widely accepted model of how nuclei are positioned in the cell proposes that their anchorage is mediated through nuclear membrane-spanning SUN and KASH domain-containing proteins, which connect the cytoskeleton to the nuclear interior. This study presents evidence that this model needs reconsidering. The authors follow up on their previous observation that nuclear positioning differs between mutants in SUN (UNC-84) and KASH proteins (ANC-1), where anc-1 mutants display a more severe phenotype. The study nicely combines structure-function analysis of ANC-1 with the readout of nuclei positioned in the syncytial cell hyp7 in fixed samples and by live imaging, and the mutant analysis correlates well with the fitness of the whole organism. The authors also show that the expected regions (the KASH domain or the actin binding domain) play rather minor roles in nuclear anchorage. Instead, anchorage is mapped to a portion of the transmembrane domain and to spectrin-like repeats in the cytoplasmic localised region of the protein. Mutating those protein domains not only leads to irregularly positioned nuclei but also to disorganisation and fragmentation of the ER, which also appears unanchored as evidenced by in vivo imaging. Consistently, the authors find ANC-1 prominently associated with the ER and that this association is independent of the KASH domain.

      This paper will be very interesting to a wide readership as it challenges current models of nuclear positioning and shows that mis-positioning and loss of connectivity can have severe consequences for the development of the organism. However, the reviewers felt that some of the conclusions required additional data to support the statements made, in particular to strengthen the proposed ER link, as well as some other points that require clarification as detailed below.

      Essential Revisions

      1) The data shown in Figure 6 suggesting ANC-1 targets to the ER were not convincing and were also very hard to contextualise, as no markers for other organelles or the nucleus were present, and no merged images were provided. Given that the authors also repeatedly draw conclusions relating to mitochondria, including a mitotracker in the far-red channel would be important. The ANC-1 signal appeared rather diffusely localised in all cases (WT and TK/KASH mutants; noting that the KDEL signal is very low in the TK mutant image compared to the others) and it could be argued that any potential overlap might be coincidental rather than specific. No other evidence for ER targeting was provided, but it would be required to support the current conclusions. The authors should also at minimum calculate the co-localization index and provide examples of line scans to show a correlation between signal intensities. Also, the sentence "The slight discrepancy between localization of mKate2::ANC-1b and GFP::KDEL could be explained by the distance between the ANC-1 N terminus and the ER membrane" will need to be modified. This explanation is highly unlikely given the resolution of the microscope used to acquire this image. The discrepancy might be due to differences in the focal plane of the two light channels. Please remove or edit this speculation.

      2) A further concern is the lack of any cytoskeletal network analysis in the paper; the authors' (somewhat extrapolated) conclusions are focused on the concept that nuclei and the ER are detached from any sort of cytoplasmic structural network in anc-1 mutants, but this is not tested in the study. It would be very helpful to provide this evidence, but we recognise that analysing the cytoskeleton in detail may prove impractical given the current restrictions. However, if experimental data cannot be provided, it would be important to tone down such statements given the lack of formal evidence provided for this model.

      3) ANC1b-del6RPS and del5RPs appear to be expressed at much lower levels than other mutants (or WT protein), e.g. in Fig. 5. Is this the case and, if so, might the nuclear positioning defect just be due to this (i.e. because the animals are more like anc-1 null)? It would be important to understand the relative stability of these mutant proteins to interpret these data. Please provide more information on the relative intensity levels of each mutant and, if they are similar, provide more representative example images.

      4) The authors put forward the claim that ANC-1 anchors other organelles besides the ER. There are a lot of markers available for different organelles, and it would be important for the authors to provide some staining in the most significant mutants (delta 6rps, delta tk), for mitochondria or other organelles. On page 11 the authors also write "Nuclear shape changes were observed during live imaging in anc-1 mutants consistent with a model where anc-1 mutant nuclei are susceptible to pressures from the cytoplasm, perhaps crashing into lipid droplets that corresponded with dents in nuclei.". It is not clear why the authors suggest "crashing into lipid droplets" and not any other organelles in the cell (mitochondria, endosomes, etc). Is there any evidence for this statement that can be provided?

      5) p. 14: "...was not enriched at the nuclear envelope...". It would be very important to include pictures of the ANC-1 staining together with a marker for the nuclear envelope in Figure 5 to support this statement.

      6) Page 6, Figure 2A,B. It is not clear in the text that two of the four nonsense mutations that were analysed only disrupt isoforms a and c (w427 and w621). If these mutations only affect isoforms a and c, and not the shorter isoform b, then the authors could explain this better, so that this result goes together with the RNAi data.

      7) Figure 1C and Figure 3C - the dataset of WT and anc-1(e1873) seems to be the same in both these figures. Although duplication of the same data in multiple figures should not typically occur in publications, the duplication should be indicated clearly in the figure legend for transparency. Ideally, the duplicated data should also be shown in a different color/format in the figure so that they are immediately obvious. Other duplications of data in the manuscript should also be indicated. Why was unc-84(n369) data repeated three times on the graph in Fig 3C but with different values?

      8) The percentages of touching nuclei in unc-84(n369), KASH mutants (Fig 1C), and anc-1(dF1) (Fig 3C) are around 20%. However, the authors classify the KASH mutants as "mild nuclear anchorage", whereas anc-1(dF1) in Fig 3C was classified as "severe nuclear anchorage". The authors also used the term "intermediate nuclear anchorage" for other phenotypes. The authors should use the same classification (clearly described) throughout the manuscript.

      9) Figure 4A - the authors described the image of GFP::KDEL of hyp7 syncytia as a "branched network". It is very difficult to observe any branches or a network in these images. We recognise the authors want to use the same terminology that is used to characterize ER in other cells, but it is difficult to understand what the authors mean by a branch. Diffuse staining with dark round objects in it is visible, but branches are not clear. The authors should provide a better example, more similar to the unc-84(n369) mutant, where branches and a network are more distinct, and annotate these properly to support this statement.

      10) Figure 4C and D; movies 1 and 2 - In these figures the authors compare the dynamics of ER in WT and the anc-1 mutant. The authors also provide representative movies. The WT animal appears to be crawling significantly less than the anc-1 mutant, making it difficult to directly compare the anchorage of ER in WT during crawling in the representative movies. Moreover, Supp Movie 1 appears to be far shorter (fewer frames) than Supp Movie 2; ideally, equivalent time scales should be provided for WT animals to be comparable with the mutant.

      11) Figure 6A - in this scheme and throughout the manuscript, the authors refer to the C-terminal region of ANC-1, after the TM domain, as the KASH domain. However, by definition, the KASH domain includes part of the neck region, the TM domain and the C-terminal region after the TM domain (see e.g. Fig4 of Wilhelmsen et al. JCS 2006, or PFAM definition of KASH - 10541). Therefore, the neck region of the KASH domain plays a role in nuclear positioning. Furthermore, the labelling of Figure 6A should be changed according to the standard definition.
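
      As an illustration of the quantification requested in point 1 (a co-localization index and line scans), the following is a minimal sketch and not the authors' analysis pipeline. It assumes the two channels are available as equally sized 2D intensity images stored as lists of rows. It uses Pearson's correlation coefficient as one possible co-localization index and nearest-pixel sampling for the line scan. The function names `pearson_colocalization` and `line_scan` and all pixel values are hypothetical.

```python
import math


def pearson_colocalization(ch1, ch2):
    """Pearson correlation between two same-sized intensity images (lists of rows)."""
    a = [float(v) for row in ch1 for v in row]
    b = [float(v) for row in ch2 for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("channels must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    # A flat (constant) channel has no variance, so the index is undefined.
    return cov / denom if denom > 0 else float("nan")


def line_scan(image, start, end, n_points=50):
    """Intensity profile along a straight line from start to end, given as (row, col)."""
    (r0, c0), (r1, c1) = start, end
    profile = []
    for i in range(n_points):
        t = i / (n_points - 1) if n_points > 1 else 0.0
        # Nearest-pixel sampling, clamped to the image bounds.
        r = min(max(int(round(r0 + t * (r1 - r0))), 0), len(image) - 1)
        c = min(max(int(round(c0 + t * (c1 - c0))), 0), len(image[0]) - 1)
        profile.append(float(image[r][c]))
    return profile


# Hypothetical toy images standing in for the two fluorescence channels.
red = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
green = [[2 * v + 1 for v in row] for row in red]
index = pearson_colocalization(red, green)
red_profile = line_scan(red, (0, 0), (2, 2), n_points=3)
green_profile = line_scan(green, (0, 0), (2, 2), n_points=3)
```

      Plotting `red_profile` and `green_profile` on the same axes shows whether intensity peaks coincide along the chosen line. In practice the index would be computed within a region of interest and compared against a control, such as one channel shifted or rotated, to judge whether the overlap is more than coincidental.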

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 29 2020, follows.

      Summary

      The manuscript by Silva-Garcia and colleagues addresses an interesting aspect of BRCA2 biology: the regulation of its steady state levels in response to DNA damage. It has become clear that BRCA2 levels are subject to change in response to a number of different environmental challenges, including the induction of various types of DNA lesions. This manuscript identifies the serine protease DPP9 as cleaving off the first two N-terminal amino acids of BRCA2 to target BRCA2 towards the N-degron pathway. The DPP9-mediated turnover of BRCA2 regulates the BRCA2-RAD51 stoichiometry and appears to promote RAD51 focus formation.

      Overall Evaluation

      The biochemical data are generally solid and support the conclusions, but the interaction has not been tested with the endogenous proteins and the affinity is low (~17 uM). The cell-based assays reveal a potentially significant problem in the BRCA2 construct. Overall, the physiological relevance is far from certain. To be conclusive, the PLA would need single primary antibody-only controls to ensure specificity. More importantly, although it seems possible that DPP9 has an effect on RAD51 foci, it is not clear from the results whether this is directly connected to BRCA2 or is an indirect effect. In particular, the results with the truncated form and full-length BRCA2 are done in overexpression; however, the levels of the two proteins are different and it is known that BRCA2 overexpression is toxic for the cells. Stable expression or equal transient expression would have been more convincing to draw these conclusions or at least rescuing the phenotype of full-length with the truncated form and/or with WT DPP9 would have been more adequate. The overall model needs to be significantly toned down in light of available data from cBioportal and DepMap that are inconsistent with the overall model that DPP9 cleavage is essential for BRCA2 function.

      Essential Revisions

      1) As the authors recognize, there is a conceptual problem with the model that suggests that BRCA2 cleavage and degradation is required for RAD51 focus formation, when it is BRCA2 function that is required for RAD51 focus formation, as shown by genetic studies. How can the degradation of an essential factor for RAD51 focus formation be required for RAD51 focus formation? If, as the authors suggest, the BRCA2 : RAD51 ratio is critical, RAD51 overexpression should be detrimental. The authors may want to consult the literature on RAD51 overexpression and its effects and discuss this.

      2) Figure 1: Based on the data presented in Figure 1 panel A, it is concluded that DPP9-BRCA2 PLA signal is detected in the nucleus. However, that is hard to assess, as the visualized dots localizing in the DAPI stained areas could easily be on top of the nucleus. In addition, is there any evidence that DPP9 is actually a nuclear protein?

      The PLA experiments seem to lack the single antibody negative controls that are required to account for nonspecific signal.

      Fig. 1D: This figure reports the results from the SPR as (partial) binding isotherms based on equilibrium analysis.

      Include the SPR sensorgrams (for example in supplemental figure 1) to show that equilibrium analysis can be performed on this dataset, that is, to show that:

      • The observed response is due to specific binding of peptide to enzyme (the methods section does not explain whether reference surface subtraction was performed or how much specific binding occurs).
      • Binding actually reaches equilibrium after the 4.5-minute association phase (rather than continuing to rise).
      • Dissociation occurs during the 7-minute dissociation phase, at least for the majority of the signal, to make sure the reaction is reversible (expected for the micromolar range).

      Would it be possible to add inhibitor to the association phase to verify specificity of observed interaction?

      Saturation is far from complete. The uncertainty of the fits themselves (as reported in the Graphpad analysis) is probably much larger than the reported variations between triplicate KD determinations, especially for the shorter peptide where not even 30% of the enzyme seems to become occupied at the highest peptide concentration. The uncertainty from extrapolating the maximum observed 50% and 25% binding for the different peptides towards the 100% response plateau will have a major influence on the apparent KD. The uncertainty of the fits has to be considered in deciding whether the shorter peptide binds significantly more weakly than the intact peptide.

      The enzyme on the chip surface will cleave the peptide and will retain the dipeptide but release the rest of the peptide (according to the observed electron density in the crystals). The intact peptide (40 amino acids) will result in an approximately 20-fold higher response in the SPR assay than the dipeptide (2 amino acids). Depending on the kinetics of peptide binding, the chemical conversion and product release, the observed signal in the sensorgrams will be mostly substrate, mostly product or a mixture of these, the composition of which changes during the experiment. Even if a steady-state binding level is observed in the raw data, the obtained parameters probably do not reflect equilibrium binding constants. An active site mutant may be helpful here (and might return a much stronger affinity for the intact peptide). If that is not possible, a more qualitative reporting of the data could be considered rather than trying to obtain affinities.

      Fig. 1 S2: DPP9-BRCA2 PLA signal seems to increase with MMC in siBRCA2 treated cells. This potentially indicates that the PLA signal does not report on the DPP9-BRCA2 interaction.

      Why does removing FLNA reduce the BRCA2-DPP9 interaction? Is the BRCA2 pool bound to FLNA targeted by DPP9?

      Fig. 1B, G. It is unclear what is shown on the Y-axis, total fluorescence, number of foci?

      Fig. 1C, D: The affinity found for the interaction between BRCA2 and DPP9 is ~17 uM which seems too weak for a physiologically relevant interaction. Nonetheless the interaction appears to be real as they find it also with purified fragments. An immunoprecipitation with endogenous proteins with and without MMC treatment would be necessary to complement these findings in cells as the PLA is not sufficient.

      Fig. 1G: Please provide the statistical analysis of the (-) and (+) 1G244 samples under MMC treatment. This is the key control and not the comparison to -MMC.

      3) Figure 2: Fig. 2: If DPP8 and DPP9 crystals with the BRCA2 peptide are so similar, why is there a phenotype of DPP9 mutants? Why is DPP8 then not providing redundancy?

      Fig. 2: While the disorder in the DPP9 crystals is clear from the average B-factors (Table 1), what are the local B-factors for the active sites in both structures? Is the quality of the electron density in the active site of DPP8 actually good enough (~3 Å) to establish that the identity of the two amino acids is methionine and proline? Can it be ruled out that coincidentally one might be observing 2 residues from an only partially ordered longer peptide? What other information from previously determined enzyme-inhibitor complexes (Ross et al 2018) and active site geometry was used in concluding that the density corresponds to the N-terminal dipeptide?

      4) Figure 3: Fig. 3 A-D: The DNA damage sensitivity assays lack a control with siBRCA2 cells to show the sensitivity compared to knock-down of DPP9, without that it is difficult to interpret the results. This is especially relevant if the point is to show that DPP9 is required for the function of BRCA2 in DNA repair. The observed sensitivities to DNA damaging agents are very mild (plot axes are 1 log). Given that for example sensitivity of BRCA2 mutants to PARP inhibition is extreme, can the effect of DPP9 on survival be indirect?

      Fig. 3 E. The interpretation and description of these results do not appropriately reflect the data. HeLa DPP9 KD cells start with a higher constitutive level of gammaH2AX but the overall kinetics of the increase and decrease appear very similar. The statistical analysis, hence, is not appropriate to test a repair defect. What do normalized curves look like with 0 hr set to 100% and what is the statistical analysis of normalized curves?

      Fig. 3F. Same problem as in Figure 3E, although the increase in gammaH2AX signal is more dramatic. However, in the present illustration, it remains unclear, whether there is a defect in repair as measured by decrease of the gamma H2AX signal.

      5) Figure 4: Fig. 4A: Please provide statistical analysis of siDPP9 +/- MMC and siNT versus siDPP9 + MMC. The analysis provided (-MMC siNT versus +MMC siDPP9) is not helpful.

      Fig. 4 C and Fig. 4S1: When comparing the signal labeled BRCA2 in, for example Figure 4 panel C, with that in Figure 4 - figure supplement 1, it is difficult to understand that the signal represents the same protein, as in one blot the signal is a collection of bands, while in the other it appears to be a defined band.

      Fig. 4C: Is vinculin a proper control for quantitation of the levels of BRCA2, given that its signal appears out of linear range?

      Fig. 4I, J, line 373: The levels of BRCA2 1-1000 are different compared to BRCA2 3-1000. As this is a transient transfection the difference in the levels might be due to the quality of DNA of one plasmid versus the other. It is not clear that it can be concluded that BRCA2deltaMP is less stable than the unmodified N terminal BRCA variant. This is because the expression levels of the two variants are so different and both seem to decrease (relative to their starting signal) to the same extent. See overall evaluation.

      Fig. 4S2: What is the rationale for using a different control protein to measure levels of RAD51 and BRCA2?

      Fig. 4S2: The scheme in G cannot relate to E and F, as RAD51 is analyzed there. Does it maybe relate to Figure 4 I and J? If so, this should be indicated in the figure legend to Figure 4I/J and corrected. I suggest moving Figure 4S2G to Figure 4 as part K. What is the evidence that the ubiquitin is cleaved and how efficient is cleavage?

      6) Figure 5: It was very surprising that the WT BRCA2 1-3418 plasmid was not functional at all, with the level of RAD51 foci being equal to the no plasmid negative control. Is it because of a construct problem? One would expect that the WT construct can still rescue RAD51 focus formation but to a lower level than the mutant BRCA2 3-3418, as in Fig. 5C there are some RAD51 foci in siDPP9 cells under 300 nM MMC treatment as well as in Fig. 5-figure supplement 1F. If WT BRCA2 1-3418 is not functional at all without DPP9 catalysis (as suggested by Fig. 5F and 5G), how do the authors explain that N-terminally tagged BRCA2 variants are still functional, for example GFP-BRCA2 and 2xMBP-BRCA2, which could not be processed by DPP9 as the dipeptide Met-Pro is not at the desired positions? This is a potential red flag and questions the validity of the central conclusion. The complementation activity of the constructs must be tested in a BRCA2-deficient background with proficient DPP9. The experiment in Figure 5F lacks a positive control to evaluate the level of complementation by the 3-3418 BRCA2 construct. This should be easy by omitting BRCA2 siRNA and using a scrambled control and no siRNA.

      Fig. 5A. The difference in RAD51 bound to chromatin is not clear and there is no quantification.

      Fig. 5B, please provide statistical analysis of the comparison of siNT versus siDPP9 + MMC. Also, the description in line 453 does not appropriately reflect the data, as there is a RAD51 focus signal in DPP9 depleted cells.

      Fig. 5B, D, F: it is important to show the real number of foci to determine whether the RAD51 foci per cell correspond to what is known from the literature and to find out the effect of the DPP9 KD alone on the number of RAD51 foci. For Fig. 5F this data is presented in Figure 5 suppl. 1 F.

      Fig. 5C: The detection of RAD51 foci is of poor quality. In addition, why is there a strong cytoplasmic signal under some conditions? The effect of the different treatments on nuclear size is perplexing. The scale bar indicates 10 µm in every panel, yet nuclear size varies by ~2-4 fold.

      Fig. 5 S1E. The levels of BRCA2 WT vs BRCA2deltaMP are different. Even if the amount of plasmid transfected is the same, this does not mean the quality of the DNA is the same, so the differences in the levels could be due to this. The quantitation is missing. Is this result reproducible?

      Fig 5 Suppl. 1F. This panel shows that the number of RAD51 foci is increased in cells complemented with a plasmid expressing 3-3418 compared to the full-length BRCA2. However, there are several possible issues in this experiment. A WB is needed as a control to show that the siBRCA2 worked in the same cells that overexpress BRCA2 and BRCA2deltaMP. The levels of BRCA2deltaMP are reduced compared to FL-BRCA2. The BRCA2 cDNA is big and its transfection is rather toxic for the cells, so a reduced transfection efficiency could lead to higher survival and possibility of repair, leading to increased RAD51 foci. What are the levels of gH2AX foci in both cells? If there is a difference in the amount of damage this could also lead to differences in RAD51 foci. The number of RAD51 foci in cells depleted of BRCA2 seems rather high compared to the ones reported in the literature. It remains unclear if the small difference in RAD51 foci between the two BRCA variants could be related to differences in their expression or heterogeneity in expression in the cell population.

      To assess the relationship of DPP9 and BRCA2 in DNA repair the phenotype of BRCA2 1-3418 should be rescued by WT DPP9.

      There is no convincing evidence to suggest that DPP9 regulates RAD51 filament formation by processing the BRCA2 N-terminus as stated in line 474. The authors examine RAD51 foci not filaments.

      7) In Figure 6 S1, the authors show a correlation between low DPP9 expression in breast cancer and patient survival. These data support the significance of DPP9 in breast cancer. However, a quick database analysis in cBioportal reveals that DPP9 and BRCA2 deficiency co-occur, which is not expected from the model presented here. Moreover, cBioportal also shows that DPP9 is often deleted in ovarian cancer but amplified in breast cancer. Again, these data are not consistent with the simple model presented here. Finally, analysis in DepMap shows that DPP9 is not essential whereas BRCA2 is an essential gene. These data do not support the model that DPP9 is essential for BRCA2 function. The authors should consider these available resources and refine their interpretation.

      8) There is a concern about the use of fragments. This appears acceptable for the structural analysis but in Figure 4I and J it is problematic for the stability experiments. In Fig. 5S1E, the full-length BRCA2 behaves consistently, but the analysis is very limited and not quantified. Is this finding reproducible? The reason this could be a bigger issue is the concern about the BRCA2 construct and the absence of complementation activity discussed above.
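
      The concern raised under Fig. 1D, that an apparent KD is poorly constrained when saturation is far from complete, can be illustrated numerically. The following is a minimal sketch under stated assumptions, not a reanalysis of the manuscript data. It assumes a simple 1:1 equilibrium binding isotherm, R = Rmax * C / (KD + C), and all concentrations, responses and KD values are hypothetical. The function names are illustrative only.

```python
def response(conc, kd, rmax):
    """Equilibrium response of a 1:1 binding isotherm: R = Rmax * C / (KD + C)."""
    return rmax * conc / (kd + conc)


def rmax_matching_top(kd, top_conc, top_response):
    """Rmax that makes the isotherm pass through the highest measured point."""
    return top_response * (kd + top_conc) / top_conc


def max_curve_difference(kd_a, kd_b, top_conc, top_response, n=101):
    """Largest gap between two isotherms pinned to the same top data point."""
    rmax_a = rmax_matching_top(kd_a, top_conc, top_response)
    rmax_b = rmax_matching_top(kd_b, top_conc, top_response)
    concs = [top_conc * i / (n - 1) for i in range(n)]
    return max(
        abs(response(c, kd_a, rmax_a) - response(c, kd_b, rmax_b)) for c in concs
    )


# Hypothetical scenario: the highest concentration tested (10 units) gives a
# response of 25 units. Two candidate KD values that differ two-fold are
# compared over the measured concentration range.
gap = max_curve_difference(kd_a=30.0, kd_b=60.0, top_conc=10.0, top_response=25.0)
```

      In this hypothetical case the two curves differ by less than one response unit anywhere in the measured range, even though the KD values differ two-fold and the implied plateaus are very different. This is why the uncertainty of the fit itself, and not only the spread between replicate fits, matters when the plateau is extrapolated.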

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 24 2020, follows.

      Summary

      In this paper, Brunet and coauthors show that various species of choanoflagellates have the capacity to switch from the typical flagellated stage to an amoeboid, non-flagellated stage. They show these amoeboid cells move using blebbing migration. The paper makes four major points:

      1) Several species of choanoflagellates make dynamic protrusions that appear similar to blebs in DIC when confined. This claim is well supported by nice quantitative analysis of blebbing using a diversity of choanoflagellate species.

      2) The blebs made by choanoflagellates, like those of animal cells and Dictyostelium, involve breakage and healing of the actomyosin cortex. This is the weakest part of the paper (see below further details).

      3) Choanoflagellate cells can use this blebbing motility to escape the confinement, a concept supported by movies of cells doing this.

      4) Amoeboid motility is homologous in choanoflagellates and animals. In particular, the authors postulate that epithelial and crawling cells in animals differentiated by exploiting the switch from the flagellated to amoeboid stages that existed in unicellular opisthokonts. Although this is an interesting hypothesis, we believe the data is not conclusive and both the implications and the conclusions need to be better explained in the general context of eukaryotic evolution.

      Below we define what we believe should be done to improve the manuscript.

      Essential Revisions

      1) The idea that amoeboid morphology and blebbing motility is older than animals is not particularly controversial: blebbing has been documented in various microbial lineages for some time, and blebbing motility uses a contractile actin cortex, which is also widely distributed. This, combined with a lack of engagement with alternative hypotheses, weakens the conceptual significance of the paper. Fleshing out what the alternative hypotheses are, and/or providing context for how this work provides new insight into the evolution of blebbing, would improve the paper. Furthermore, the introduction should include more background information to distinguish blebbing motility from actin pseudopod based motility. Also, the use of the term "actin-filled" to discuss bleb retraction is confusing; do the authors mean actin-encased? A clear definition of what a "retracted bleb" is should also be provided. In general, there is some information in the results that could be better incorporated in the introduction, as it explains the authors' motivation for the work.

      2) Similarly, the idea of homology between the amoeboid cells of choanoflagellates and animal amoeboid cells seems not well supported, or at least no more than a potential homology of many eukaryotic amoeboid cells. This needs to be toned down and/or discussed in the context of a potential ancestral eukaryotic feature. The same applies to the related idea that epithelial and crawling cells in animals differentiated by exploiting the switch from the flagellated to amoeboid stages that existed in unicellular opisthokonts.

      Finally, the authors say that the switch between flagellated and crawling cells in choanoflagellates is triggered by particular size-related stress. However, it is difficult to imagine that animals evolved under such a type of stress. It may be interesting to discuss whether the authors have tried to see if alternative sources of stress induce transition to amoeboid states or, alternatively, discuss particular hypotheses about which kinds of stress might trigger this response. When discussing this, it may be worth considering an alternative: that choanoflagellates might be a side phylogenetic group that evolved specific characteristics and virtually lost amoeboid stages except for extreme situations like the ones shown here. The ancestor of metazoans would then have simply retained an ancestral (eukaryotic) capability to transition from flagellated to amoeboid states during the opisthokont life cycle, without this capability being in any way related to volumetric stress but rather to particular environmental cues.

      Overall, all these ideas should be discussed in the context of eukaryotic evolution, and the potential implications should be toned down.

      3) The authors use the standard definition of blebbing: actomyosin cortex breakage and healing concomitant with production of round protrusions devoid of actin. The paper provides insufficient data to support the claim that choanoflagellate cells defined as "amoeboid" based on the lack of microvilli undergo this form of blebbing. Providing additional examples and/or quantitative analyses of the data would strengthen the paper. Specifically:

      a. Only a single cell with lifeact is shown (Fig. 2C, with what looks to be the same cell in Video 7). Additional examples would be welcome. However, this cell has high levels of septin overexpression: could this be interfering with the native phenotype? Fig. 3K looks very different from Fig. 2q. Is septin localizing to the membrane? The WT cells shown in P do not show cortical actin. It seems likely worthwhile repeating this experiment with lifeact but without septin overexpression. Additionally, linescans to quantitate the cortical actin levels before, during, and after cortical breakage would provide quantitative support, particularly with a membrane stain, or at the very least, the septin localization.

      b. The phalloidin staining in Figures 2 and Figure 2-figure supplement 1 is not particularly convincing in terms of the presence of cortical actin. Why do the cells in Figure 2P and 2W look different? The additional examples also show weak actin staining.

      c. Figure 2, Panels S and V show myosin staining which does not appear to be localized to the cortex. More than a single cell should be shown, along with quantitation by line scan analysis to support the claim of cortical localization.

      d. The definition of amoeboid seems problematic as many of the images show "amoeboid" cells with what look to be microvilli: 2C-K, 2U-W, 3A-F

      e. The text describing the phalloidin staining is a bit circular as it assumes blebbing to interpret the staining patterns, but then uses the staining pattern as confirmation of blebbing.

      4) The authors indicate that the actin in amoeboid choanoflagellates undergoes retrograde flow. The authors show a single cell imaged with widefield fluorescence (Fig. 3 F-G, Video 9). Typically retrograde flow is visualized using TIRF microscopy to show the movement of individual actin filaments within a network. The data presented make it difficult to evaluate whether this is the retraction of the entire cortical network, or flow of actin filaments within a network. To support a claim of retrograde flow, additional data and analysis should be provided. Moreover, the concept of retrograde flow in the context of blebbing motility needs to be explained more fully in the text.

      5) The microtubule inhibition experiments involve treating cells for 36 hours with a non-standard microtubule inhibitor. Due to the possibility of off-target effects, the authors should repeat this experiment with a second microtubule inhibitor to cross-validate the result. A second, orthogonal approach would be to stain cells and look for anti-correlation between microtubule density and bleb formation.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 10 2020, follows.

      Summary

      The work establishes that mouse osteocalcin is O-glycosylated at Ser-8, independently of processing and gamma-carboxylation. Human osteocalcin was found not to be O-glycosylated, but mutation to Ser-8 allowed that process to occur. If increased stability of osteocalcin in vitro as a result of O-glycosylation is real, it could be of interest. The paper was found to be well written with figures illustrating the findings. With that said, several key experiments are required that would considerably strengthen the conclusions. Notably, without information on the biological outcomes of O-glycosylation, the paper is seen to be of limited interest.

      Essential Revisions

      1) Regarding the ELISA (ref. Ferron et al 2010b), capture antibodies can distinguish between carboxylated and non-carboxylated OCN. For the present work, it is essential that the authors show that OCN with or without O-glycosylation at Ser-8 is measured identically in this ELISA. The authors should specify what capture antibodies were used. This query applies also to data with human OCN and the Ser-8 mutant that is O-glycosylated. Furthermore, does the commercial ELISA kit measure glycosylated mutant and normal hOCN as identical?

      2) Figs 3H-J show a statistically significant difference in levels of glycosylated and non-glycosylated mouse OCN. These experiments however do not measure "half life" as claimed in the title and abstract. The in vivo half life of injected O-glycosylated vs wt ucOCN should therefore be compared using timed estimations during the declining phase.

      3) A feature of OCN that interested the authors was the remarkable difference in circulating amounts - more than 10 times higher in the mouse. This work appears to be part of a search for mechanisms to explain this, although they might consider that these are evolutionary changes, including the fact that there is only 65% conservation of sequence between mouse and human, and that human OCN is not O-glycosylated, whereas mouse OCN is. The biological significance of this difference in O-glycosylation thus needs to be established. While knocking in a mutation to abolish O-glycosylation would provide the most definitive answer, the reviewers consider this not to be feasible during the pandemic. Therefore, at the very least, a cell-based assay should be used to compare biological activity. One example is the dose-dependent increase in insulin mRNA in mouse pancreatic islets (Ferron et al., PNAS 105: 5266, 2008). Very low dose ucOCN was shown in that work to promote insulin expression in islets in a dose-dependent manner. An alternative approach, arising from the same paper, would be to show, at slightly higher doses, a dose-dependent increase in adiponectin expression in mouse adipocytes. The authors might have other possibilities for a cell-based assay.
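
      The timed estimation requested in point 2 could, for instance, be analysed as sketched below. This is a minimal illustration and not the authors' method. It assumes that the declining phase follows single-exponential (first-order) decay, so that ln(concentration) falls linearly with time and the half-life equals ln(2) divided by the magnitude of the slope. The function name `half_life` and the example values are hypothetical.

```python
import math


def half_life(times, concentrations):
    """Half-life from a least-squares line through ln(concentration) vs time."""
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matched time/concentration points")
    logs = [math.log(c) for c in concentrations]  # requires positive values
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("concentration is not declining")
    return math.log(2) / -slope


# Hypothetical serum levels sampled during the declining phase (hours, ng/ml),
# constructed so that the level halves every 2 hours.
hours = [0.0, 1.0, 2.0, 3.0]
levels = [100.0 * 0.5 ** (t / 2.0) for t in hours]
estimate = half_life(hours, levels)
```

      Applying the same fit to the O-glycosylated and non-glycosylated proteins, with replicate animals per time point, would give two half-life estimates that can be compared directly, rather than inferring stability from a single time point.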

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 20 2020, follows.

      Summary

      The reviewers agree that the paper has been improved and is now easier to read. The findings were judged fascinating but there are still issues. The authors delineate a linear story (one pathway) but some elements could affect the system independently. The reviewers agree on a set of recommendations that should be addressed during the revision of the manuscript.

      Essential Revisions

      1) Resistance to parasitoid wasp.

      The authors provide an extremely important body of work. However, the reviewers have a concern about the physiological significance of the phenotype. It is appropriate to hypothesize that an increase in lamellocyte production will yield a more potent immune response against parasitoids, as seen in other Drosophila species (e.g. D. suzukii). However, the fact that genetic perturbations increase lamellocyte numbers, or perturb the immune system in any manner, does not necessarily mean that the immune response mounted will be successful. The authors should provide experiments monitoring resistance to parasitoid wasps when the pathway they discovered is perturbed. They should monitor the impact of feeding larvae on WOF on resistance and how disturbing Or49A, Gat and Ssadh affects resistance to parasitoid wasps.

      2) RNAi efficiency and use of a single line.

      The reviewers questioned the validity of the study, as some results are based on only one RNAi line and the knockdown efficiencies were tested using a ubiquitous driver rather than in the actual tissues. They recognize, however, that the model is supported by the fact that the authors test different players that affect the pathway. The reviewers nevertheless ask the authors to repeat the experiments with Gat and Ssadh using another RNAi line to reinforce their conclusion.

      3) Sima staining.

      Figure 3: There are discrepancies in the Sima staining which call into question the specificity of this staining relative to background. For example, some LGs showed a punctate expression of Sima in the posterior part of the LG (Fig 3f, g, and h), which is not seen in the other LGs. Pictures in Fig 3b, k and m are not in agreement with the quantifications in 3o. The same comment holds for Fig 3f-i and the quantifications in j. Expression of Sima in lamellocytes is also not convincing. The specificity of the Sima antibody has to be checked. Sup Fig 7B: is the difference in sima mRNA levels significant? The reviewers recommend that the authors address this point, or at least prepare a supplementary figure showing replicates of the pictures used for their graphs.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 9 2020, follows. The preprint has been revised in response to the comments below.

      Summary

      This paper is an important extension of the authors' previous publication in eLife (Hawkins et al. 2017) that presented novel data suggesting that CO2/H+-mediated vasoconstriction in the brainstem retrotrapezoid nucleus (RTN) supports chemoreception by a purinergic-dependent mechanism. Here the investigators provide new data indicating that CO2/H+ dilates arterioles in other chemoreceptor regions (cNTS, raphe obscurus- ROb), thus suggesting that the CO2/H+ vascular reactivity in the RTN is unique compared to some other brain regions. The investigators significantly advance their previous work by applying a number of new experimental approaches to provide evidence that P2Y2 receptors in RTN vascular smooth muscle cells are responsible for the purinergic mechanism mediating the vascular reactivity and specifically contribute to RTN chemosensitivity. Importantly, pharmacological blockade or genetic deletion of P2Y2 from smooth muscle cells blunted the in vivo ventilatory response to CO2, and virally-driven re-expression of P2Y2 receptors in RTN smooth muscle cells rescued the ventilatory response to CO2, suggesting that these receptors are required for the normal ventilatory response to CO2. New pharmacological evidence is also presented that activation of RTN astrocytes is involved in purinergic signaling driving the RTN vasomotor responses. Overall these results advance the concept that specialized vasoreactivity to CO2/H+ in the RTN contributes to respiratory chemoreception.

      Essential Revisions

      1) Although authors are given leeway in the format of a Research Advance, this paper would benefit from more structure including delineation of Introduction, Results, and Discussion sections. The manuscript would be substantially improved in particular by including a more thorough, dedicated Discussion section with explicit elaboration on limitations of their experimental methods and conclusions, and including discussion of how the important P2Y2 receptor knockout and re-expression experiments represent a fundamental advance considering that the authors had already implicated (although not completely established) these receptors in their previous publication.

      2) Presentation of the RT-PCR data of purinergic receptor expression profiles can be improved, particularly by providing a more convincing validation of these data, such as supplemental data of raw numbers for GAPDH levels across areas to prove that GAPDH actually is a valid reference. The authors could also use 3-4 such genes, as many investigators do for expression profile calibration. The reviewers note that, for the argument, it is not necessarily that important how the levels of receptors look in relation to a housekeeping gene, but whether P2Y2 is the only receptor which is relatively highly expressed in RTN smooth muscle cells compared to other regions. Looking at Fig. 1B, it seems that relative to the two other areas, P2X1, P2X4 and P2Y14 are also much higher in RTN smooth muscle cells compared to NTS. The reviewers agree that an important aspect is the remarkably low expression of P2Y2 in endothelium, which in theory should oppose constriction by possibly releasing NO.

      3) Additional information on measurements of vascular diameters would be useful. Have the authors obtained measurements from multiple vessels at each time point in the chosen field(s) of view for individual experiments? If so, how do such measurements compare to the representative single vessel measurements for a given experiment presented in the figures? How many vessels per experiment are included in the group summary data? Please explain more completely why it was necessary to induce a 20-30% vasoconstriction by the thromboxane A2 receptor agonist before the measurements.

      4) Some additional validation of the specificity of the AAV2 used for the P2Y2 re-expression experiments would be helpful since this is not a well characterized virus and may lead to receptor overexpression. Additional nice clear images with proper co-localization would be good to see and additional details about non-smooth muscle cell expression should be provided.

      5) The experiments showing unstable breathing in vivo produced by injecting a thromboxane A2 receptor agonist vasoconstrictor (U46619) into the cNTS and ROb under conditions of mild hypercapnia (2-3% inspired CO2) are intriguing, but these experiments lack the proper control of U46619 injections into the cNTS and ROb under normocapnic conditions to determine if this alters blood pressure and produces breathing instabilities independent of any "gain-up" of RTN activity. It would also be of interest to know whether the authors have tested if larger instabilities occur with cNTS/ROb vasoconstriction at higher levels of hypercapnia.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 19 2020, follows.

      Summary

      Garcia-Marcos et al describe a method to study the activity of heterotrimeric G-proteins. These switches are usually activated via GPCRs and play very important roles in cellular signalling. Investigating their function is often difficult. Therefore the authors have designed an optogenetic tool that activates Gi proteins by blue light based on an engineered LOV2 domain. They demonstrate that activation is specific and that the dark state has a much lower affinity than the light state. The optimization is quite impressive. Overall, this is an interesting and useful tool but some experimental verifications are required.

      Essential Revisions

      1) Figure 1 shows binding of the G protein to permanently on or off mutant versions of LOV2GIV. Since the G protein is purified, abundant and bound to GST-LOV2GIV, why is it not visible in the ponceau S stained gel?

      2) This figure needs additional controls. Is the interaction with WT LOV2GIV induced by light as shown in the cartoon? Does the interaction lead to increased GTP binding, as shown in the cartoon? Is the binding blocked by GIV residues known to be important for G protein binding as shown in the cartoon structure? Whether or not these controls have been used in the past, they should be done here as well for this particular fusion.

      3) Figure 2A shows binding association (not dissociation as indicated) for the same constructs as in Figure 1. Figure 2B shows GTP hydrolysis but the function of GIV is to stimulate GTP binding, which is just as easy to measure. Again, this figure needs additional controls to show that it is activated by light and relies on key residues.

      4) Figures 3 and 4 show G protein activation in yeast and HEK293 cells. GIV leads to increased GTP binding, but the cell assays do not measure G-alpha-GTP signaling but rather measure release of G-beta-gamma. A direct assay for G-alpha-GTP should be used. The yeast legend and figure do not match, and the yeast assays in Figures 3 and 4 use different readouts when both could be used in parallel. A single concentration of a single agonist as a reference is not sufficient when the authors could easily do a concentration-response experiment with an antagonist as a negative control.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 5 2020, follows.

      Summary

      The current study builds on previous work from the same group published in eLife. That work focused on the mechanism that renders lateral line hair cells of pappaa mutants more susceptible to the ototoxin neomycin, and found that mitochondrial dysfunction was the underlying cause of neomycin susceptibility. The current study expands on the previous work and suggests that not only defects in mitochondria, but also in the ER, are involved in neomycin susceptibility. The authors use a variety of approaches including TEM, live imaging, pharmacology and RT-qPCR in the present study. Using TEM, the authors show that mitochondria-ER associations are more numerous. Furthermore, similar to disrupting mitochondrial calcium pharmacologically, disrupting ER calcium also renders Pappaa-deficient hair cells more susceptible to neomycin. The authors suggest that this ER dysfunction manifests in several ways. They use live imaging to show that in pappaa mutants, hair cells are unable to properly package neomycin into autophagosomes. In addition, via RT-qPCR they show that pappaa mutants have an increased unfolded protein response (UPR). Currently the relationship between all of these pathological issues is unclear, but this work does reveal additional mechanisms that could render loss of Pappaa detrimental to hair cell health. Although the work is well written and presented and statistically sound, several experiments are needed to strengthen the claims presented in this study.

      Essential Revisions

      1) Location of TEM micrographs in hair cell. The morphology of organelles can vary based on location within the cell. For example, in hair cells the ER near the nucleus can be distinct from the ER present near the contacts made with efferent neurons or afferent neurons. (https://pubmed.ncbi.nlm.nih.gov/1430341/; https://physoc.onlinelibrary.wiley.com/doi/10.1113/jphysiol.2013.267914).

      Can the authors indicate in what direction the sections were taken (apical-basal or transverse), where in the hair cells the sections were taken, and how they determined this location?

      2) Quantification of mitochondrial fragmentation. It is clear from the TEM cross sections that the mitochondria in hair cells (Figure 3 A) are quite different between pappaa mutants and controls. Whether mitochondrial or ER networks are present is not apparent from these TEM images. Nor is it entirely clear that the networks are fragmented. The authors use plugins developed for confocal imaging to estimate fragmentation based on circularity and area/perimeter measurements. It is unclear whether these measurements translate to hair cells or to TEM. In addition to fragmentation being unclear in the TEM images, the fragmented mitochondria in pappaa mutants are also hard to see in the live, max-projected mitoTimer images.

      The mitochondrial networks and fragmentation may be clearer, or better quantified, by acquiring super-resolution images of hair cells labeled with mitoTracker. In addition, it is possible that the fragmentation would be visible or more convincing in movies of Z-stacks of mitoTracker label than in the max-projected images provided.

      3) Examination of hair cell ER morphology. The previous work on Pappaa in zebrafish hair cells focused extensively on the mitochondria, while the current study shifts the focus to the hair cell ER. While the ER-mito distances are convincing, a more holistic picture of the amount or distribution of the ER in wildtype and mutant is lacking.

      This could be accomplished either using a transgenic line that labels the ER or a KDEL antibody (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4007406/).

      4) It qualitatively appears that pappaa mutant hair cells take up a greater quantity of fluorescent Neo faster than WT, i.e. the fluorescence intensity is greater in more hair cells. Did the authors quantify Neo-TR uptake?

      5) Specificity of the pharmacological treatments. The authors perform numerous pharmacological experiments to disturb ER calcium. The authors suggest that their pharmacological manipulations trigger hair cell death due to the alteration in the interplay between ER/mito calcium in hair cells. What concentration of either of these drugs does it take to kill WT hair cells? Dose-response curves comparing WT and mutant would help support the idea that the hair-cell death observed is a direct effect of the drugs on hair-cell ER-mitochondria calcium signaling.

      Pharmacology is non-cell autonomous, and the authors do not present evidence that these compounds specifically impact hair cell ER or mitochondrial calcium. Alternatively, these compounds could impact supporting cell ER (https://elifesciences.org/articles/52160) as well as the ER in the innervating afferent or efferent neurons.

      More direct evidence showing that hair cell mitochondrial or ER calcium is impacted by these treatments (measurements using mitoGCaMP, as in the previous study) would make the authors' claims more compelling.

      6) The disconnect between IGFR1 and results in the current study. The identity and location of IGFR1 and the IGFBP are still undefined in this system, and therefore it remains unclear exactly how IGFR1 or Pappaa impact sensory hair cells. In previous work on pappaa mutants (enhanced startle response, defects in photoreceptor synapse formation, defects in hair cell mitochondria) the role of IGFR1 in these processes was validated. In the current study, the link with IGFR1 is implied throughout.

      It is true that the relationship between IGFR1 and Pappaa is well characterized and that currently the only known substrates of Pappaa are IGFBPs. Despite this work, it is still possible, given the range of phenotypes in pappaa mutants, that Pappaa has other protein substrates that have not yet been identified, or has other biological functions unrelated to the IGF system.

      To verify the involvement of IGFR1 in the current study, the authors could use NBI-31772 to stimulate IGF1 bioavailability and test whether this rescues either the autophagy or UPR defects in pappaa mutants. Being able to rescue these phenotypes would also make the study more compelling.

      7) The authors state that there is not more spontaneous hair cell death in pappaa mutants compared to controls (line 443). Previous work in zebrafish has shown that Usher mutants (cdh23, ush1c, myo7a) also have an early UPR (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4007406/). Similar to pappaa mutants, Usher mutants have the same number of hair cells as controls, indicating no spontaneous hair cell death. Interestingly, however, Usher mutants do have more TUNEL-positive hair cells than controls, indicating that more hair cells in Usher mutants are in the process of apoptosis. Based on this new finding implicating the UPR in pappaa mutants, could hair cells in pappaa mutants, similar to those in Usher mutants, be more fragile (neomycin susceptible) because they are more likely to be in the process of apoptosis? A TUNEL label in pappaa mutants could reveal this. In addition, this paper on the UPR in Usher mutant hair cells could be a useful paper to add to the discussion.

      8) Line 445-451: "Together, these findings suggest that Pappaa may regulate ER-mitochondria associations by promoting ER homeostasis. It is important to note that the ER and mitochondria are engaged in a constant feedback loop." This line of reasoning seems rather circular, considering that the previous study showed Pappaa regulates mitochondrial function. If mitochondrial function is impaired, it seems likely that ER homeostasis would be disrupted as well.

      9) Methods: Overall, the methods section needs more detail. All experiments that were not previously performed by the author or the author's lab should have a concise description of what the authors did next to the reference (e.g. fish were imaged under Lab-Tek Chambered Coverglass (Fisher Scientific) where they were immobilized under a nylon mesh and two stainless-steel slice hold-downs (Warner Instruments) per Stawicki et al, 2014). A detailed description of how individual pappaa170 larvae used in experiments were genotyped is needed. A comprehensive description of how mitochondrial circularity was measured using the "mitochondrial morphology" plug-in in ImageJ is needed.
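
      As an illustration of the level of detail requested for the circularity measurement, the following minimal sketch computes the standard shape descriptor 4πA/P² from an object's area A and perimeter P. Whether the "mitochondrial morphology" plug-in uses exactly this definition is an assumption here, and the function name is hypothetical.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """Return 4*pi*area / perimeter**2 (1.0 for a perfect circle).

    Assumption: this is the conventional circularity descriptor; the
    plug-in referred to in the manuscript may define it differently.
    """
    if area <= 0 or perimeter <= 0:
        raise ValueError("area and perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# A circle of radius 1 gives 1.0; a unit square gives pi/4 (about 0.785).
circle_value = circularity(math.pi, 2.0 * math.pi)
square_value = circularity(1.0, 4.0)
```

      Values near 1 indicate rounded profiles and lower values indicate elongated or branched ones, which is why the descriptor is sensitive to how the perimeter is traced in TEM versus confocal images.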

      10) Statistics: How did the authors determine that the power of the experiments was sufficient to avoid Type I and Type II errors?
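
      One common way to answer such a question is an a priori sample-size calculation. The sketch below uses the normal approximation for a two-sided, two-group comparison of means, n = 2(z_{1-α/2} + z_{power})² / d² per group, where d is the standardized effect size. The function name and the example values are illustrative assumptions, not taken from the manuscript.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided two-sample test.

    Normal approximation; a t-based calculation gives slightly larger n.
    effect_size is the standardized difference (Cohen's d), assumed known.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Type I error control
    z_power = NormalDist().inv_cdf(power)              # 1 - Type II error
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Illustrative only: a medium effect (d = 0.5) at alpha = 0.05, power = 0.80.
n_medium = n_per_group(0.5)
```

      Reporting the assumed effect size, alpha, and target power alongside the resulting group sizes would let readers judge whether non-significant comparisons are informative.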

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 16 2020, follows.

      Summary

      The manuscript entitled "Increasing heart vascularisation using brain natriuretic peptide stimulation of endothelial and WT1+ epicardial cells" by Li et al. reports data of myocardial angiogenesis in mice subjected to experimental myocardial infarction. The study indicates that repeated intraperitoneal injections of synthetic BNP or oral treatment with Entresto, a drug inhibiting neprilysin-mediated degradation of the endogenous natriuretic peptides, possibly improves cardiac vascularization after ischemia.

      Microvascular dysfunction after acute myocardial infarction (MI) is a major clinical problem. Although primary percutaneous coronary intervention (PCI) has markedly improved patients' survival, more than 30% of patients show signs of microvascular dysfunction despite epicardial reperfusion, leading to adverse left ventricular remodeling and heart failure. Impaired angiogenesis can contribute to myocardial tissue damage. Based on experimental studies, several clinical trials aimed to improve myocardial angiogenesis via intracoronary administration of vascular growth factors, gene transfer or bone marrow mononuclear cells in patients who had successful primary PCI, but the results were disappointing. A better knowledge of the cellular pathways regulating myocardial (re)perfusion after ischemia is necessary to search for therapeutic strategies capable of restoring the microvascular network and flow. The present study aimed to elucidate whether B-type natriuretic peptide (BNP) can improve myocardial postischemic angiogenesis, as well as the potential pharmacological and therapeutic implications. Hence, this experimental, "preclinical" study addresses an important, clinically relevant question.

      Overall, this study pursues a very original question. But it includes many different data sets in a somewhat incomplete way, many of them generated with "NMC". It would benefit a lot from concentrating cleanly on some concrete aspects. Mechanistic studies should preferably be conducted with sorted or cultured endothelial cells instead of a mixed cell population (NMC, containing fibroblasts, pericytes and inflammatory cells besides endothelial cells).

      Essential Revisions

      1) Certain parts of the study should be completed. For example, why don't the authors present a fine and extensive analysis of cardiac function in animals treated with BNP? In the same way, the authors should complement their experimental approaches with an analysis of all parameters of cardiac remodeling and in particular infarct size and interstitial fibrosis.

      2) Conversely, the authors made the effort to analyze cardiac function in animals treated with LCZ696 (Figure 9). However, is there no statistical analysis of these data, or are the differences not significant? In the latter case, what is the interest of a treatment that increases capillary density without modifying cardiac function? It is, however, likely that an analysis of cardiac function beyond 10 days post-MI could give significantly different results.

      3) The authors should analyze whether or not LCZ696 directly stimulates the proliferation of resident mature endothelial cells and/or that of WT1+ cells.

      4) Results, page 5, para 1: the authors state that "first they determined whether ip BNP acted directly or indirectly on cardiac cells". But there is not a single data set in this manuscript that allows the conclusion that the observed effects derive directly from endothelial actions of BNP. As they mention before, BNP acts on many types of cells and organs, and the observed effects could also be "indirect".

      5) Results, page 5, para 3: plasma cGMP levels are a poor index of cardiac actions of BNP. It would be more meaningful to measure cardiac cGMP levels.

      6) Results, page 5, para 4: it is strange to use the phosphorylation of phospholamban (PLB) as an index of BNP activity. This manuscript focuses on angiogenesis. PLB is a regulatory protein in cardiomyocytes. Where is the link to endothelial regeneration?

      7) Page 6, top: BNP increased phosphorylation of PLB by nearly 200-fold in "non-myocyte cells" from the heart. Which cells are these? Is this fraction contaminated by cardiomyocytes? Which non-myocytes have such high PLB levels?

      8) What were the BNP plasma levels in BNP-treated versus vehicle-treated mice after MI? Did Entresto increase BNP plasma levels, and to what extent?

      9) Most in vitro and ex vivo studies were performed with NMCs. How many endothelial cells are contained in such heterogeneous populations?

      10) Some basic parameters are missing: how did BNP administration affect cardiac contractile functions as well as the infarct area and area at risk? Did exogenous BNP lower arterial blood pressure?

      11) Page 9: how does BNP, via NPR-A/cGMP-signaling, increase MAPK pp38? What is the signaling pathway and do the authors have any hint that this signaling pathway was also activated by BNP in vivo (in endothelial cells in situ)?

      12) Page 11 presents a section entitled "Increased vascularization in infarcted hearts after LCZ696 treatment". But in the corresponding Figure 9, there is not a single data set showing "statistically significant effects of Entresto". The figure just shows some preliminary data and trends obtained with very few mice.

      13) Figure 1C: it is surprising that the basal levels of pPLB were so low (-). Normally, after MI in mice the endogenous ventricular expression levels of ANP and BNP rise significantly. Was there a difference in pPLB between sham and MI mice (vehicle treatments)?

      14) Figure 1D: which types of non-myocyte cells express such high pPLB levels and what is the functional meaning?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 9 2020, follows.

      Summary

      Overall this is an interesting paper whose scRNA-seq dataset and experimental analysis of phenotypes provide a valuable resource for investigating gene expression differences associated with key phases of skin development and repair. The enhancement of HF regeneration upon Lef1 overexpression is a striking result and will be of general interest to many fields including developmental, stem cell, and epithelial biologists. The work is well conducted, the results are new and significant for skin wound healing and HF regeneration, and in sum it is a good fit for eLife.

      Essential Revisions

      The overall tone of all reviewers is enthusiastic and favorable; however, very important points were raised:

      1) Dermo1-Cre seems not specific to fibroblasts (and it is non-inducible). Ideally this should be addressed by using an inducible/more specific Cre mouse line. However, as the enhancement of HF regeneration is an exciting finding by itself and a new mouse model is likely out of scope of a revision, this point could be addressed textually by changing the conclusions to reference stromal cells instead of fibroblasts specifically.

      2) The interpretation of the scRNA data should be bolstered with additional analyses. It is important for the authors to revisit the data and figures (including making some improved analysis), and carefully state the actual results and conclusions supporting their claims and following next steps in the manuscript.

      a) The scRNA-seq analysis was superficial in relation to regeneration versus repair, especially the comparison of the time points that model regeneration and scarring. Does velocity analysis predict Lef1, or other genes, driving differentiation of one population of fibroblasts into a papillary fibroblast or DP-like state? Do multiple fibroblast subsets follow this trajectory? How do these findings compare between the two wounding time points? Does gene ontology suggest differences within one subcluster of fibroblasts between two conditions, or are the major differences in the gene expression profile/function associated with each subcluster? A more complete analysis of this could shed more light on the involvement of fibroblast lineages in regenerative versus reparative healing.

      b) From the scRNA-seq analysis the authors state "we identify Developing papillary fibroblasts as a transient cell population that is defined by Lef1 expression.", but this is not clear from the scRNA-seq analysis. In Figure S2, Lef1 expression seems to be largely excluded from cells within the Dpp4 expression cluster (cluster 2) and the Dkk1 cluster (cluster 0), which define the major papillary FB clusters. Can the authors expand upon how the velocity analysis identifies different genes than overlaying relative expression levels on the UMAPs?

      3) Regarding the claim of a transient papillary fibroblast population (which is an important part of the paper), several points are unclear.

      For example, the authors could/should explore the fibroblast populations of all conditions to compare regeneration vs scarring and regeneration vs development (e.g. R2Q2, R3Q5).

      Which of the two papillary fibroblast population(s) is/are transient? How to explain the rather minor overlap of Lef1 expression with these two papillary fib populations? Where are the two populations in situ in developing and in regenerating skin?

      4) Given that the WIHN generates a significant amount of cysts, the authors have to tone down their statement of "without adverse phenotype". As the authors also refer to Hedgehog-pathway induced de novo HF formation (a model giving rise to tumors and new HFs), they likely meant that their model does not induce apparent tumors (the cysts look different compared to the obvious BCC-like lesions induced through Hh-pathway activation). However, the authors entirely neglect to mention that the mice apparently develop cysts in addition to HFs in wounds.

      Figure 5e,g,j. The regenerated HFs appear very abnormal and cyst-like. The authors state several times in the paper that Lef1 overexpression enhances regeneration without other adverse phenotypes, but these regenerated structures are very abnormal. Are they cancerous? P90 wounds appear to generate a significant amount of cysts; is this representative for all conditions or something more specific to the P90 timepoint?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on August 7 2020, follows.

      Summary

      This manuscript uncovers an unexpected role for myogenin in muscle stem/progenitor cells in adult zebrafish. Further analysis of a previously characterised mutant makes novel contributions to the field of muscle growth. The authors show that Myog helps keep MuSCs quiescent and provide mechanistic insights into how Myog controls MuSC activation. Intriguingly, their work suggests that Myog mutants have increased differentiation markers compared to wild-type siblings. They also offer new models for MuSC positioning within a fiber.

      Essential Revisions

      1) In their previous publication (Ganassi 2018), the authors showed that the differentiation index of cultured adult myoblasts does not differ between WT and myog-/-, but in the current study myog mutant cells show a much higher degree of differentiation compared to their wild-type siblings. Please clarify why these findings differ.

      2) The authors claim that the pax7a:GFP+VE cells represent bona fide MuSCs, which can only be determined by co-labelling of GFP with an anti-Pax7 (or anti-Pax3/7) antibody. Although the authors do provide this co-label in one panel, they only show one cell per genotype. Please provide quantification of how often the GFP+VE cells are also positive for the anti-Pax3/7 antibody. Unless this co-label is extremely common in both genotypes, or confirmed in each experiment, the authors should soften their language about the GFP-expressing cells being verified MuSCs.

      3) Related to the previous point: The authors show that MPCs behave differently in culture and although they show increased pax7a and pax7b expression they also express higher levels of differentiation markers and enter terminal differentiation. This is puzzling and inconsistent with the reduced number of myonuclei and smaller myofibres that are seen in the mutant fish. Furthermore, it is not clear how the increased number of MuSCs per fibre is reached. A plausible explanation for both observations (fewer myonuclei/smaller fibres & more MPCs), is that these cells are myocytes that do not fuse efficiently. The authors raise this possibility in the discussion, however, this should be better assessed and either excluded or supported.

      4) The surface area domain size measurement in Figure 1 is a strange proxy for myonuclear domain, which is best thought of as a volume, as shown in Figure 2. However, Figure 2 omits all pax7:GFP+VE cells, some of which may have fused recently enough to retain their GFP label. Please replace the SADS calculation in Figure 1 with a volumetric calculation. This will be important for interpreting and comparing the two findings.

      5) Culture of mononucleated MPCs from plated fibres was used to investigate whether lack of Myog enhanced MPC proliferation. The relative proliferation rates were not significantly different; however, EdU pulse experiments suggest that mutant MPCs enter S-phase more readily. Overall the authors suggest that lack of myog accelerates MuSC transition into the proliferation phase. At present this is not supported convincingly. Indeed, the data show reduced proliferation and AUC (4E) in mutants. An additional EdU pulse at an earlier time after plating (day 1) should be included to potentially strengthen this idea. Alternatively, the statement should be modified and toned down.

      6) The authors want to assess whether there is an earlier onset of differentiation of MPCs in culture. However, they only show expression of mef2d and mylpfa at day 3 (Fig 4H); day 2 should be included here as well. Overall the differentiation index of MPCs is increased; can the authors comment on whether the cells remain mononucleated?

      7) In the discussion, the paragraph (292 ff) regarding the niche is very speculative and should be toned down/amended. In particular, the conclusion (line 318) that Myog is required for assembly of the MTJ MuSC:niche complex is not well supported; there are no MTJ markers shown.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 6 2020, follows.

      Summary

      Sando et al. extend on previous work by the same lab to delineate the neuronal mechanisms that control UV-light / ROS suppression of feeding and evoked spitting behaviors. They provide a nice characterization of pharyngeal behaviors that are involved in feeding and spitting, showing that upon UV-light stimulation feeding pumps are modulated to evoke spitting instead. M1 neurons are central to the spitting reflex; they sense light, integrate inputs from light sensitive I2 and I4 neurons and transmit the information to the anterior pharyngeal muscles pm1/2 and the anterior part of pm3. The conceptual advances of this paper are twofold:

      1) The hourglass circuit motif as a means to transform ingestion movements into spits.

      2) Local activation of pm3 muscles via a compartmentalized calcium signal that ensures opening of only the anterior part of the alimentary tract.

      Most of the behavioral experiments are well done and the paper could be of potential interest to a broad audience. However, the reviewers raised some concerns that should be addressed prior to publication in eLife.

      Essential Revisions

      1) A major concern is that none of the three reviewers is convinced that the data presented here support the conclusion of local calcium dynamics in the anterior pm3 muscles. Since this is one of the major aspects of this study, it is essential to provide more experimental evidence. The authors used a pan-pharyngeal driver to express GCaMP. The imaging resolution does not seem good enough to distinguish calcium transients in pm1/2/3, and the most straightforward interpretation of the results is that the anterior calcium transients are derived from pm1/2 but not pm3. The conclusion otherwise seems to rest on the claim that pm3 is sufficient for spitting and that, in the absence of pm1/2, local contraction of pm3 is the only way to hold the valve open during expulsion. The same applies to Fig 4F.

      To substantiate the claim, these experiments should be repeated using a pm3 specific driver.

      Alternatively, if pm3 specific drivers are not available, the experiments could be repeated upon laser ablation of pm1/2, to ensure that the signals are indeed specifically derived from pm3.

      Perhaps, if imaging resolution and interference by emission light scattering permits, an overlay of a good DIC with GCaMP fluorescence may settle this more easily since pm3 stops at the base of the buccal cavity whereas pm1/2 line the cavity.

      Individual recording traces of the different regions along with ethograms of the pharyngeal behaviors should be shown.

      2) The authors use a calcium imaging assay in immobilized worms to record UV-light-evoked muscle and pharyngeal neuron activity. While pumping and spitting behaviors occur at a frequency of up to 5 Hz in the behavioral assays (e.g. Fig 1D,E), calcium dynamics in muscle and neurons were observed at time scales 1-2 orders of magnitude slower (e.g. Fig 1 H,I; Fig 4H-M). However, the authors state that these dynamics match well the time scale at which light-evoked pumps are observed. This is confusing. While it is possible that pharyngeal neurons encode the rate of pumping/spitting, muscle activity should correspond to the motor rhythms.

      What is the pumping rate under the imaging/immobilization conditions? Do the animals spit? The behaviors under imaging conditions need to be better characterized and documented.

      Individual traces should be shown throughout (like Fig 4H), importantly next to ethograms of pharyngeal behaviors.

      The image acquisition rate should be stated in the methods. Was this also 2 Hz, like the flickering rate?

      Only with this information at hand is it possible to properly interpret the imaging results. Are the measurements confounded by a low acquisition rate and the slow on/off kinetics of GCaMP, or do light-evoked pharyngeal behaviors occur at such a slow frequency in immobilized worms?

      3) The purported movements of the metastomal filter appear to be based solely on the observation of particle flow with a particular concentration and size of beads. At times this may be misleading. For example, the authors report that 25% of normal pumps are associated with openings of the metastomal filter. However, it is possible that the beads do not always become jammed in the buccal cavity, even if the metastomal flaps remain in position. Direct imaging of the metastomal flaps would address this question; if this is not possible the limitations of the assay should at least be acknowledged.

      4) The opening of the metastomal flaps during spitting is interpreted as a "rinsing" of its mouth "in response to a bad taste". This interpretation is problematic since the animal is "rinsing" its mouth with the same particles that have presumably induced the spitting. It would make more sense if the animal increased rather than decreased the selectivity of the metastomal filter; this would allow water to enter the pharynx while excluding potentially toxic particles. If the authors insist on their interpretation, they should at least discuss this issue.

      5) Line 183 - What is the basis for believing the sufficiency of pm3 is based on "contraction of a subcellular region"? And Line 188 - where is this "uncoupling" shown? There are few figures/data here. Is it deduced that this must be so because the pharyngeal valve is open while the lumen closes during spitting? Is local contraction of pm3 the only possible explanation for this? In the WT condition, for example, could pm1 and/or pm2 contraction overcome a global relaxation of pm3 to hold the valve open during lumen closing? Although spitting apparently persists after ablation of pm1/pm2, these events should be documented in the same detail as WT events to demonstrate that pm3 is truly sufficient for "normal" spitting (i.e. continued pumping of lumen while the valve and filter are held open, local Ca++ events in anterior portion of pm3). This section seems to take a leap to a precise muscle mechanism based only on the ablation.

      6) At the cellular level, the authors note that calcium waves in muscle can cause local contraction patterns that lead to peristalsis, but that their observations seem to be of a different kind in terms of spatial and temporal patterning (long sustained local Ca++/contraction in one domain while rhythmic Ca/contraction occur in another domain). How input strength might create such a pattern is difficult to envision, given the simplicity of the M1 pm3 innervation pattern. What is the proposed cellular mechanism here?

      7) Figure 4J-L: these panels lack quantifications. Please show also individual traces; is the little initial bump in lite-1 mutants' response consistent across multiple recordings? Is the reduction in lite-1;gur-3 statistically significant?

      Why is this initial transient signal so much stronger when gur-3 is expressed in I2 in the double mutants (Fig 5D)?

      8) Line 422-424: this statement is not supported by data in Fig 6B-F; only I4 ablated animals show a robust defect and there is no synergistic effect in the double ablation.

      9) Fig 6G: this result lacks quantifications. Appropriate statistics should be performed. Show also individual traces.

      10) Line 210 - "data not shown"... the correlation between spatially-restricted contraction / Ca++ signals and spitting is a central claim of the paper...it needs to be quantitatively documented in a figure.

      11) Line 104 - Is the experimenter blinded to strain/condition? If not, what steps were taken to detect or correct experimenter bias? This is a major pitfall of manual behavior coding.
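
      The question in point 2 about whether a low acquisition rate could distort the apparent calcium dynamics can be made concrete with a short Python sketch of frequency folding (aliasing). The 2 Hz acquisition rate used here is hypothetical, since the review only asks whether that was the rate, and the sketch ignores GCaMP kinetics entirely; the function name is illustrative.

```python
def apparent_frequency(signal_hz, sampling_hz):
    """Frequency at which a periodic signal appears after sampling (aliasing)."""
    folded = signal_hz % sampling_hz
    # Frequencies above the Nyquist limit (sampling_hz / 2) fold back below it.
    return min(folded, sampling_hz - folded)

# Hypothetical case: 5 Hz pumping imaged at 2 Hz.
print(apparent_frequency(5.0, 2.0))
# The same rhythm imaged at 20 Hz, well above twice the signal frequency.
print(apparent_frequency(5.0, 20.0))
```

      If the acquisition rate were indeed below twice the pumping frequency, a fast motor rhythm would show up as a slower oscillation, which is one reason the reviewers' request for the acquisition rate matters for interpreting the traces.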

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 27 2020, follows.

      Summary

      This work describes a novel approach to address the important and still open question of the extent of negative selection in cancer and the potential implications. The authors use data from the catalogue of somatic mutations (COSMIC) and a straightforward approach comparing synonymous, nonsynonymous and nonsense mutation counts to separate genes into Oncogenes, Tumor suppressors and Essential genes. The authors conclude that negative selection plays an important role during tumor evolution.

      Essential Revisions

      The reviewers agreed that this work is timely and relevant, but also agreed that there are several important aspects that need revision/improvement before it can be accepted for publication in eLife.

      Structure of the paper:

      1) The reviewers agreed that there are various aspects of the structure of the paper that require especial attention. The introduction is a bit lengthy and very focused. It introduces different questions, e.g. hallmarks, prediction of oncogenes and tumor suppressors, prediction of selection, etc and it reads like multiple introductions to different articles. Many parts (e.g. the discussion of cancer hallmarks) could be shortened substantially, which would make it easier to read the paper. One suggestion is to mainly introduce the models of cancer evolution with respect to the SNVs and indels, and the different models and limitations in the estimation of negative selection in cancer and why it is difficult to detect, see e.g. (Zapata et al. 2018, Lopez et al. 2020, Tilk et al. 2019).

      2) Additionally, it will be important to include citations to previous work on the detection of negative selection in cancer that has been omitted. For example, in Line 353 they should add the work from (Zapata et al. 2018, Van den Eynden et al. 2017, Martincorena et al. 2017, Pyatnitskiy et al. 2015).

      3) Both reviewers agreed that the Results section is repetitive and unbalanced with respect to the Methods section. The work would benefit from streamlining the Results part and moving details to the Methods section.

      4) The discussion is also very lengthy and lacks focus. The authors should make the main results and take-home messages of their work clearer. At the moment, these are not very clear.

      5) For simplicity and to improve readability of the manuscript, it was suggested that the authors focus on 2 standard deviations throughout the manuscript, instead of repeatedly describing the results with both 1SD and 2SD.

      6) Regarding the presentation of the results, the reviewers suggested redesigning the figures so that they describe the methodological approach, present the major results of the analysis, show a comparison of these results with previous methods, and lastly show the association between the identified genes and the hallmarks of cancer (currently presented as a table).

      Comparisons with previous studies:

      7) One of the problems with the present work raised by the reviewers is that the authors did not perform sufficient comparisons of their results with previous studies. The authors used a seemingly simple approach to measure selection, dividing the frequencies of different mutation classes by each other, with relatively arbitrary cutoffs, e.g. 1 or 2 standard deviations from the mean, to define gene sets. The manuscript does not show the advantages of this method over previous approaches. The authors should clearly demonstrate the advantage of their approach by comparing it with previous approaches.

      8) The authors should also compare their results with previous publications. One of them, which is cited in the manuscript, is Weghorn & Sunyaev. In fact, this work seems to be misquoted. The authors claim that Weghorn & Sunyaev "identified 147 genes with strong negative selection" (line 371), but that study in fact found very few genes under significant negative selection (<10 applying a q-value cutoff of 0.1), and Weghorn & Sunyaev concluded that "the signal of negative selection is very subtle". Zapata et al. 2018 identified stronger signals of negative selection. The identified genes and functions were partly the same as in the work presented here (e.g. GLUT1). The authors should compare their results to these and other previous results.

      9) Furthermore, there is recent evidence that correcting for mutational signatures and nucleotide-context composition has a large impact when quantifying selection (see e.g. Zapata et al. 2018, van den Eynden et al. 2017, Martincorena et al, 2017), and this is a relevant aspect in the current lines of discussion in the context of negative selection in tumor evolution (see for example Van den Eynden et al. Nature Genetics. 2019). The authors should show that their main observations hold when the mutational signatures and/or trinucleotide context is taken into account.

      10) Related to this, the authors described a clustering-based method to detect genes that deviate from an average proportion of mutations (nonsynonymous, nonsense and synonymous) to infer selection. However, by only using the observed mutations (nonsyn, syn, nonsense), the underlying base-pair composition is ignored. Genes that have a high likelihood of acquiring nonsense mutations will show a deviation from the rest of the genes due to their composition and not due to selection. The authors should recalculate their metrics by performing this correction before reaching the conclusion on the number and identity of the genes.

      Use of controls:

      11) The reviewers also indicated the lack of sufficient controls. To improve the robustness of their method, it was suggested to assess the results after varying several of the conditions. For instance, to circumvent the limitation of the lack of mutations to detect negative selection, the authors study only transcripts with more than 100 mutations. The authors should compare their results using different cut-offs for the minimum number of mutations (50,100,500), and check the performance of their method and whether their results are robust.

      12) Another variation that the authors should consider is stratifying the data by tumor type and mutation burden, since mixing samples with different evolutionary histories might confound the signal of negative selection. As an additional control, a reviewer suggested performing the same analyses using germline mutations to separate the genes into cancer-specific or cell-essential.

      13) An additional control to be performed by the authors was related to the origin of the mutations. The file CosmicMutantExport.tsv contains both mutation data from targeted and genome- / exome-wide screens. Targeted data should be excluded (if the authors didn't do so already). Otherwise their analysis will be highly biased towards well characterized cancer genes.

      Statistical tests:

      14) The reviewers also agreed that there is a general lack of statistical tests in the results. For instance, "the mean parameters of TSGs differ markedly from those of passenger genes in that rNS and rNM values are higher" (line 529), but these comparisons should be done with appropriate statistical tests to assess the significance. Similar tests should be performed throughout the manuscript.

      15) A very interesting idea in the paper highlighted by the reviewers is that by combining their proposed metrics they can differentiate between oncogenes and tumor suppressors. It would be convenient to have a visual interpretation on how different genes can be only oncogenic, only tumor suppressors, or both, depending on which sites are hit. It is important to note though that similar classifiers have been developed (Schroeder et al. 2014), so it would strengthen the claims of the study to provide a comparison with those methods.
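
      The composition effect described in point 10 can be sketched in Python: genes differ in how many single-nucleotide substitutions can create a stop codon, so the expected nonsense proportion differs between genes even without selection. The sketch below assumes every substitution is equally likely, which is a deliberate simplification made for illustration; it does not model mutational signatures or trinucleotide context, and the function names and toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"

def nonsense_opportunities(cds):
    """Count single-nucleotide substitutions that turn a sense codon into a stop codon."""
    count = 0
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        if codon in STOP_CODONS:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in STOP_CODONS:
                    count += 1
    return count

def expected_nonsense_fraction(cds):
    """Fraction of all possible substitutions that are nonsense, assuming uniform rates."""
    usable = len(cds) - len(cds) % 3
    return nonsense_opportunities(cds) / (3 * usable)

# Toy sequences: one rich in codons close to a stop, one with none.
print(expected_nonsense_fraction("TGGCAGTGG"))
print(expected_nonsense_fraction("GCTGCTGCT"))
```

      Comparing an observed nonsense proportion against a per-gene expectation of this kind, rather than against the average across all genes, is the sort of correction the reviewers ask for.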

    1. Reviewer #3

      The focus of the manuscript by Nicolas-Boluda et al. is timely, as it has been shown by this team and by others that dense collagen fibers and other features of the matrix architecture surrounding tumors may form a barrier to T cell infiltration into solid tumors. Despite the authors' claims, however, the data in this manuscript fall short of definitively demonstrating that response to anti-PD-1 therapy and T cell migration into tumors are improved upon reduction of collagen cross-linking. I have a number of concerns that would require additional substantive experiments to be adequately addressed. Below I list major and minor points that should be addressed before further consideration for publication.

      Major points:

      1) BAPN is used as a covalent inhibitor of LOX activity; however, the authors provide no evidence that the drug is having the expected effects in vivo. In order to draw specific conclusions from these studies, the authors would need to provide measurements of collagen cross-links (DHLNL, PYP, DPD).

      2) Imbalance between the mechanical characterization of multiple tumor models with little space for defining the effect of tumor stiffness on anti-PD-1 efficacy and T cell distribution, motility and activation.

      3) Rationale for selected tumor models relative to human tumors was unclear.

      4) Sample sizes, # independent experiments and statistical analyses were inadequate across multiple figures.

      5) Measurements of stiffness, collagen structure and T cell speed should be provided for all treatment conditions (control, LOXi, PD1i and combo) rather than just for LOX inhibition.

      6) LOX inhibition was performed in a preventive setting. Do the authors think LOX inhibition would be as effective in changing tumor stiffness and matrix architecture if the treatment started at the same time point as anti-PD-1?

      7) In Figure 1 the correlation of tissue stiffness/collagen accumulation with tumor volume in clinical samples should be provided in order to attribute collagen cross-linking to tumor progression.

      8) The efficacy data in Figure 6 should be accompanied by survival data.

    2. Reviewer #2

      In this manuscript, the authors provide a thorough analysis of the ECM architecture and stiffness in 4 murine tumor models. They then attempt to correlate ECM architecture and mechanics with T-cell migration and PD-1 efficacy. Substantive concerns are as follows:

      1) The study is highly correlative, with sample sizes too small to be conclusive. The authors attempt to draw conclusions about when stiffness does and does not affect migration by interpreting data across 4 very different tumor types. In two tumors the migration changes with BAPN and in two it does not. It is not possible to draw a conclusion based on 2 points.

      2) Data regarding the relationship between collagen organization and stiffness has been reported previously (as cited by the authors).

      3) Sirius Red staining is referred to and described in the text, but no images are shown. Likewise, no SWE images are provided to show the relative heterogeneity described in the text. This is important since so many of the conclusions rest on these data.

      4) The results section discussing figure 1 emphasizes heterogeneity in stiffness, however none of the data shown depict spatial stiffness heterogeneities.

      5) The rationale for the choice of cancer models is not clear.

      6) Why is mPDAC measured and reported differently in figure 2A than the other tumor types?

      7) Why is 40kPa chosen as the cut-off for "stiff?"

      8) Mean-squared displacement, rather than "straightness", is the more appropriate (and more conventional) metric to describe cell paths.

      9) How many cells were studied for each parameter in each condition in Table 2?

      10) The authors study migration of cells on slices, but isn't the more appropriate metric to study cell invasion into the tissue?
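
      The two track metrics contrasted in point 8 can be compared with a minimal Python sketch. The definitions below are common textbook forms and are an assumption here, since the review does not say how the manuscript computes "straightness"; the toy tracks are hypothetical.

```python
import math

def mean_squared_displacement(track, lag):
    """Average squared displacement over all pairs of positions `lag` steps apart."""
    squared = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for (x1, y1), (x2, y2) in zip(track, track[lag:])
    ]
    return sum(squared) / len(squared)

def straightness(track):
    """Net displacement divided by total path length (1.0 for a straight line)."""
    path_length = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    return math.dist(track[0], track[-1]) / path_length

# Toy tracks with identical step size: one directed, one oscillating in place.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 0), (0, 0), (1, 0)]

for lag in (1, 2, 3):
    print(lag, mean_squared_displacement(straight, lag), mean_squared_displacement(zigzag, lag))
print(straightness(straight), straightness(zigzag))
```

      Straightness collapses each track to a single number, whereas mean-squared displacement as a function of lag shows how displacement scales with time, which is what distinguishes directed from confined motion.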

    3. Reviewer #1

      In their article entitled "Tumor stiffening reversion through collagen crosslinking inhibition improves T cell migration and anti-PD-1 treatment" Alba Nicolas-Boluda and co-authors analyze the stiffness and collagen distribution in different tumor models implanted in mice. They show that treatment with an inhibitor of collagen crosslinking modifies the collagen network in these tumors and that this correlates with changes in their stiffness. They then analyze the motility of T cells in the different models and show that this motility is modified by the treatment and correlates with the stiffness of the tumor. In the last part of their study, the authors show that treatment of the mice with the inhibitor of collagen crosslinking changes the immune infiltrates in the tumors characterized by a more abundant presence of CD8+ T cells. They finally show that interfering with collagen stabilization leads to increased efficacy of anti-PD-1 blockade on tumor growth.

      Relevance of the study: T cells are excluded from a large proportion of solid tumors. This represents an obstacle to T-cell-based immunotherapies. The authors hypothesize that this can be, at least partly, due to the organization of the ECM in the tumor, which would oppose physical resistance to the infiltration and migration of T cells. The results are sound and important for the community since the authors 1) thoroughly describe some of the mechanical aspects of several models used in the literature, 2) thoroughly analyze the effect of an inhibitor of collagen crosslinking on these mechanical properties, 3) study the effects of these modifications on T cell motility, and 4) test in one tumor model the effects of combining an inhibitor of collagen crosslinking with anti-PD1 immunotherapy. The results are convincing and I only have minor concerns.

      In the first part of their study, the authors analyze the structure heterogeneity of 5 different carcinomas, i.e. subcutaneous model of cholangiocarcinoma (EGI-1), subcutaneous (MET-1) and transgenic model (MMTV-PyMT) of mouse breast carcinoma, orthotopic (mPDAC) and subcutaneous (KPC) models of mouse pancreatic ductal adenocarcinoma.

      They measure the tumor stiffness during tumor growth using Shear Wave Elastography (SWE) and analyze the organization of the collagen fibers in these models. To my knowledge, this represents the first characterization of different tumor models classically used to study tumor immunity and is thus very useful for the scientific community. In particular, the authors show a correlation between high tumor stiffness and accumulation of thick and densely packed collagen fibers.

      Minor modifications: The authors should indicate more clearly the number of mice and tumors investigated.

      In the second part of their study, the authors treat the mice with beta-aminopropionitrile (BAPN), an inhibitor for LOX enzymatic activity in the drinking water and analyze the stiffness of tumors and collagen fiber organization in tumors. They show the heterogeneity of response in the different models in both stiffness modulation and collagen fibers remodeling. Mostly this treatment reduces the stiffness of tumors without affecting their growth.

      Minor modifications: The authors should clarify how the "normalized tumor stiffness" indicated in the legend of figure 2 is calculated. This is an important point since tumor stiffness is associated with tumor size. Moreover, they should also indicate more clearly the number of mice and tumors investigated. Concerning collagen fiber orientation, the authors should use a dot plot representation instead of bar histograms in order to show the distribution in the different tumors.

      The authors then analyze how BAPN treatment modifies the migration of T lymphocytes in the tumors. Because of the different models used, the authors either added activated purified T cells from human donors (EGI-1 model), or mouse activated T cells (MMTV-PyMT tumor model) or followed the motility of human resident T cells in mPDAC and KPC mice tumor models. Although the models are very different, the correlation between tumor stiffness and T cell speed and T cell displacement is especially striking in tumors from BAPN-treated mice. It seems that T cell motility follows two different regimes in tumors from untreated and BAPN-treated mice. This might be due to the difference in stiffness between untreated and treated mice but might also result from another parameter.

      Minor modifications: The authors should discuss this point. Indeed, the main conclusion of their work and short title of their study is that the main parameter involved in T cell motility and access to the tumor is tumor stiffness but then the slopes should be the same as in the spontaneous MMTV-PyMT tumor model. There are probably other parameters involved in the regulation.

      The authors then investigate the effect of BAPN treatment of tumor-bearing mice on the response to PD-1 immunotherapy. They perform experiments in KPC tumor-bearing mice and show that BAPN treatment alone significantly decreases the number of neutrophils and increases the presence of MHCII+ TAMs. Yet, the combined therapy (BAPN and PD-1) is necessary to expand the percentage of GrzmB CD8+ T cells and the ratio of CD8+ to Treg cells, and is associated with an increase in cytokine production. The combined treatment also leads to a decrease in tumor size. Although these results are convincing as they are, confirmation in another model would strengthen them.

    4. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 12 2020, follows.

      Summary

      The work analyzes the stiffness and collagen distribution in different tumor models implanted in mice and shows that treatment with an inhibitor of collagen crosslinking correlates with changes in their stiffness. This results in a change in the motility of resident T cells. The inhibitor of collagen crosslinking increases the number of tumor-infiltrating CD8+ T cells and leads to increased efficacy of anti-PD-1 blockade on tumor growth. The reviewers have discussed the reviews with one another and the Reviewing Editor and their views concur. Although the work has potential for publication in eLife, it requires essential additional data and statistics to support the central claims of the paper. Each reviewer raised substantive concerns (see below) that need to be resolved experimentally. To quote a few, you should provide a measurement of the collagen crosslinking in mice treated with BAPN to confirm that this drug has the expected effects. The combined BAPN plus anti-PD-1 therapy also needs to be confirmed in another model. Measurements of stiffness, collagen structure and T cell speed should be provided for all treatment conditions (control, LOXi, PD1i and combo) rather than just for LOX inhibition. Importantly, several important conclusions rest on sample sizes too small to be conclusive (see below). Along that line, the number of mice and tumor cells plus corresponding statistics need to be indicated in all the figures.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 19 2020, follows.

      Summary

      This manuscript describes a laboratory evolution experiment designed to explore effects that may shape evolutionary trajectories in a native host environment. The model system is E. coli nitro/quinone reductase NfsA, a promiscuous FMN-dependent oxidoreductase that reduces toxic compounds and has the basal ability to reduce the antibiotic chloramphenicol. This function was used to select for improved detoxification by mass-mutagenizing eight active-site residues and isolating variants with up to tenfold higher tolerance against chloramphenicol. The five best variant proteins were purified and characterized, showing that their kcat/Km was only marginally improved, with worse kcat but improved Km, indicating that the improvements in detoxification were driven by enhanced substrate affinity. For the top two variants, all possible evolutionary trajectories were recreated and their EC50's tested to determine the most likely possible step-wise paths from NfsA to the final variants. The authors found that iterative evolutionary strategies could have generated similar variants, but that there were only few accessible pathways, indicating epistatic effects. The analysis also showed that for both variants, elimination of arginine at position 225 in the first step enabled further improvements to take hold and played a role in the loss of wildtype 1,4-benzoquinone activity. The sensitivity to four out of five tested prodrugs was however unchanged. Turnover of the fifth prodrug, namely reduction of metronidazole, which yields a toxic product, was on the other hand increased in the evolved variants, and could be used as a counter-selectable marker. This was briefly tested showing the potential of such an application.

      Essential Revisions

      This study presents a wealth of data, and is well reasoned, carefully executed and clearly laid out. However, although it states that its aim was to study the evolution of a promiscuous function within the native host environment and thus under metabolic interference of the native substrate, this was not the approach taken. Instead, a fitness peak for the promiscuous function was identified through mass mutagenesis at eight positions followed by selection, and then two potential evolutionary paths leading from the wild type to this peak were inferred based on an analysis of all possible mutant combinations at the mutagenized positions. The authors need to make clear throughout the paper that the variants able to detoxify chloramphenicol were not evolved and did not arise against metabolic interference of the native substrate. This is an important point as the considerable potential of endogenous metabolites to shape evolutionary outcomes (Abstract) is purely inferred from the observation that the first mutation in both reconstructed evolutionary paths appears to have been a mutation at R225, which led to a substantial drop in the turnover rate of the endogenous substrate. From this the authors conclude (very prominently throughout the paper) that the evolution of a new activity is only possible after loss of activity against the original substrate.

      From the data presented, it is however not clear to what extent this conclusion is supported.

      1) According to the data in Figure 4 and Table S1, mutation of R225 alone is accompanied by a ~2-fold increase in kcat/Km for chloramphenicol. This seems to be sufficient to explain the ~2-fold increase in EC50 for chloramphenicol without invoking loss of quinone reductase activity. The control experiment in Figure 5, showing that substitution of R225 has no effect on most promiscuous activities of NfsA, also seems to indicate that the loss of native activity is not required for the evolution of chloramphenicol resistance. It would be important to determine the kinetic parameters of 1,4-benzoquinone reduction for NfsA and the purified R225V and R225D mutants in order to establish the loss of quinone reductase activity in the postulated first step of the evolutionary path. It would also be useful to study the effect of 1,4-benzoquinone competition on the chloramphenicol reductase activity of the mutants, at least the first ones along the proposed path, in order to show that they rapidly become insensitive to the native substrate.

      2) After the initial screening of the transformed library of NfsA variants, 0.05% of gene variants are reported to be more effective in chloramphenicol detoxification than the wild type. In the next steps, this number is reduced to the top 30 variants, as characterized by their improvement of chloramphenicol EC50 values (Fig. 1D). However, it is not clear from the presentation whether these observations were controlled for the expression levels of the different NfsA mutants. Protein variants are often expressed at different levels in vivo, which can have a significant effect on the activity measured. Fig. 1D was used for selection of the "best" variants for the rest of the study and to support this choice and the conclusions of the manuscript, relative enzyme expression levels should be reported (and if significantly different, should be corrected for). Such expression levels are reported later on for the 36_37 and 20_39 variants, but are missing at this early stage.

      3) While mutation of R225 appeared to be required for improved chloramphenicol detoxification in this study, the authors only considered the effects of substitutions at eight positions. This is probably the main weakness of the combinatorial mutagenesis approach used here. It seems plausible that substitutions at other positions could also increase chloramphenicol tolerance, possibly opening a path without loss of quinone reductase activity. If the authors were able to perform one round of error-prone PCR on NfsA with selection for improved chloramphenicol resistance and obtain mainly variants with substitution of R225, this would substantially strengthen their claim that evolution of increased chloramphenicol resistance can only occur through loss of quinone reductase activity.

      4) Even with additional experimental support for the main conclusion of the article, it seems fundamentally problematic to extrapolate from two instances to a general principle of evolution. The authors should tone down the claims that improved chloramphenicol detox activity is ONLY possible after elimination of the native activity and instead comment on the two characterized mutant pathways as examples of this phenomenon, within the limitations of the experimental setup.
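
      The competition experiment requested in point 1 can be framed with the textbook Michaelis-Menten expression for a substrate in the presence of a competing alternative substrate. The sketch below is illustrative only: it assumes simple competition at a single active site (the actual kinetic mechanism of NfsA may call for a fuller model), and every numerical value and function name is hypothetical rather than taken from the manuscript.

```python
def rate_with_competitor(substrate, kcat, km, enzyme,
                         competitor=0.0, km_competitor=1.0):
    """Michaelis-Menten rate for one substrate when a second substrate competes.

    Assumes simple competitive behaviour: the competitor raises the apparent Km
    by the factor (1 + [competitor] / Km_competitor). All units are arbitrary.
    """
    apparent_km = km * (1.0 + competitor / km_competitor)
    return kcat * enzyme * substrate / (apparent_km + substrate)


# Hypothetical parameters for a chloramphenicol-reducing variant.
rate_alone = rate_with_competitor(substrate=100.0, kcat=1.0, km=100.0, enzyme=1.0)

# Same variant with a hypothetical quinone competitor present.
rate_competed = rate_with_competitor(substrate=100.0, kcat=1.0, km=100.0, enzyme=1.0,
                                     competitor=10.0, km_competitor=10.0)

# A variant that has become insensitive to the native substrate would show
# rate_competed close to rate_alone; a sensitive one shows a clear drop.
print(rate_alone, rate_competed)
```

      Under these assumptions, a large Km for the quinone in an evolved variant would leave the chloramphenicol rate essentially unchanged by the competitor, which is the behaviour the reviewers ask the authors to demonstrate.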

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 11 2020, follows.

      Summary

      In this article, the authors sought to identify interacting partner(s) of Zc3h10, a transcription factor that activates expression of UCP-1 and other brown-adipocyte-specific genes. They identified the H3K79 methyltransferase Dot1L as an epigenetic modifier that interacts with Zc3h10 and facilitates its action on UCP-1 and other brown-adipocyte-specific genes. The strength of the manuscript is that the hypotheses were examined by a wide range of approaches, including protein-protein interaction assays, cell culture studies, and animal models. The present data provide solid evidence for an important role of Dot1L in the regulation of brown-adipocyte-specific genes. At the same time, all of the reviewers believed that the mechanisms involved in the epigenetic regulation of gene expression by Dot1L need to be investigated in more depth. Because the roles and the regulation of H3K79 methylation are still not fully understood, it would advance the field if the authors could provide additional data along these lines.

      Essential Revisions

      1) The protein expression of Dot1L appears disproportionately specific to brown adipose tissue compared with the mRNA expression. Is there a difference between protein and mRNA levels? Is the identity of the band correct? Can the authors see the loss of the band in the BKO tissues?

      2) Most of the experiments appear to have been performed under basal conditions (e.g. unstimulated cells, room temperature mice). Dot1L expression increases with cold exposure, and the authors' previous study showed that Zc3h10 is recruited to thermogenic gene promoters by p38 MAPK phosphorylation in response to adrenergic activation. The present study would benefit from additional experiments demonstrating a dynamic role for Dot1L in thermogenic gene transcription during cold exposure. Does cold exposure/adrenergic activation modulate Dot1L-Zc3h10 interaction, Dot1L recruitment to thermogenic gene promoters, H3K79 methylation, etc. in a p38 MAPK-dependent manner?

      3) The authors conclude that "these results show that Dot1L methyltransferase activity is required for thermogenic gene expression and other Zc3h10 target genes for the BAT gene program (line 175)" and that "Dot1L enzymatic activity is critical for activating the thermogenic gene program in vivo" (line 255). These statements are not yet strongly supported. The involvement of H3K79 methylation was implied through the use of a chemical inhibitor EPZ5676 (figure 2c and figure 3) and the reduction of H3K79me3 levels in whole-cell lysates (figure 3b). Can the authors confirm the change of H3K79 methylation at brown-adipocyte-specific genes by conducting ChIP-QPCR in the experiments where they modulate either expression or enzymatic activity of Dot1L? In addition the authors could compare the effect of the overexpression of wild-type Dot1L (figure 2d) with a mutant Dot1L in which critical residue(s) are substituted in the enzyme catalytic core.

      4) In Figure 4D, ChIP-qPCR shows the reduction of H3K79me2 and H3K79me3 enrichment on thermogenic gene promoters in Dot1L knockout BAT. It would be informative to examine changes in Dot1L and Zc3h10 binding on the same regions in BAT.

      5) Although not absolutely required, it would be ideal to look at global epigenetic changes following induction or inhibition of Zc3h10 and/or Dot1L. Of particular interest is whether Zc3h10 plays any role in tethering Dot1L and modulating H3K79 methylation at brown-adipocyte-specific genes and whether the epigenetic changes induced are specific to brown adipocyte function or not. The authors should consider performing ChIP-seq for H3K79me2/3 and Zc3h10. The use of tagged protein is an alternative if the antibody is not available for the latter. H3K79me2/3 reportedly marks a subset of enhancers (2019 Nat Commun, 10.1038/s41467-019-10844-3). It is also of interest whether such H3K79me2/3-marked enhancers are enriched in the vicinity of the brown-adipocyte-specific genes.

      6) The raw ATAC-seq data show very high background (figure 5A). How many peak calls were obtained? Improvement is necessary before the qualitative differences can be discussed. Please consider using cultured cells if the technical hurdle is caused by the use of tissue samples. Further, the pattern of the aggregate plots (figure 5A, left upper) does not obviously match that of the heatmaps (figure 5A, left lower). Please describe exactly what is shown in the figure. Are the scales the same for each panel, or do the heatmaps zoom in on the area indicated by the red dotted boxes in the aggregate plot? What do the red dotted boxes mean? What does the color scale bar in the middle heatmap mean?

      7) Also related to the ATAC data, Figure 5C suggests that loss of Dot1L alters chromatin accessibility at far more genes than is represented in the "Shared processes" pathway analysis. It would be helpful to see a more comprehensive analysis of the ATAC-seq data (is thermogenesis one of the top pathways, what other pathways are affected, is the Zc3h10 binding motif overrepresented at sites of increased chromatin accessibility, etc.) and discussion about why the ATAC-seq changes might be more general/less specific than the RNA-seq changes.
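
      One common way to ask whether a motif is overrepresented among differentially accessible sites, as raised in point 7, is a one-sided hypergeometric test. The following is a minimal sketch under that assumption; the function name and all peak counts are hypothetical and are not drawn from the manuscript, and a real analysis would also need to control for peak width, GC content and multiple testing.

```python
from math import comb


def motif_enrichment_p(total_peaks, motif_peaks, selected_peaks, selected_with_motif):
    """Upper-tail hypergeometric p-value: P(X >= selected_with_motif).

    total_peaks:          all ATAC-seq peaks considered (background)
    motif_peaks:          background peaks containing the motif
    selected_peaks:       peaks with increased accessibility
    selected_with_motif:  selected peaks that contain the motif
    """
    upper = min(motif_peaks, selected_peaks)
    favourable = sum(
        comb(motif_peaks, i) * comb(total_peaks - motif_peaks, selected_peaks - i)
        for i in range(selected_with_motif, upper + 1)
    )
    return favourable / comb(total_peaks, selected_peaks)


# Hypothetical counts: 1000 peaks, 100 carry the motif, 50 gain accessibility,
# and 15 of those 50 carry the motif (5 would be expected by chance).
p_example = motif_enrichment_p(1000, 100, 50, 15)
print(p_example)
```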

    1. Reviewer #3

      This paper reports that estrogen can accelerate mammary involution by exacerbating mammary inflammation, inducing programmed cell death, and promoting adipocyte repopulation; that the effects of estrogen on gene expression during mammary involution are mostly mediated by neutrophils; and that estrogen promotes mammary LM-PCD independently of neutrophils by inducing the expression and activity of lysosomal cathepsins and other pro-apoptotic markers such as Bid and Tnf. These findings are potentially interesting, and could expand the known functions of estrogen. However, mechanistic insight into these observations is lacking.

      I. The mechanism underlying estrogen-induced cell death needs to be further explored. For example, what kind of player(s) connect estrogen with cell death? Does TNF-alpha play a role in linking estrogen to cell death? Is there any enrichment of cell death genes associated with the estrogen treatment in the RNA-Seq data? Why were the artificial MCF-7/Caspase-3 cells used? The results with MCF-7/Caspase-3 cells showed that estrogen promoted TNF-alpha-induced apoptosis, rather than lysosome-associated cell death. Perhaps the authors should try MCF-10A cells as the model.

      II. Based on the data in Figure 4, it is not convincing to conclude that neutrophils are normally involved in adipocyte repopulation during mammary involution; please see also Issue #3. The authors need to reconsider the relationship between these data and the conclusion. Perhaps they should re-describe these results or modify the conclusion.

      III. The most interesting finding is that estrogen does not trigger similar biological actions in age-matched nulliparous mammary tissue. However, this study does not identify the molecular mechanism underlying the difference between the functions of estrogen in involuting and nulliparous mammary tissues. At the least, the authors should discuss the possible explanations.

      Other issues:

      1) Quantification in Figure 1B should indicate the fractions, for example, the number of cells relative to the total or to the area. The data in Figure 1C, except Csn2, were not described in the text, and these data should be associated with adipogenesis. As for Figure 1D, no description of Ly6G was presented; in fact, it was described only in the second part of the Results section. Supplemental Figure 2 was mentioned in the text before Supplemental Figure 1. The first part of the Results is very important for readers to understand the paper, but these problems are confusing.

      2) On Page 13, Line 219, "E2B treatment alone without the antagonist (E2B+DMSO) lead to an expected 1.57-fold increase (p=0.0082) in mammary neutrophils as compared to the Ctrl+DMSO". Should "1.57-fold" be "2.57-fold" or some other value? The stated value does not match the data in Figure 3Ci.

      3) In Figure 4B, upon neutrophil depletion, Cebpb and Cebpd were already increased, which could limit their further enhancement when treated with E2B. Adig and Egr2 also appeared to increase. The data in Figure 4D have the same problem as those in Figure 4B. No description of Figure 4E and 4F was found in the text. Overall, these data call into question whether estrogen-induced adipocyte repopulation is associated with the induction of adipogenic and tissue remodeling genes through neutrophils.

      4) On Page 19, Line 305, "This suggests that the up-regulation of Ctsb expression by E2B is a direct event independent of STAT3 activation". The data in Figure 5B cannot demonstrate that Ctsb expression is a direct effect of E2B. In Figure 5D, why did the lysosomal pellet fractions show no lysosomal proteins, such as cathepsins? In Figure 5A, at least the protein level of TNF-alpha should be measured, because it is very important for the functions of E2B, based on the data in Figure 6.

      5) In Figure 6, TNF induces p-STAT3, and Fig 5A shows that E2 induces TNF expression (mRNA), but p-STAT3 was not increased in Fig 5B. Increased mRNA does not necessarily mean increased protein. Please measure TNFa protein in Fig 5A (see Issue #4). The MCF-7/Casp3 model does not seem to support the conclusion well. The data in Figure 6 concern typical apoptosis, not the lysosome-associated cell death involved in the functions of estrogen as revealed in this study.

    2. Reviewer #2

      In this study, Chew Leng Lim et al determined the diverse effects of estrogen exposure on neutrophil infiltration, inflammatory responses, cell death and adipocyte repopulation in mouse models. While the authors revealed some new findings, this study suffers from obvious defects, including overdependence on the use of chemical inhibitors, lack of in-depth mechanistic investigation, and unfocused research topics.

      Major concerns:

      1) In addition to neutrophils, estrogen exposure also induced macrophage infiltration, while "neutrophil" depletion using anti-Ly6G antibody clearly reduced macrophage infiltration (Fig S1). Therefore, the role of macrophages in estrogen exposure-induced biological responses should be determined in depth.

      2) Since anti-Ly6G antibody also reduced macrophage infiltration significantly, it is very likely that macrophages play a pivotal role in estrogen exposure-regulated gene expression and cellular phenotypes. Therefore, the conclusion that 88% of estrogen-regulated genes are mediated through neutrophils is not solid. This point should be addressed by specific depletion of macrophages and neutrophils, individually.

      3) The source of CXCL1/CXCL2 upon estrogen exposure should be further investigated. In Fig 8, the authors indicated that CXCL1 and CXCL2 are produced by existing neutrophils. Further evidence to support this should be provided.

      4) Many biological and small molecule inhibitors, including anti-Ly6G antibody, PAQ (S100a9 inhibitor), CXCR2 antagonist SB225002, etc., have been used frequently in this study. However, the effects and specificity of some of these agents have not been well validated during the study. The use of genetic mouse models for critical signaling pathways is strongly recommended.

      5) In Figure 5, the critical role of CTSB in the activation of CTSD/CTSL and induction of LM-PCD upon E2B treatment should be validated by downregulation of CTSB expression pharmacologically or genetically.

      6) The data presented in Figure 7, based on analysis of a single cell line, are not reliable.

    3. Reviewer #1

      Dr. Valerie Lin and colleagues discovered some interesting links between estrogen signals and the differentiation and programmed death of mammary cells, as well as the formation of a pro-inflammatory microenvironment, which facilitate post-partum mammary involution and presumably parity-associated breast cancer. They also demonstrated that mammary gland-infiltrating neutrophils emerge as a major immune cell participant during this process.

      1) Dr Shengtao Zhou reported that ERβ has potent antitumor effects, which suppress lung metastasis by recruiting antitumor neutrophils to the metastatic niche. It is recommended to carefully address whether estrogen specifically targets ERα or ERβ on post-weaning mammary cells and infiltrating neutrophils in this setting.

      2) Perhaps a missing point is whether estrogen-induced mammary cell death subsequently causes inflammation (presumably augmenting neutrophil accumulation), since several cell death modalities have been associated with inflammation via the release of danger molecules.

      3) Besides lysosome-mediated PCD, can estrogen induce other cell death modalities, such as pyroptosis, necroptosis, ferroptosis, which are all caspase 3/7/8-independent? It is important to make a clear conclusion.

      4) Since the synthetic estrogen regulated the differentiation-associated genes of fat cells and accelerated LM-PCD, does estrogen affect lipid metabolic pathways? How does this metabolic remodeling affect cell death and differentiation of adipocytes and the function of neutrophils? By carefully going over the Seq data, the authors may be able to add more substantive discussion.

    4. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 30, 2020, follows.

      Summary

      Lin and colleagues aim to explore how estrogen promotes post-partum mammary involution and increases the risk of parity-associated breast cancer. Previous studies have unraveled different molecular mechanisms of mammary gland involution (see a previous summary PMID: 30448440). The authors highlighted that estrogen causes mammary involution by stimulating the accumulation of neutrophils, sculpting the pro-inflammatory microenvironment, facilitating adipocyte differentiation, and causing lysosome-related programmed death of mammary cells. These findings are interesting, but further efforts are needed to nail down some of the major conclusions and to clarify the underlying mechanisms.

      Essential Revisions

      1) A major concern is about several discoveries on neutrophils. The contribution of neutrophils during estrogen-induced mammary involution should be cautiously defined with solid experimental evidence. Do other immune cell populations, such as macrophages, actively participate in this process? Does Ly6G antibody efficiently deplete neutrophils, rather than masking the labeling of Gr1 antibody (for validation)? As neutrophils have a short half-life, secrete many inflammatory mediators, and are quickly replenished from BM progenitors, this point is important. The possible coordination between neutrophils and other immune cells (e.g. macrophages, monocytes), and their relative importance at different stages of mammary involution, can be examined and discussed (see a previous summary PMID: 24952477). In addition, more evidence is needed to prove whether the recruitment of neutrophils depends on a positive-feedback loop of CXCL1 and CXCL2. Quality controls are needed for the application of inhibitors.

      2) Another major concern is about estrogen-triggered mammary cell death. How do ERα, ERβ or death receptor-mediated signals contribute to this process? Does estrogen-induced programmed cell death exclusively rely on lysosome leakage and related effector molecules? Have the authors tested the existence of other cell death modalities? Does estrogen-induced cell death augment local inflammation and perhaps the accumulation of immune cell populations?

      3) The authors may consider rephrasing/weakening some of their claims and reordering the display of some of their results.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 22, 2020, follows.

      Summary

      This report provides significant new information about the mechanisms of neurosteroid enhancement and inhibition of GABA-A receptor (GABAR) function. This study builds on an earlier investigation by the same group (Chen et al. PLOS Biology 2019) showing that photoactive NS ligands can bind to three distinct sites on α1β3 GABARs - the canonical intersubunit site at the interface between the transmembrane domains (TMDs) of adjacent subunits and additional intrasubunit sites located within the TMDs of the alpha and beta subunits. In the current study, combining [3H]muscimol radioligand binding assays, site identification by photoaffinity labeling, and electrophysiological analyses of steroid modulation of wildtype and mutant α1β3 GABARs, the authors suggest that the overall functional effect of a given NS molecule is dependent upon which binding sites are targeted, with binding to the intersubunit site causing positive allosteric modulation (PAM), whilst occupancy of the intrasubunit sites appears to promote desensitization and negative allosteric modulation (NAM). Given the physiological significance of neurosteroids, elucidating how these structurally similar compounds can act as positive, negative or null modulators is clearly important.

      Essential Revisions

      1) The electrophysiological data presented (changes in steady state desensitization current magnitudes) are insufficient to conclude that NAM steroids inhibit GABAR function by stabilizing a desensitized state. Additional experiments such as co-application of agonist + NS and monitoring desensitization kinetics would be informative. Measuring the rate of recovery from agonist-induced desensitization in the presence of neurosteroids might also be helpful. While the data presented can be interpreted as changes in desensitization, the authors should discuss that alternative models are also possible. For example, it has been proposed that selectively stabilizing a pre-active state can result in changes in macroscopic desensitization (Gielen and Corringer, J. Physiol. 2018).

      2) Mutant receptors were not assayed for their sensitivities to agonist before measuring effects of neurosteroids. The functional assays and binding experiments need to be done at a consistent fractional EC value for each mutant construct being analyzed. For example, if the apparent Kd for muscimol has shifted substantially, the observed potentiation of muscimol binding by a neurosteroid will be artificially high or low. This is also true for experiments measuring neurosteroid potentiation/inhibition of functional activation by GABA.

      3) In the Results section, there are concerns about quantitatively comparing electrophysiology data and [3H]muscimol data (measured at different agonist concentrations and time periods). Are the methods reliable enough to infer that the small changes in Popen and Pdesensitized are real? In some cases, data are not shown. Inherent methodological limitations of two-electrode voltage clamping (e.g. slow ligand exchange) raise concerns that the authors are overinterpreting the data. As it stands, the comparison seems to be a bit of a reach and, in this reviewer's opinion, does not add significantly to the paper.

      4) While having three distinct sites for NS binding to GABARs does fit with aspects of the data, it is noteworthy that with the suggested model, there are three ligands that bind to all three sites, 3α5αP, KK148 & KK150, but each has a distinct functional profile: PAM, NAM via stabilizing desensitization, and competitive antagonist, respectively. This implies that divergence in function is dependent upon differential binding/efficacy at these three sites, presumably due to the ligand sitting in each site in a different orientation. While the observation from the [3H]muscimol binding experiments suggests that 3α5αP binds to the β3 intrasubunit site with lower affinity, the data presented in Fig 6B also suggest that binding of 3α5αP to the intersubunit and α1 intrasubunit sites works synergistically to increase muscimol binding. The reasoning is that with both sites intact, the Emax for muscimol binding is 374%, whereas mutating these sites individually causes similar decreases in Emax (to 159% and 146%). This implies an allosteric interaction between these binding sites, a conclusion which the authors also reach in their previous publication (Chen et al 2019). This makes the effects of mutations in these two sites (and possibly also the beta intrasubunit site) difficult to interpret, and difficult to use to attribute a mutation's effects on NS actions to binding at one particular site. The authors need to thoroughly discuss this concern/limitation.

      5) The demonstration that steroids apparently enhance [3H]muscimol binding affinity without changing the number of sites (Fig 6 supplement 1) is in contrast to past reports from multiple labs that [3H]muscimol binding (to brain membranes) is characterized by high and low affinity components and that steroids and other GABAR positive allosteric modulators increase the number of high affinity sites with little effect on their binding affinity. Please discuss. In addition, we would like to see presented in supplementary material representative experimentally determined [3H]muscimol binding curves (total and non-specific vs [3H]muscimol concentration, not just the calculated Bspec of fig6 supp fig 1). In their methods (p.25) the authors say that they determined [3H]muscimol binding isotherms from 0.3 nM to 1 uM [3H]muscimol at a radiochemical specific activity of 2 Ci/mmol. It is surprising that they can go to micromolar concentrations with such small uncertainties, and this is crucial to their claim that steroids produce only shifts of affinity, not shifts of Bmax.

      6) [3H]muscimol binding is measured on cell homogenates over a time scale of hours. There seems no reason to "infer" that 3α5αP increases [3H]muscimol binding by stabilizing an active state while 3β5αP stabilizes a desensitized state. By my reading, the previous studies (13,14) report that the αV256S mutation removes the "inhibitory" effects of sulfated steroids and 3β5αP, not the "desensitizing" effects; this should be more clearly articulated in this manuscript. This report would be strengthened by avoiding unnecessary overinterpretation and leaving it for future studies to determine whether there is any measurable quantity of receptors in an active state under the conditions of the [3H]muscimol equilibrium binding assay.

      7) Given the expectations that some of the neurosteroids stabilize a desensitized state, do they "fit" in the proposed intrasubunit sites in, for example, one of the published presumed desensitized-state structures of the α1β3γ2 receptor?

      8) A more thorough discussion of why recently solved GABAR structures have not resolved intrasubunit neurosteroid binding sites is warranted.
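
      The request in point 2 to test each construct at a consistent fractional EC value can be illustrated with the Hill equation. The sketch below is only an illustration under the assumption that each construct's agonist concentration-response curve is well described by a Hill function; the EC50 values, Hill coefficients and construct names are hypothetical and are not taken from the manuscript.

```python
def hill_response(conc, ec50, hill_n=1.0):
    """Fractional response (0..1) predicted by the Hill equation."""
    return conc ** hill_n / (ec50 ** hill_n + conc ** hill_n)


def conc_for_fraction(fraction, ec50, hill_n=1.0):
    """Agonist concentration that gives a chosen fractional response.

    Obtained by inverting the Hill equation:
    conc = EC50 * (f / (1 - f)) ** (1 / n)
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill_n)


# Hypothetical constructs: (EC50 in micromolar, Hill coefficient).
constructs = {
    "wild type": (3.0, 1.5),
    "mutant A": (30.0, 1.5),
}

# Concentration that would be needed for an EC10-equivalent test in each construct.
for name, (ec50, hill_n) in constructs.items():
    ec10 = conc_for_fraction(0.10, ec50, hill_n)
    print(f"{name}: test at {ec10:.3g} uM agonist for a 10% response")
```

      Testing both constructs at the same absolute agonist concentration would, under these assumptions, place them at different points on their curves, which is the confound the reviewers describe.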

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 25, 2020, follows.

      Summary

      The manuscript by Fournier et al. highlights the importance of acetylation in the ChAM domain of PALB2 in regulating nucleosome binding and DNA repair. The text is well written and the experiments are well designed. We read the manuscript, the reviews from Review Commons, as well as the rebuttal and plans for a revision. We believe the revision and the proposed added experiments will be needed to cement the conclusions.

      Of the 5 experiments, #1 and #2 are critical, and we believe #4 and #5 are also important, as BRCA1 is a key factor for PALB2 (#5) and the effect on HR (#4) should be documented experimentally. We do not consider experiment #3 (PALB2 foci) critical. We encourage the authors to plan and execute this revision as they outlined, with the exception of experiment #3. The authors may want to consider combining KAT2A/B depletion and/or using KDAC inhibitors in the experiments with the 7R and 7K mutants, but we leave that decision to them.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 22, 2020, follows.

      Summary

      The revised paper presents a better-fitting analysis, and does a more nuanced job in discussing the results than the original manuscript. However, there are still a few major criticisms that we have for the analysis, detailed below.

      Essential Revisions

      1) Brain-wide, multiple-comparison corrected tests comparing auditory versus visual decoding are still lacking. The authors have now provided vertex-wise Bayes factors within areas that showed significant decoding in each individual condition. Unfortunately, this is not satisfactory, because these statistics are (1) potentially circular because ROIs were pre-selected based on an analysis of individual conditions, (2) not multiple-comparison corrected, and (3) rely on an arbitrary prior that is not calibrated to the expected effect size. Still, ignoring these issues, the only area that appears to contain vertices with "strong evidence" for a difference in neuro-behavioral decoding is the MOG, which wouldn't really support the claim of "largely distinct networks" supporting audio vs. visual speech representation.

      The authors may address these issues, for instance, by (i) presenting additional whole-brain results, e.g. for a direct comparison of auditory and visual classification (in Figure 2) and of perceptual prediction (in Figure 3); (ii) presenting voxel-wise maps of Bayesian evidence values (as in Supplementary Figure 3) for the statistical comparisons shown in Figure 2D and Figure 3D; and (iii) making clear, in the text included in Figure 2D and 3D, which hypotheses correspond to the null hypothesis and to the alternative hypothesis (i.e. auditory = visual, auditory ≠ visual).

      2) As noted before, the classifiers used in this study do not discriminate between temporal versus spatial dimensions of decoding accuracy. This leaves it unclear whether the reported results are driven by (dis)similarity of spatial patterns of activity (as in fMRI-based MVPA), temporal patterns of activity (e.g., oscillatory "tracking" of the speech signal), or some combination. As these three possibilities could lead to very different interpretations of the data, it seems critical to distinguish between them. For example, the authors write "the encoding of the acoustic speech envelope is seen widespread in the brain, but correct word comprehension correlates only with focal activity in temporal and motor regions," but, as it stands, their results could be partly driven by this non-specific entrainment to the acoustic envelope.

      In their response, the authors show that classifier accuracy breaks down when spatial or temporal information is degraded, but it would be more informative to show how these two factors interact. For example, the methods article cited by the authors (Grootswagers 2017) shows classification accuracy for successive time bins after stimulus onset (i.e., they train different classifiers for each time bin 0-100 ms, 100-200 ms, etc.). The timing of decoding accuracy in different areas could also help to distinguish between different plausible explanations of the results.

      Finally, it is somewhat unclear how spatial and temporal information are combined in the current classifier. Supplemental Figure 5 creates the impression that the time-series for each vertex within a spotlight were simply concatenated. However, this would conflate within-vertex (temporal) and across-vertex (spatial) variance.

      3) The concern that the classifier could conceivably index factors influencing "accuracy" rather than the perceived stimulus does not appear to be addressed sufficiently. Indeed, the classifier is referred to as identifying "sensory representations" throughout the manuscript, when it could just as well identify areas involved in any other functions (e.g., attention, motor function) that would contribute to accurate behavioral performance. This limitation should be acknowledged in the manuscript. The authors could consider using the timing of decoding accuracy in different areas to disambiguate these explanations.

      The authors state in their response that classifying based on the participant's reported stimulus (rather than response accuracy) could "possibly capture representations not related to speech encoding but relevant for behaviour only (e.g. pre-motor activity). These could be e.g. brain activity that leads to perceptual errors based on intrinsic fluctuations in neural activity in sensory pathways, noise in the decision process favouring one alternative response among four choices, or even noise in the motor system that leads to a wrong button press without having any relation to sensory representations at all."

      But it seems that all of these issues would also affect the accuracy-based classifier. Moreover, it seems that intrinsic fluctuations in sensory pathways, or possibly noise in the decision process, are part of what the authors are after. If noise in a sensory pathway can be used to predict particular inaccurate responses, isn't that strong evidence that it encodes behaviorally-relevant sensory representations? For example, intrinsic noise in V1 has been found to predict responses in a simple visual task in non-human primates, with false alarm trials exhibiting noise patterns that are similar to target responses (Seidemann, E., & Geisler, W. S. (2018)). Showing accurate trial-by-trial decoding of participants' incorrect responses could similarly provide stronger evidence that a certain area contributes to behavior.
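
      The time-resolved analysis suggested under point 2 (a separate classifier for each successive time bin) can be sketched as follows. This is a toy illustration on synthetic data, not the authors' pipeline: the nearest-centroid classifier, the leave-one-out scheme, the bin width and all function names are assumptions chosen to keep the example self-contained. Averaging within each bin deliberately leaves only the across-sensor (spatial) pattern as input to each classifier.

```python
import random


def bin_features(trial, start, stop):
    """Average each sensor's samples within one time bin (one value per sensor)."""
    return [sum(sensor[start:stop]) / (stop - start) for sensor in trial]


def centroid(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(vectors[0]))]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        centroids = {lab: centroid([f for f, l in train if l == lab])
                     for lab in sorted({l for _, l in train})}
        predicted = min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))
        correct += predicted == y
    return correct / len(labels)


def time_resolved_accuracy(trials, labels, bin_width):
    """Train and test one classifier per consecutive, non-overlapping time bin."""
    n_samples = len(trials[0][0])
    return [loo_accuracy([bin_features(t, s, s + bin_width) for t in trials], labels)
            for s in range(0, n_samples - bin_width + 1, bin_width)]


# Synthetic data: 20 trials x 4 sensors x 15 samples; the two classes differ
# only during samples 5..9, i.e. only in the second of three bins.
rng = random.Random(0)
trials, labels = [], []
for k in range(20):
    label = k % 2
    trial = [[rng.gauss(0.0, 1.0) for _ in range(15)] for _ in range(4)]
    if label == 1:
        for sensor in trial:
            for t in range(5, 10):
                sensor[t] += 10.0
    trials.append(trial)
    labels.append(label)

accuracy_per_bin = time_resolved_accuracy(trials, labels, bin_width=5)
print(accuracy_per_bin)
```

      In real data, the latency at which each region becomes decodable could help separate early sensory encoding from later decision- or motor-related activity, as the reviewers suggest.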

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 16, 2020, follows.

      Summary

      Using a clever genetic system in the budding yeast Saccharomyces cerevisiae, the authors test whether R-loops can form in trans, meaning that a transcript from locus A could lead to R-loop formation in locus B. Moreover, they test whether R-loop formation is dependent on Rad51, the eukaryotic RecA family recombinase. Using their genetic system and cytological analysis of Rad52 foci and the S9.6 antibody to detect R-loops in wild type and strains with mutations known to affect R-loops, conclusive data are shown that R-loops only form in cis and that R-loops in this genetic system are independent of Rad51. Overall, this work significantly enriches the discussion in the R-loop field and provides an alternative viewpoint to an earlier publication that suggested R-loop formation in trans being catalyzed by Rad51.

      Essential Revisions

      1) The pGal promoter induces very high transcription in the presence of galactose (often 500X or more induction). The level is likely very different (much less) for the tet promoter, which is generally only induced 2-3X upon addition of doxycycline. This could significantly affect the results - e.g. the cis vs trans effects could really be a matter of different transcription levels. Transcription levels from each promoter really need to be determined; this is a very important control. The exact induction conditions used, including concentrations and induction times, need to be spelled out in the methods and should be consistent with those used during the RT-PCR experiment to test transcript levels. If the experiment cannot be done on the constructs used (which would be optimal), the authors could at least cite previous work from this lab that used the same promoters and induction conditions, and insert a caveat if transcription levels are different. It would also be good to switch the promoters and make sure the result holds, as there could be issues of differences in timing of transcription as well.

      2) In Figure 2 the authors relate recombination frequencies in their assays to RNA:DNA hybrid formation without measuring hybrids directly. This is a major weakness that significantly limits data interpretation. For instance, I am very surprised that the "cis" recombination frequency of the inverted LacZ reporter is essentially as high as that of the regular lacZ construct. This result implies that hybrid formation is insensitive to the orientation of the reporter when in many reported cases, R-loop formation is strongly orientation-dependent. Of course, another hypothesis is that (stalled) transcription itself triggers recombination, not R-loops. Without data on R-loop formation, one cannot disentangle transcription from co-transcriptional R-loop formation. The authors must use a DRIP-based assay to quantify R-loop levels in the various sequence contexts and under the various genetic backgrounds to establish that their assay is reflective of R-loop levels. Using bisulfite-based readouts to measure R-loop distributions and lengths across the LacZ region would be even better. Without these data, the claim that this new genetic assay can "infer the formation of recombinogenic DNA:RNA hybrids" is unsubstantiated.

      3) Source data. The source data file should be labeled better. Missing are:

      • what the numbers in the table are (rates of Leu+ x 10^-4?)
      • which data goes with which Figure panel
      • average and SEM numbers should be shown in the data table

      The exact p values are not reported and could be added to the source data file. N values can be discerned from the source data file but it would be nice for them to be stated in the figure legends.

    1. This manuscript is in revision at eLife

      The manuscript was reviewed by Review Commons. eLife's decision letter, sent to the authors on April 18, 2020, follows.

      Summary

      In this manuscript, the authors study the transcriptional regulation of HOXA9, a transcription factor that plays a central role in homeostasis of immature hematopoietic cell types and in the development of leukemia. They use the CRISPR/Cas9 technique to introduce a fluorescence reporter cassette into the endogenous HOXA9 locus of a human MLL/AF4-rearranged B-ALL cell line. After validating this engineered cell line, they perform multiple genetic screens to identify potential transcriptional regulators of HOXA9 and to delineate essential transcription factors in this cell line. They identify USF2 as a new transcription factor that modulates expression of HOXA9.

      Major Revisions

      If the authors can commit to adding the following data, as they indicate in their rebuttal, the manuscript would be greatly strengthened and could be considered acceptable:

      1) The authors should include their data on the independent loss-of-function CRISPR transcription factor screen in the SEM HOXA9-P2A-mCherry MLLr reporter line ectopically expressing HOXA9-MEIS1 to overcome the possibility that key regulators could be missed in the CRISPR/Cas9 screen due to survival dropout.

      2) The authors should include supporting data for the key observations in the manuscript in other cell lines; for example, as indicated by the authors, the data gathered using an additional HOXA9 MLLr AML reporter cell line established in OCI-AML2 cells to further support findings from the initial SEM MLLr ALL reporter line.

      3) The authors should perform USF2 knockout experiments in multiple non-MLLr cell lines according to the reviewer's suggestions. As an example, the authors should repeat the competitive proliferation assay to determine the effects of the single knockout of USF1 and USF2 vs the double KO in SEM cells and other MLLr leukemia cell lines with proper controls.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 7, 2020, follows.

      Summary

      This manuscript presents a new agent-based model of coral reefs that is designed to answer questions about the response of coral reefs to multiple stressors in a mechanistic, bottom-up way. The model uses traits and functional types of corals and algae to represent not only taxonomic but also functional diversity. The manuscript includes a very impressive description of the design, calibration and testing of a coral reef model. The authors have used the ODD protocol (to some degree), calibration of 12 model parameters for three empirical locations in the Caribbean, hierarchically structured validation, and global sensitivity analysis. Spatial interactions between corals and algae are represented in detail, which makes it possible to analyze relations between traits and functional responses and thus to depict realistic trajectories of reefs under different scenarios of external forcing.

      Agent-based models are often criticized because of their complexity, which makes them difficult to parameterize, calibrate, test, and understand. This manuscript is an impressive demonstration of how it is possible to combine all relevant existing data in a systematic way, test a model at multiple levels, and thus demonstrate that, yes indeed, trait-based agent-based models allow us to model the role of diversity (see also this review: Zhakarova et al. 2019).

      Essential Revisions

      1) The Introduction takes a lot of space in discussing challenges to coral reefs. I guess virtually all papers about coral reefs start like this. It should be shortened, also because it raises the expectation that you are going to tackle these questions, which is not the case. Rather, this is a methods paper and you should come to this point more directly and perhaps list the challenges to ABMs for exploring diversity (see above) as the key challenge addressed in this manuscript.

      2) If you say, in the Abstract, that the model "provides a virtual platform": Where can we download the software? Is there a manual describing the workflow needed for running the model and all its data scripts? Is the model description in the supplements complete? If not, this article would not really provide a tool. You might have a look at two examples where ABMs were presented, in journal articles, as tools. In both cases there was a full model description, a manual, and a download site: Becher et al. (2014) and Hradsky et al. (2019).

      3) Section 2.1: It is impressive to see all those packages and tools you used, but, ideally, you would also provide all, or the most important, scripts you wrote to run these packages and tools. If others are to use your virtual laboratory, they very likely would fail immediately because they would not know how to actually handle all those tools and data sources. I know that there is no culture yet to provide all relevant scripts, but I think we should go there.

      4) The ODD model description in the main text is not bad, but it is just a verbal summary description, while the intention of ODD is to provide all information that is needed to re-implement the model. I understand that many of these details are in the Supplement, e.g. about Initialization and Submodels? It would be good if this link were made more explicit by having a full ODD in the supplement, as a separate file. It would contain an augmented copy of the ODD of the main text and then just provide, in all detail, the information required for the seven elements of ODD. Why? Because the point of a standard is to follow it exactly so that readers, who either know the standard or learn about it, can easily find certain kinds of information at certain places in the model description. Currently, finding relevant information is made unnecessarily complex. Examples of complete ODDs of complex models are provided by Ayllón et al. (2018) and Nabe-Nielsen et al. (2019).

      For producing a complete ODD, please note that a new version of ODD has been published, which in particular has very detailed guidance, in the supplement, about ODD itself, summary ODDs, model narratives, etc.: Grimm et al. (2020). All that said, please note that we certainly do not require that you use ODD (because I am the main proponent of ODD), but any format that compiles all information needed, so that it is easy to find the kinds of information listed in the ODD protocol, would be acceptable.

      5) Scales: The model applications relate to a space of 5 × 5 m (25 m²). I am not sure if such a small space allows for realistic dynamics if single corals grow large (> 2-3 m diameter), as then only a very low number of individuals would be present in the simulations, potentially leading to artifacts in the results. It is a pity that the spatial output of the model is not shown (except one specific figure in S5). I also see a discrepancy between the very high spatial resolution (1 cm) and the low temporal resolution (6 months). The time span within half a year could, e.g., cover a mild bleaching event or other disturbances as well as processes of reef recovery leading to a different species composition, and thus change the reef trajectory without being considered in the present model. I do not find it a convincing argument that the field data are only available at a low resolution of approx. 6 months. A comparison between field data and model output remains possible even if the model is run at a higher temporal resolution.

      6) It is apparent that all model runs cover only a very short time span of around ten years (21 simulation steps). This is extremely short for coral reefs which frequently undergo dynamics based on larger time scales. Thus, emerging dynamics and states, e.g., resulting from the sensitivity analysis, should be discussed with much care.

      7) Overfitting? The model is very impressive, as it is possible to represent the dynamics of measured reefs very closely. However, I am not sure if this actually results from some overfitting. The model (runs) include some very strong and very specific influences of external drivers. For example, at the end of a time step certain values for grazing or sand cover are enforced. At least the impact of grazing results from a feedback with different reef processes. Thus, at least much of the trajectories in the model are the result of external drivers and it becomes difficult to analyze self-organization processes in the reef. In short: you cannot claim that a model is producing realistic dynamics due to a realistic representation of its internal organization if in fact the match between model output and observations is imposed by external drivers. A similar case occurred with honeybee colony models, where often the yearly time series of colony size was compared to data to claim that the model was realistic, but that time series was largely driven by the time series of the queen's egg-laying rate (Becher et al. 2013).

      8) A major question thus is whether the authors believe that their model can better address large scale questions about coral reefs, such as their resilience to regime shifts from disturbances and climate change, than 'minimal' models, such as that of van de Leemput et al. (2016)?

      9) In Carturan, Parrott, and Pither (2018) coral functional traits are classified as 'resistance' and 'recovery'. In the current manuscript, the terms 'stress tolerant', 'ruderal', and 'competitive' species (Grime's classification) are used. Do 'resistance' species and 'recovery' species of Carturan et al. (2018) correspond to 'stress tolerant' and 'ruderal', respectively?

      10) The Title is suboptimal: "mechanistic" and "spatially explicit" apply to hundreds of models, if not more, including coral reef models. The novelty of your work lies in merging the individual-based and trait-based approaches to represent functional diversity. The title should reflect this (but please observe eLife's guidance on titles).

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 11, 2020, follows.

      Summary

      The muscle spindle is one of the most thoroughly studied sensory receptors in the somatosensory system, yet much is still unknown about how it works. Commendably, the authors have attempted to model the responses of spindle sensory afferents using a biophysical model of intrafusal muscle fibers. The model was shown to mimic experimentally recorded afferent activity in a number of situations. Indeed, it is encouraging to see attention being paid again to the elegant complexities of spindle receptors after years of over-simplification in control models. Nevertheless, there are concerns (detailed in the essential revisions below) about those aspects that were left out.

      Essential Revisions

      1) The assumption that extrafusal muscle force can serve as a proxy for intrafusal fiber force needs to be fully addressed. Indeed, there are well known situations for which an assumed correspondence between extrafusal and intrafusal forces would seem to fail to reproduce experimental results. For example, the classical experimental signature used to identify Ia afferents is a cessation in their discharge during an evoked twitch in the extrafusal muscle fibers. Likewise, the model would seem to fail to reproduce spindle afferent responses during imposed length changes with and without concomitant homonymous extrafusal muscle contractions (e.g. Elek, Prochazka, Hulliger, Vincent. In-series compliance of gastrocnemius muscle in cat step cycle: do spindles signal origin-to-insertion length? J. Physiol., 429, 237-258, 1990). The authors need to include additional simulations of these fundamental experimental phenomena and to fully address the outcome in the Discussion.

      2) The authors suggest that their model provides a unifying biophysical framework for understanding muscle spindle activity, yet there was little attention paid to how intrafusal force or yank is transduced into a receptor potential. Such a unifying framework would need to include mechanisms of transduction by mechanically-gated ion channels. As such, the role that sensory transduction mechanisms play in shaping spindle afferent activity needs to be addressed - either in the model or in the Discussion.

      3) The role that intrinsic properties and associated time-varying conductances (e.g. such as those underlying spike-frequency adaptation) in muscle spindle afferents may play in influencing firing dynamics needs to be addressed in the model or in the Discussion.

      4) There needs to be more clarity in the description of the model and what aspects of the model were original and what aspects were based on previous work, for example, that of Campbell et al. (2014) and MyoSim.

      5) The simulated response (i.e. the driving potential) of the biophysical model depicted in Figure 6A to repeated triangular length changes (without pauses) does not resemble the experimental firing rate data to repeated triangular length changes shown in Figure 2B. In particular, the model exhibits marked abbreviation of the responses to the 2nd and 3rd length changes that are not evident in the experimental data of Figure 2. This disparity between experimental and simulated findings needs to be discussed.

      6) Any general model that aims to account for the activity of spindle afferents during natural activities must account for the well-documented independence among alpha, gamma dynamic and gamma static activation patterns and kinematics, whose different effects on Ia activity have been simulated, measured or inferred in a variety of experiments and integrated into previous models (see Mileusnic et al., 2006). The Discussion should identify which scenarios have not been simulated and which might be problematic for their general thesis.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on April 8, 2020, follows.

      Summary

      Heissenberger et al. study how NSUN-1 impacts rRNA methylation and health in nematodes. Eukaryotic ribosomal RNAs undergo several modifications. Among these, there are two known m5C, located in highly conserved target sequences. Previous work from the authors characterised the mechanism underlying one of these modifications in worms (C2381), as well as its functional consequences on cellular and organismal homeostasis. The current work focuses on the second m5C, at position C2982, and identifies NSUN-1 as the putative rRNA methylase. This is a novel and potentially exciting finding.

      Using RNAi in two worm strains, the authors show that upon knocking down NSUN-1 expression, the C2982 m5C level is partly (not entirely) reduced. This assay proves sufficiency (but not necessity) of NSUN-1 to reduce m5C levels at C2982. While it is not clear why the authors do not use a complete knockout of NSUN-1 (is it lethal?), follow-up work using RNAi explores the phenotypic effects of lowered NSUN-1 levels.

      While somatic and germline reduction of m5C levels does not have an impact on worm lifespan, it does increase resistance to heat stress and slightly increases motor activity. Reducing NSUN-1 expression separately in germline and soma allegedly increased lifespan. Somatic reduction of NSUN-1 leads to changes in body size, oocyte maturation and fecundity, and has no effect on global protein translation. Analysis of polysome enrichment for specific mRNAs revealed that worms with low levels of NSUN-1 have altered translation of transcripts involved in cuticle collagen deposition.

      Major Points

      1. We are unconvinced by one of the major claims of this work, which is that C2982 has an impact on worm lifespan when expression is down in the soma. This claim does not seem to be strongly supported by the results shown. Were the replicates analysed separately or data from different assays pooled? Median lifespan appears the same between wt and RNAi worms. The survival raw data should be made available for reanalysis.
      2. It is not clear whether deletion mutants for NSUN-1 (e.g. nsun-1(tm6081)) are viable in C. elegans and if yes, what is their phenotype in the context of this study. If the deletion mutant is not available, can the authors generate a CRISPR line?
      3. Is there a relationship between the mRNAs selectively translated in the NSUN-1 RNAi treatment and in the NSUN-5 RNAi/mutant?
      4. The results shown in figure 1 draw a causal connection between NSUN-1 activity and C2982 based on exclusion: in other words, both NSUN-1 and NSUN-5 depletion lower the m5C peak by over 50%. Hence, since there are two m5C sites and one is written by NSUN-5, the other one must be written by NSUN-1. Is it possible that NSUN-1 may not be the only C2982 writer? Can the authors comment on this?
      5. Figure 4 analyzes the gonad and oocyte maturation. While the images are very convincing, it would be good to know how penetrant the phenotype is after analysis of a larger number of animals in each group.
      6. It is unclear how the observed translational remodeling that affects collagen deposition (demonstrated through the gonad extrusion and cuticle barrier phenotypes) is linked to oocyte maturation, or to heat stress resistance.
      7. The authors should indicate how many times the HPLC experiments were done.
      8. In Figure 3 the authors should indicate on each panel the age of the worms and at which stage the RNAi treatment was performed.
      9. The overall claim about behavior should be toned down as the RNAi line shows no overall improvement; only one time point shows a difference among the groups. From the text it is not clear what statistical test was used to analyze the differences in behavior among the groups.
      10. Although it may be hard to downregulate rRNAs by RNAi since they are so highly expressed, can the authors comment on whether 26S rRNA levels are reduced after RNAi and if yes to what degree?
      11. While the authors write that rrf-1 is required for amplification of the dsRNA signal specifically in the somatic tissues, this may not be completely accurate, as the Kumsta et al 2012 paper shows that rrf-1 affects both the soma and the germline. How does this affect the interpretation of the results?
      12. Is there a chance that 26S rRNA expression or differential methylation have a tissue-specific pattern (you use RT-qPCR from whole worms)?
      13. May NSUN-1 have pleiotropic effects independent of C2982 m5C?
    1. Reviewer #2

      The manuscript by Seong et al. describes the development of a sophisticated activity monitoring system that is able to determine, with great accuracy, the timing of major life stage transitions during Drosophila development. Specifically, the system relies on time lapse imaging and A.I. based learning to pinpoint three transitions: 1) larval to pupal, 2) pupal to adult, and 3) adult to death. The basic principle is to establish the location of a larva, pupa or adult at each time point within either a 96 or 384 well plate and then determine if it has changed at the next time point. Since larvae and adults are motile and pupae and dead flies are not, it is conceptually easy to identify the stage transitions by monitoring location changes. The authors demonstrate that the system works, at least for the W1118 genetic background, and is able to replicate known developmental characteristics such as the fact that females typically eclose a few hours before males and that the timing of the larval to pupal transition is diet (sugar) sensitive. They also demonstrate that it can be used to establish Kaplan-Meier lifespan curves which are capable of distinguishing environmental effects on adult lifespan such as the presence of DDT or paraquat in the food. Overall this system appears to have great potential for quantitatively measuring a number of developmental parameters that are presently very tedious to determine manually and are therefore not amenable to the high throughput procedures that are needed for genetic and drug screening.
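
      The location-change principle summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(x, y)` track format and the distance threshold are all assumptions made for illustration. Under these assumptions, the last detected movement of a larva (or adult) approximates pupariation (or death), and the first detected movement after pupal stillness approximates eclosion.

```python
from typing import List, Optional, Tuple

# Hypothetical track format: one (x, y) position per time-lapse frame.
Position = Tuple[float, float]


def moved(a: Position, b: Position, threshold: float) -> bool:
    """Return True if the Euclidean distance between a and b exceeds threshold."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return (dx * dx + dy * dy) ** 0.5 > threshold


def last_movement_index(track: List[Position], threshold: float = 1.0) -> Optional[int]:
    """Index of the last frame whose position differs from the previous frame.

    Returns None if the animal never moves. For a larva this approximates
    pupariation; for an adult it approximates death.
    """
    last = None
    for i in range(1, len(track)):
        if moved(track[i - 1], track[i], threshold):
            last = i
    return last


def first_movement_index(track: List[Position], threshold: float = 1.0) -> Optional[int]:
    """Index of the first frame whose position differs from the previous frame.

    Returns None if the animal never moves. For a pupa this approximates eclosion.
    """
    for i in range(1, len(track)):
        if moved(track[i - 1], track[i], threshold):
            return i
    return None
```

      A sketch like this also makes the concerns raised in the reviews concrete: a sleeping adult, a dead larva, or an animal hidden from view produces the same "no movement" signal as a genuine stage transition, so a simple threshold alone cannot tell these cases apart.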

      I do not feel competent to comment on the software development and AI procedures used to train the system other than to say that they appear to work quite reliably as long as the optics are not disturbed. Herein lies the biggest disappointment.

      1) The authors conclude their Results section by saying that they cannot reliably measure lifespan in common strains such as Oregon R and Canton S because of accidental death effects due to such issues as water condensation in the wells and also due to blockage of the optical light path by the spread of food particles and feces on the well lid that obscures detection of the fly's position during imaging. The authors say that additional refinements of the system will be needed to overcome these challenges for adult lifespan analysis. I wonder, however, if the authors have tried something as simple as replacing the lid of the microtiter dish at some frequency during the lifespan measurements. I recognize that the entire chamber will need to be immersed in a CO2 chamber or cooled to knock the flies out and that this may influence the lifespan kinetics, but have the authors attempted anything like this as a workaround to the degenerating light path and water accumulation issues during aging studies?

      Despite this drawback, I think the system still has significant utility for assaying environmental and genetic effects on larval to pupal and pupal to adult transitions, and this makes it worth communicating to the Drosophila research community.

    2. Reviewer #1

      The paper entitled "The Drosophila Individual Activity Monitoring and Detection System (DIAMonDS)" highlights a new detection/tracking system which utilizes a flatbed CCD scanner to track and identify multiple life cycle events (pupariation, eclosion, and death) using a newly developed algorithm. In support of this novel monitoring system, the authors provide multiple examples of the tracking system in action, including analysis of larval and adult movement and the detection of pupariation and eclosion at a high temporal resolution. The authors also provide several examples of more complex experiments which can be accomplished in a high-throughput manner using DIAMonDS, including lifespan and stress resistance assays. As described, this system would provide a researcher with an automated tool for measuring the timing of multiple major developmental milestones in Drosophila development, essentially allowing for more accurate and less labor intensive observation.

      While DIAMonDS is certainly valuable in its current incarnation, the authors do note a number of worrying limitations which I believe should be resolved prior to publication, and there are several areas of the manuscript where I believe more detail is warranted.

      Major Critiques

      1) As DIAMonDS detects changes between the static and active phases in the Drosophila lifecycle through changes in motion, it is essential that the authors demonstrate (or provide an explanation of) how they discriminate between less motile stages of development (or death) and normal cessation of motion while alive, such as in grooming or sleep behavior in the adult.

      2) Similar to critique #1, this type of motion detection may not be as effective in animals with some form of locomotor defect. An additional experiment demonstrating that DIAMonDS can reliably detect and classify larvae or adults with reduced locomotion is prudent to demonstrate that it can work, even if the flies are impaired in some way.

      3) It is currently unclear how the DIAMonDS system handles events that occur "off-camera", as can be observed in frames 77-79 of supplementary video #1. This may be a potential source of error during tracking; for example, if a larva crawls into an area where it is not observed and pupates, or if an animal dies out of view.

      4) It is unclear how (or if) the system would discriminate between pupation and a dead larva. A failure to account for this could easily result in a false-positive for pupation.

      5) Line 198: I was unable to locate a description of the semi-automatic (TH) methods presented here in the materials and methods section.

      6) The inability of the methods described in the manuscript to handle Oregon-R or Canton-S strains significantly limits the usefulness of the system. A set of optimal conditions for common laboratory strains should be included with the manuscript.

      7) As admitted by the authors, the size of the wells may adversely affect fly health. It would be worthwhile to see a comparison between the smaller chambers and a larger chamber, so as to allow for future users of the system to make more informed decisions about how to implement it.

    3. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on July 3, 2020, follows.

      We encourage you to carefully read the critiques and address all the issues that are listed below.

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on June 30, 2020, follows.

      Summary

      In this manuscript, findings from tomographic datasets of 10 C. elegans meiotic spindles from metaphase and anaphase (early, mid, and late) spindles (6 MI and 4 MII) are presented. The focus of the manuscript is on the observation that the transition from metaphase to anaphase involves a significant reorganization of the structure in which the number of MTs increases and the mean length decreases 2-fold. The authors develop a mathematical model to assess the relative contributions of 1) changes in MT dynamics, and 2) increased MT severing activity to the reorganization phenomenon. The model explains the data by a global change in MT dynamics and, in fact, indicates that MT severing makes hardly any contribution to the MT shortening observed in anaphase. The work is timely and the topic is of great interest; the quality of the EM data is excellent and these data can be expected to become a valuable resource in the field.

      Essential Revisions

      1) To compare the model with the data, the authors "average away" a large amount of detailed information present in the EM data and make additional simplifying assumptions that may be questioned. For example, it may be an oversimplification to assume mono-modal length distributions in the model that can be described by averages. In figure 1B the metaphase spindles look like there are two populations. The situation in anaphase looks even more complicated particularly if there is a surge of nucleation at the start of anaphase generating new short MT. The detailed 3D data sets are simplified down to a single spatial dimension (the spindle axis) and single length estimator (the average). The authors should provide some evidence/do some tests to validate their approach. How sensitive are the predictions of the model to the simplifying assumptions made and to the averaging out of detail?

      2) A major weakness of the manuscript is considered to be the lack of an experimental test of the prediction of the model which the authors present as their main conclusion. It should be possible to perform FRAP experiments to test the effect of katanin mutants on microtubule turnover to confirm or contradict the main conclusion that the authors derive from their model and that in part argues against previous work. There is a well-characterized (and fast acting) ts allele of MEI-1 called mei-1(or642) (O'Rourke et al., PLoS One 2011 and McNally et al., MBoC, 2014) that could be used to test the effect of katanin on microtubule turnover by FRAP.

      3) The authors should please be a bit clearer which EM data sets are new and which ones were re-used from previous work (for example including this information in Table 1). The expectation would be that the information from the new datasets is also used in the theoretical analysis presented in this manuscript. The context to previous work by others could be explained more clearly by being more specific when presenting background in the introduction so that it will be easier to understand what's new and different here compared to previous work (particularly compared to Yu et al. 2019 and Srayko et al. 2006).

      4) Technical concerns: 4.1) FRAP analysis: To what extent does flux versus microtubule polymerization/depolymerization contribute to recovery? Is using a mono-exponential function to fit the recovery curves justified given that flux may contribute to recovery? How are the FRAP data used in the model? Is the contribution from flux to recovery considered separately from the contribution of polymerization/depolymerization?

      4.2) p.10, 2nd paragraph: Is the observed decrease in average microtubule length really independent of position? What is the factor of decrease as a function of position? Is the notion of global vs local change really fully supported by the data?

      4.3) Model: Are microtubule minus ends considered stable after severing? Alpha is introduced, but the authors do not seem to come back to it later. What is it?

      4.4) Model: Throughout, it would be useful to provide confidence intervals for the values that the authors extract from their model or provide some other statistical measure for the reliability of the prediction.

      4.5) Does the model make the same predictions for meiosis II spindles or is turnover regulation different there?

      4.6) Model: On page 9, lines 15-17 - the authors claim that if all the dynamic parameters except nucleation do not change then the length distribution should not change. However, if there is a change in nucleation, there will be a short-term increase in short MT, thereby shifting the length distribution.
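
      On point 4.1, one way to check whether a mono-exponential fit is adequate is to compare it with a two-component fit on the same recovery curve. The following is a minimal sketch on synthetic data, not the authors' analysis: the time constants, amplitudes and the tau grid are made up, and the second component is represented as another exponential purely for illustration (flux need not produce an exponential term).

```python
import math

def recovery_basis(times, tau):
    """Normalised recovery 1 - exp(-t / tau) of one first-order process."""
    return [1.0 - math.exp(-t / tau) for t in times]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sse(y, model):
    """Sum of squared residuals between data and model."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, model))

def fit_mono(times, y, taus):
    """Best (sse, amplitude, tau) for y ~ A * (1 - exp(-t / tau))."""
    best = None
    for tau in taus:
        g = recovery_basis(times, tau)
        amp = dot(y, g) / dot(g, g)  # closed-form least-squares amplitude
        err = sse(y, [amp * gi for gi in g])
        if best is None or err < best[0]:
            best = (err, amp, tau)
    return best

def fit_bi(times, y, taus):
    """Best (sse, (A1, tau1), (A2, tau2)) for a sum of two recovery terms."""
    bases = [recovery_basis(times, tau) for tau in taus]
    best = None
    for i in range(len(taus)):
        for j in range(i + 1, len(taus)):
            g1, g2 = bases[i], bases[j]
            a11, a12, a22 = dot(g1, g1), dot(g1, g2), dot(g2, g2)
            det = a11 * a22 - a12 * a12
            if abs(det) < 1e-12:
                continue  # basis functions too similar to separate
            b1, b2 = dot(y, g1), dot(y, g2)
            # Solve the 2x2 normal equations for the two amplitudes.
            amp1 = (b1 * a22 - b2 * a12) / det
            amp2 = (a11 * b2 - a12 * b1) / det
            err = sse(y, [amp1 * u + amp2 * v for u, v in zip(g1, g2)])
            if best is None or err < best[0]:
                best = (err, (amp1, taus[i]), (amp2, taus[j]))
    return best

# Synthetic, noise-free curve with a fast and a slow component (made-up values).
times = [float(t) for t in range(0, 61)]
y = [0.5 * f + 0.3 * s
     for f, s in zip(recovery_basis(times, 3.0), recovery_basis(times, 25.0))]
taus = [float(k) for k in range(1, 41)]  # hypothetical search grid, in seconds

mono = fit_mono(times, y, taus)
bi = fit_bi(times, y, taus)
print("mono-exponential: SSE =", mono[0], "tau =", mono[2])
print("bi-exponential:   SSE =", bi[0], "taus =", bi[1][1], bi[2][1])
```

      With real, noisy curves the two fits would need to be compared with a criterion that penalises the extra parameters, and structured residuals of the mono-exponential fit would be the main warning sign.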

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on June 23, 2020, follows.

      Summary

      It has been previously shown that resistance to parasitoid wasps can emerge upon selection in wild-type Drosophila populations, and that this increased resistance correlates with a higher number of hemocytes. This paper combined experimental evolution and single cell transcriptomics to show that increased resistance to parasitoids upon several rounds of selection is caused by the presence of a differentiated subset of hemocytes (pre-lamellocyte) in the unchallenged state, which is usually found only upon wasp infestation. This led the authors to conclude that intense pathogen pressures can shift the immune system from inducible to constitutive, consistent with a theoretical framework indicating that elevated and constant pathogen pressure should lead to the emergence of constitutive defense. The approach is interesting, the paper is well written, and the notion tested is interesting. An important concern is the degree of advance over previous studies. Initial papers investigating how selection increases resistance to wasps have already shown that this was linked to an increase in hemocyte number. In a certain sense, this could be considered as a demonstration of a change from inducible to constitutive defense, although the emphasis of these papers was not on this point. In addition, the current work provides so far only limited information on this specific population of pre-lamellocytes.

      Essential Revisions

      1) Analysis of single cell RNA-seq (Figure 3). Several RNA-seq papers have been published and it is important that the authors better relate their hemocyte clusters to other scRNA-seq datasets using the nomenclature of some of these papers. Would their data be deposited in a database? It would also be great to better describe the transcriptional profile, and not only focus on two genes, Attila and PPO3.

      2) Discrepancy between the transcriptional and morphological changes in the hemocytes.

      2a) Earlier studies, both on hemocyte flow cytometry and in other scRNA-seq experiments (as cited in the manuscript), revealed that the transdifferentiation into lamellocytes is a dynamic / continuous process, which may derive from several hemocyte lineages and from different hematopoietic organs. The authors here showed a discrepancy in the transcriptional and morphological changes in the hemocytes, and revealed that the plasmatocyte lineage was already starting to resemble the lamellocytes (in gene expression), without needing the induction by infection. Yet, they were not yet fully differentiated hemocytes based on morphology, and still needed infection to reach that stage. Therefore, the conclusion that the selected lines had "hard-wired" the inducible response into a constitutive response is not fully warranted (they do not fully differentiate, but proceed partially towards that state). Also, the differentiation of lamellocytes is attributed entirely to the lymph glands and to plasmatocytes, while different organs and hemocyte lineages appear to contribute to the population of lamellocytes. The reviewers feel that all these aspects should be further explored and would deserve some mention in the discussion.

      2b) Along this line, the authors could do a better job in characterizing the hemocyte populations of the evolved lines using available antibodies, cooking and other melanization assays, phalloidin treatment...

      2c) Third instar hemocytes are found in the sessile state, in circulation or in the lymph gland. It cannot be excluded that some of the changes they observed relate more to changes in hemocyte localization than to differentiation. According to the material and methods, the authors have collected only the circulating hemocytes in the unchallenged state as they did not vortex larvae. It is very important to better compare the lymph gland, sessile and circulating compartments of the evolved and the non-evolved lines. This can be done by using various staining methods. The paper is written as if selection acted only on circulating hemocytes, but it could also act on hemocyte localization (decreased sessility), lymph gland maturation....

      3) Gene expression was measured in circulating hemocytes at 48h after infection.

      The authors measured gene expression in circulating hemocytes, 48h after infection, at which stage hemocyte proliferation, lamellocyte differentiation and parasitoid encapsulation are already well underway. The induction of the critical two processes, hemocyte proliferation and lamellocyte differentiation, may not be fully detectable from gene expression of only the circulating hemocytes themselves at this late stage of the immune response. Clearly, the authors do show that differentiation from circulating plasmatocytes can be detected, using pseudotime, and also revealed changes in gene expression in uninfected selected larvae. Yet, how induction in the lymph glands or sessile clusters has changed by experimental evolution, and whether the inducible response had indeed proceeded towards a constitutive response, requires further investigation along a wider time course (e.g. during early larval development) and perhaps in different tissues (e.g. lymph glands). If the authors cannot address this, this aspect would need some discussion.

      4) The changes in gene expression after selection can be presented more clearly.

      The description of these results (from L111 onwards), and Figure 2, are difficult to read and understand, while they are key to the claim that the inducible response has become hardwired into a constitutive response. The text starts out by saying that "data was pooled to investigate global changes" (L116-117), but then it refers in Figure 2 to the x-axis, which only provides the data for the control lines. Figure 2 is difficult to grasp, as the strong positive correlation in a) means something different (i.e. stronger constitutive response) than the very similar positive correlation in b) (weaker induced response), while c) shows that control and selected larvae respond the same to infection. Is there a better way to tease apart these patterns in a figure, and to explain them in the text? Also, the data is all expressed in log2 fold changes (relative to non-infected control line individuals?). Also, for a subset of approximately 170 genes, the authors showed that the increase in expression had already started without the infection in the selection lines. Do the functional annotations of these genes reveal anything of interest for hemocyte proliferation and the differentiation towards lamellocytes?

      5) Other studies came to partially contrasting, partially similar conclusions.

      Transcriptomics on whole larvae after experimental evolution for high parasitism was done for Drosophila, using a different parasitoid species. In this study, they also found the typical increased density of hemocytes in Drosophila selected for increased parasitoid resistance, without being infected. However, contrary to the authors, this study concluded that this increase in hemocytes could not be attributed to a pre-activation of the immune response. Additionally, the genes for hematopoiesis and for several effector genes showed opposite patterns to those that would explain the increased density of hemocytes in selected lines or a pre-activation of the inducible response (Wertheim et al, 2011, Molecular Ecology). However, in line with the findings for the current study, whole-larvae RNAseq after parasitoid infection did not result in substantial gene expression differences between selected lines and control lines (Salazar et al, 2017, BMC Genomics), while substantial differences were reported in uninfected larvae of selection and control line larvae (Wertheim et al, 2011, Molecular Ecology). These whole-body transcriptomics experiments lacked the resolution to measure specifically what changed in hemocytes, but both studies indicate that much of the increased resistance after selection is likely caused by changes in constitutive immunity, not by increasing the acute/inducible immune response.

      6) Another concern is related to the parasitoid species. Leptopilina boulardi is a parasitoid that relies partly on VLPs to overcome the host defense. This is not discussed, not even mentioned. Some older work (Fellowes et al 1999, Evolution), shows that, while resistance evolves readily against L. boulardi, populations resistant against L. boulardi are also cross-resistant to another Leptopilina species. The immune effectors studied in this manuscript are obviously playing a significant role, but how do the evolved flies cope with the VLPs? The paper would benefit from at least discussing this issue.

      7) The selection of larvae for the single cell work warrants some clarification. According to figure 1b just under 50% of parasitoid resistant larvae showed an increased encapsulation response. This is presumably also related to the increase of expression of immune effectors. How is this accounted for in the single cell work? And if not, do you have any way to get an estimate of the variance in the response variables?

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on June 23, 2020, follows.

      Summary

      Based on their previous work showing that cell proliferation and differentiation are associated with distinct tRNA programs and codon usages, the authors employed a CRISPR-Cas9 based approach to deplete families of tRNAs belonging to "proliferation" and "differentiation" groups and test the effects of such manipulation on the fitness of cells in different proliferative states. Using competition assays, the authors provide evidence that "proliferative" tRNAs are more essential in fast-proliferating cells, while "differentiation" tRNAs exert higher essentiality in slower proliferating cells. The authors also determined the essentiality of investigated tRNAs in senescent and quiescent cells which revealed more complex patterns. Overall, it was thought that this study is of broad potential interest inasmuch as it suggests that tRNAs have distinct essentiality in different cells and across distinct proliferative states. Moreover, it was found that this constitutes pioneering work wherein the effects of systematically knocking out tRNA genes are directly studied, an important milestone by itself when considering the abundance and variability of isodecoder species and the homology between isoacceptors. Notwithstanding the overall enthusiasm for the potential importance of the study and uniqueness of the approach, it was found that several major issues should be addressed to corroborate the authors' conclusions as outlined below.

      Essential Revisions

      1) It was thought that a number of important controls were missing. The potential off-target effects of the CRISPR-Cas9 method need further validation. Fig S1B should be extended in order to clarify which sgRNAs are potentially off-targeting which tRNA. The manuscript would also benefit from experimentally testing the off-target effects of some of the sgRNAs, especially those binding to other tRNA families. To accurately compare HeLa cells with fibroblasts, the authors should determine potential tRNA expression and codon usage differences between them. Moreover, the efficacy of tRNA depletion between the cell lines should be assessed. Figure 5: additional controls should be provided to ascertain that cells are indeed in quiescent and senescent states. In Figure 5A, it should be explained why the 3 day time point was used when in most of the study it is shown that the strongest effects occur after 7 days of induction.

      2) Some experimental conditions remain unclear. For instance, it is noted that sgRNA plasmids were selected by puromycin, whereas WI38 cells appear to already be puromycin resistant. It is also not clear how competition assays were carried out in cell-arrested states. In general, it was thought that the authors should be more specific regarding their read-outs (i.e. specify whether proliferation or survival were monitored).

      3) Several issues were raised apropos statistical analyses. In figures 3C and D, to assess whether tested variables are truly independent, the authors should use a linear regression modelling Relative fitness ~ tRNA expression (in C) and Relative fitness ~ fraction CRISPR targeted tRNAs (in D). In addition, it is not clear why z-transformation is applied in figure 5E. The heatmap summarizes tRNA essentiality, which in figures 3 and 5C is depicted using an untransformed log2FC. Using z-transformed and untransformed values to estimate the same effects was thought not to be advisable. Finally, the authors should also include the number of biological replicates, types of statistical tests and their outcomes in each figure where applicable, as in some cases these are missing.

      4) Several statements were found not to be adequately supported by the data. For example, the statement: "our results show that some tRNAs are essential specifically for cancerous cells and not in differentiated cells ... (and the next sentence)", was found not to be supported by the presented data. To this end, the authors are advised either to provide data corroborating these conclusions or to tone down their statements. Also, in the discussion section, given that this work is the first to systematically knock out tRNA gene families, some comment on the potential and limitations of the method appears to be warranted.
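
      For the regression requested in point 3 (Relative fitness ~ tRNA expression), a minimal ordinary least-squares sketch is given here. The data values are invented placeholders, not values from the manuscript, and the function name `ols` is only illustrative; a full analysis would also report a p-value and confidence interval for the slope.

```python
def ols(x, y):
    """Ordinary least squares for y ~ intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical placeholder data: one point per targeted tRNA family.
trna_expression = [2.1, 3.4, 4.0, 5.2, 6.8, 7.5]          # e.g. log2 expression
relative_fitness = [-0.2, -0.5, -0.4, -0.9, -1.3, -1.2]   # e.g. log2FC

slope, intercept, r2 = ols(trna_expression, relative_fitness)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

      A slope indistinguishable from zero would support the claim that fitness effects are independent of the tested variable; the same function applies to panel D with the fraction of CRISPR-targeted tRNAs as the predictor.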

    1. This manuscript is in revision at eLife

      The decision letter after peer review, sent to the authors on June 12, 2020, follows.

      Summary

      All three reviewers agree that the research question under study, the requirement of the cross-talk between two important developmental signaling pathways - retinoic acid and NO - for amphioxus pharynx development, is in principle interesting and could be suitable for publication in eLife.

      However, at present there are major open concerns especially on the lack of statistical analyses, quality of data presentation and inconsistencies with previously published work, that need to be addressed. Although it is the current policy of eLife to avoid additional experiments in revisions as much as possible, this is unfortunately likely impossible to fulfil with the current manuscript in order to bring it to a level that matches the standards of eLife. However, we think that in many cases an improvement of analyses and data presentation will likely already significantly improve the manuscript.

      1) The presented study is a follow-up on a previous paper by the same lab (Annona et al 2017; DOI:10.1038/s41598-017-08157-w). When comparing the work of this previous study with the current manuscript, two major discrepancies are apparent:

      In Annona et al, two drugs were used to inhibit NOS production, L-NAME and TRIM, while only one inhibitor was used in the present study. Furthermore, there appear to be discrepancies between the two publications concerning the developmental time windows during which chemical disruption of NO signaling is effective. This needs to be clarified.

      The timing of NosA,B,C expression, the suggested regulation of NosA and B by retinoic acid (RA) and the detected presumptive RARE regulatory elements in the genome don't match. More specifically, NosA,B,C expression at 24 hours (or around this time point) was investigated by Annona et al, 2017. Based on these data, NosA is not expressed during development, whereas NosB and NosC are expressed. In the submitted manuscript, the authors show that NosA and NosB are upregulated upon RA treatment, whereas NosC shows no changes in expression. They therefore suggest that RA regulates NosA and NosB transcription. Since only NosB is expressed during the relevant timepoints at early development, the transcription of this gene could be under the regulation of RA. However, when the authors look into the retinoic acid response elements (RARE) in the genomic region of NosB, they only find a DR3, which is not the typical RARE. They find DR1 and DR5 (apart from DR3s), which are more typical RAREs, in the genomic region of NosA, but as mentioned this gene is not expressed during development. This makes the hypothesis of a direct regulation of NosA and NosB by RA during normal development unconvincing. Can the authors resolve these apparent discrepancies?

      2) The authors study the open chromatin structure at 8, 15, 36 and 60 hours, i.e. at time points that do not overlap with the drug treatment period (24-30 hours). They need to analyze the genome architecture during this time period.

      3) The previous work by Annona et al 2017 shows that a major peak in NO levels occurs later than the chosen treatment window. How do NO levels during the time window of the experiment compare with other studies, i.e. is there evidence these are relevant levels? This is particularly noteworthy, as there is no control experiment showing that TRIM incubation affects NO levels or NO signaling during the incubation period (e.g. DAF-FM-DA staining or by NO quantification). It is therefore not possible to estimate the specificity of the resulting phenotypes.

      We thus request from the authors to provide ISH patterns of all the Nos genes, as well as NO localisation from at least 2 timepoints (e.g. start and end of window) of the TRIM application window.

      4) One overarching critique is that the general description of the figures, and hence also of the phenotypes, is of poor quality. Addressing this point alone will substantially improve the entire manuscript.

      Fig.1A: Indicate developmental stages (N2, N4, T1, T2, T3, L0) together with the hours-post-fertilization (hpf) to facilitate the understanding of the treatment period with respect to the development of amphioxus.

      Fig.1B: Outline the pharyngeal region e.g. with thin, dashed white lines in longitudinal and cross-sections and indicate relevant anatomical structures (club-shaped gland, endostyle, gill slits) e.g. with an arrow. Is the endostyle positioned more ventrally in TRIM-treated larvae?

      Figure 1C: why are Cyp26.3, Rdh11/12.18 and Crabp shown in triplicates?

      Fig.1B: The 'digital sectioning' method using confocal imaging and reconstruction of nuclear stainings is not suited to characterize the phenotype. Due to the loss of signal in deeper regions, morphological structures (e.g. differences in pharyngeal and gill slit morphology, endostyle, club-shaped glands) are impossible to recognize.

      Fig.3B: the heads of these amphioxus should be annotated to indicate key structures for non-amphioxus specialists. Ideally the images should be higher magnification and resolution as well, as the morphology is currently not very clear.

      Fig.3A and B: Furthermore, the morphological differences between 'altered', 'partially recovered' and 'recovered' are unclear. Fig.3B does not help in understanding the changes, as the pictures are too small to recognize any morphological details without staining, and no structures are indicated. It is also unclear how animals scored as 'altered', 'partially recovered' and 'recovered' differ in their morphological structures. And does 'recovered' mean that these embryos show an initial phenotype that then 'recovers' during development, or do they show a completely normal development?

      5) Missing statistics/statistical information: Lines 85-89 (Fig.1): Where is the evidence that there is reduction in pharynx length? Where is the evidence for a smaller first gill slit? Measurements with a decent sample size and a basic statistical test must be provided.

      The description of ISH pictures in Fig.2A lacks any quantification and thus any information on the penetrance of the respective phenotypes (as in Fig 3C). The lack of any 'negative control genes' (the large set of genes that, based on the RNASeq dataset, should not be affected) makes it difficult to judge how specific the changes in AP axis and RA pathway genes are.

      How did the authors obtain the qRT-PCR calculations? They need to clarify how they obtained the fold changes shown in the histograms, e.g. by showing the maths behind the result when marking the cells in the Excel sheet. The raw data for rpl32 is missing for Crabp in Figure 2B. The qPCR results in Fig.2B-E lack significance tests.
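
      To illustrate the kind of explicit calculation being requested, here is a sketch of the commonly used 2^-ΔΔCt fold-change computation with rpl32 as the reference gene. Whether the authors used this method is not known; the Ct values are invented, the function name is illustrative, and the formula assumes roughly 100% amplification efficiency for both primer pairs.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression (treated vs control) by the 2^-ddCt calculation.

    Each argument is a mean Ct value; the reference gene (here assumed to
    be rpl32) normalises for input amount in each condition.
    """
    delta_treated = ct_target_treated - ct_ref_treated   # dCt, treated
    delta_control = ct_target_control - ct_ref_control   # dCt, control
    delta_delta = delta_treated - delta_control          # ddCt
    return 2.0 ** (-delta_delta)

# Invented Ct values for one target gene and the rpl32 reference.
fc = fold_change_ddct(ct_target_treated=24.0, ct_ref_treated=18.0,
                      ct_target_control=26.0, ct_ref_control=18.0)
print("fold change (treated / control):", fc)
```

      Reporting the per-replicate ΔCt values alongside the fold change would also make a significance test on the ΔCt scale straightforward.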

      6) The RNA-Seq study needs improvements: The PCA (Fig.S1C) shows no concordance among control samples or treated samples. Also, the histogram shows a clustering of replicates, and NOT of 'treated' and 'control' samples. This casts doubts on the quality and validity of the RNASeq dataset. These doubts are not removed by the current validation experiments, as these experiments tested only significantly upregulated genes by RNA-Seq, while downregulated and non-significant genes as 'controls' are missing. These additional controls are necessary to assess the validity of the RNA-Seq data.

      7) More information about the details of the ATACseq and ChIPseq data used, as well as the general RA responsive elements prediction is required.

      For example, in what amphioxus samples (and treatments if any) are these ATACseq and ChIPseq signals seen? There is some detail provided in the Methods section, but something is odd here and perhaps needs some further explanation. Since the two relevant Nos genes are supposedly not active during development then why do they have ATACseq and ChIPseq signals from embryo and larval samples? Why should these two Nos genes have apparently active regulatory elements focused on RAREs when the genes are not normally expressed under the control of RA, but only become active when exogenous RA is applied? We may well have missed something in the logic here, but this merely shows that the current level of explanation is insufficient.

      The analysis of RA responsive elements lacks statistical analysis and depth. It is left unclear how many RAREs would be expected by chance on a 52 kb and a 25 kb locus, respectively. In addition, the authors include all ATAC-Seq peaks from stages ranging between 8 and 60 hpf, while the window of RA responsiveness has been tightly restricted to 24-30 hpf. Also, as NosC expression levels stay constant upon RA incubation, it would be crucial to know if the NosC locus lacks any open RARE sites (as would be expected).

      The authors use the NHR-SCAN tool to predict putative direct repeat binding sites in the genomic sequence of NosA and NosB. Which consensus sequence does the program follow? It appears that it does not follow the consensus sequence for a typical RARE ((A/G)G(G/T)TCA), since the sequence for DR1 deviates from this consensus. DR1, DR2 and DR5 are the commonly described RAREs bound by RAR/RXR heterodimers. Further, DR8 has been described as mediating retinoic acid dependent regulation of gene transcription through RAR/RXR (Moutier et al., 2012). The authors need to clarify which of the detected DRs are the most commonly used RAREs.

      Please also mention if RAREs fall within an intron in the genomic regions of the Nos genes, since transcriptional regulation through RAREs is often associated with introns.
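
      As a rough guide to the question of how many RAREs would be expected by chance, the sketch below estimates the expected number of exact direct repeats of the half-site consensus (A/G)G(G/T)TCA on a locus of a given length. It rests on strong simplifying assumptions that are not from the manuscript: uniform, independent base composition, exact matches only, and both strands scanned. Prediction tools that tolerate mismatches will return many more chance hits, so this is a lower bound rather than a proper null model.

```python
# Allowed bases at each position of the half-site consensus (A/G)G(G/T)TCA.
HALF_SITE = ["AG", "G", "GT", "T", "C", "A"]

def half_site_probability(site=HALF_SITE, base_freq=0.25):
    """Chance that a random hexamer matches the consensus exactly."""
    p = 1.0
    for allowed in site:
        p *= len(allowed) * base_freq
    return p

def expected_direct_repeats(locus_bp, spacer_bp, site=HALF_SITE):
    """Expected count of exact DR<spacer> elements on both strands."""
    motif_len = 2 * len(site) + spacer_bp
    start_positions = max(locus_bp - motif_len + 1, 0)
    p_repeat = half_site_probability(site) ** 2  # two independent half-sites
    return 2 * start_positions * p_repeat        # factor 2: both strands

for locus in (52_000, 25_000):
    for spacer in (1, 2, 5):
        n = expected_direct_repeats(locus, spacer)
        print(f"{locus} bp locus, DR{spacer}: expected {n:.3f} by chance")
```

      A shuffled-sequence or genome-wide background scan with the same tool and settings used in the manuscript would give a more realistic chance expectation than this back-of-the-envelope estimate.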

      8) Information on the concentration dependency of compounds used in the rescue experiment is lacking. Please explain why the BMS009 concentration used here (10^-6 M) is 10x higher than the highest concentration used in the original publication on amphioxus pharynx development (Escriva et al., Development 2002).

      9) A summary drawing of the regulatory loop between NO and RA would be informative, also indicating the known target genes (from this study).